Pytorch_Part3_Model Modules

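I. Model Creation and nn.Module

1. Steps for creating a model

torch.nn
nn.Parameter	Tensor subclass that represents a learnable parameter, such as weight or bias
nn.Module	Base class of all network layers; manages the network's attributes
nn.functional	Concrete function implementations, such as convolution, pooling, and activation functions
nn.init	Parameter initialization methods

2. nn.Module

Attributes

parameters: stores and manages nn.Parameter objects
modules: stores and manages nn.Module objects
buffers: stores and manages buffer attributes, such as running_mean in a BN layer
***_hooks: store and manage hook functions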
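
The attribute groups listed above can be inspected on any module. The following is a minimal sketch, assuming a small conv + batch-norm block that is not part of the original notes and was chosen only because batch norm owns buffers:

```python
import torch.nn as nn

# Hypothetical two-layer block, used only to have parameters and buffers to look at.
block = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# Learnable tensors (nn.Parameter) reachable from the block.
param_names = [name for name, _ in block.named_parameters()]
# Non-learnable state that is still saved with the model (for example running_mean).
buffer_names = [name for name, _ in block.named_buffers()]
# Direct children, exactly as stored in the _modules dictionary.
child_names = list(block._modules.keys())

print(param_names)
print(buffer_names)
print(child_names)
```

The conv layer contributes only parameters, while the batch-norm layer contributes both parameters and buffers, which is why the two are tracked in separate dictionaries.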

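Call sequence:

Using step-into debugging, start from the creation of the network model (net = LeNet(classes=2)) and follow every function that gets called, watching when net's _modules field is created and assigned, and recording every class and function entered along the way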

net = LeNet(classes=2)

LeNet class: __init__() calls super(LeNet, self).__init__()

    def __init__(self, classes):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)

Module class: __init__() calls self._construct(), which builds 8 ordered dictionaries

    def _construct(self):
        """
        Initializes internal Module state, shared by both nn.Module and ScriptModule.
        """
        torch._C._log_api_usage_once("python.nn_module")
        self._backend = thnn_backend
        self._parameters = OrderedDict()
        self._buffers = OrderedDict()
        self._backward_hooks = OrderedDict()
        self._forward_hooks = OrderedDict()
        self._forward_pre_hooks = OrderedDict()
        self._state_dict_hooks = OrderedDict()
        self._load_state_dict_pre_hooks = OrderedDict()
        self._modules = OrderedDict()

LeNet class: constructs the convolution layer nn.Conv2d(3, 6, 5)

Conv2d class: __init__(); it inherits from _ConvNd and calls the parent constructor

    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias, padding_mode)

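_ConvNd class: __init__(); it inherits from Module and calls the parent constructor, repeating the Module.__init__() and _construct() steps above, then initializes its own variables

LeNet class: control returns to self.conv1 = nn.Conv2d(3, 6, 5), and the assignment is intercepted by the __setattr__() function of the parent class (nn.Module)

    # name = 'conv1'
    # value = Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
    modules = self.__dict__.get('_modules')
    if isinstance(value, Module):
        if modules is None:
            raise AttributeError(
                "cannot assign module before Module.__init__() call")
        remove_from(self.__dict__, self._parameters, self._buffers)
        modules[name] = value

The layer is therefore recorded in the _modules of the LeNet instance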
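
The interception described above can be observed without a debugger. This is a minimal sketch; the class names Good and Bad are hypothetical and exist only to contrast assigning a layer after and before Module.__init__() has run:

```python
import torch.nn as nn

class Good(nn.Module):
    def __init__(self):
        super().__init__()            # creates _modules and the other dictionaries
        self.fc = nn.Linear(4, 2)     # intercepted by nn.Module.__setattr__

class Bad(nn.Module):
    def __init__(self):
        self.fc = nn.Linear(4, 2)     # assigned before Module.__init__() has run
        super().__init__()

good = Good()
registered = list(good._modules.keys())
# The layer is stored in _modules, not in the instance's ordinary __dict__.
in_instance_dict = "fc" in good.__dict__

try:
    Bad()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(registered, in_instance_dict, error_message)
```

The second class fails because _modules does not exist yet when the assignment is intercepted, which is the `modules is None` branch in the excerpt.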

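Construction continues with the remaining layers; the finished net holds conv1, conv2, fc1, fc2, and fc3 in its _modules.

Summary

A module can contain multiple submodules
A module corresponds to one operation and must implement the forward() function
Every module has 8 dictionaries that manage its attributes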

def forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    out = self.fc3(out)
    return out

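II. Model Containers and Building AlexNet

nn.Sequential: sequential; the layers run strictly in order; commonly used to build blocks

nn.ModuleList: iterable; commonly used to build many repeated layers with a for loop

nn.ModuleDict: indexable; commonly used for selectable network layers

1. Model container: Sequential

nn.Sequential is an nn.Module container that wraps a group of network layers in order

Sequential: the layers are built strictly in order
Built-in forward(): the built-in forward runs the forward pass through each layer in turn with a for loop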

class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),)
        '''
        Alternatively, name each layer as below (requires
        "from collections import OrderedDict"); by default layers are named by their index
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),
            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))
        '''
        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x

Call sequence

LeNetSequential.__init__()

Sequential.__init__()

    def __init__(self, *args):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)

Module.__init__()
Sequential.add_module(): self._modules[name] = module
In LeNetSequential, assigning the Sequential to an attribute is intercepted by __setattr__(); because a Sequential is itself a Module, it is stored as part of _modules
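
The two branches of Sequential.__init__() shown above lead to different keys in _modules. A minimal sketch, assuming two small hypothetical containers built only for this comparison:

```python
from collections import OrderedDict

import torch
import torch.nn as nn

# Positional arguments: layers are registered under "0", "1", ...
unnamed = nn.Sequential(nn.Conv2d(3, 6, 5), nn.ReLU())

# A single OrderedDict argument: layers are registered under the given names.
named = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(3, 6, 5)),
    ("relu1", nn.ReLU()),
]))

unnamed_keys = list(unnamed._modules.keys())
named_keys = list(named._modules.keys())

# The built-in forward() runs the layers in registration order.
x = torch.randn(1, 3, 32, 32)
out_shape = tuple(named(x).shape)

# Named layers stay reachable both by position and by attribute.
same_layer = named[0] is named.conv1

print(unnamed_keys, named_keys, out_shape, same_layer)
```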

2. Model container: ModuleList

nn.ModuleList is an nn.Module container that wraps a group of network layers and calls them by iteration

Main methods:

append(): adds a layer to the end of the ModuleList
extend(): concatenates two ModuleLists
insert(): inserts a layer at a given position in the ModuleList

class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(20)])  # one line builds 20 fully connected layers of 10 units each

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x

Here, nn.ModuleList.__init__() is:

    def __init__(self, modules=None):
        super(ModuleList, self).__init__()
        if modules is not None:
            self += modules
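
The three ModuleList methods listed above can be tried on a short list. A minimal sketch; the specific layers are arbitrary and chosen only so that each one is easy to identify by type:

```python
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])

layers.append(nn.ReLU())                                        # add one layer at the end
layers.extend(nn.ModuleList([nn.Linear(10, 5), nn.Sigmoid()]))  # concatenate another list
layers.insert(1, nn.Dropout(0.5))                               # insert at position 1

# Iteration follows list order, which is also the order used in forward() loops.
layer_types = [type(layer).__name__ for layer in layers]
print(layer_types)
```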

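3. Model container: ModuleDict

nn.ModuleDict is an nn.Module container that wraps a group of network layers and calls them by key

Main methods:

clear(): empties the ModuleDict
items(): returns an iterable of key-value pairs
keys(): returns the keys of the dictionary
values(): returns the values of the dictionary
pop(): removes a key from the dictionary and returns the module stored under it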
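
The dictionary-style methods above behave much like those of a regular dict. A minimal sketch, assuming a hypothetical two-entry ModuleDict of activation layers:

```python
import torch.nn as nn

acts = nn.ModuleDict({"relu": nn.ReLU(), "prelu": nn.PReLU()})

keys = list(acts.keys())
pairs = [(name, type(module).__name__) for name, module in acts.items()]

# pop() takes a key, returns the module stored under it, and deletes the entry.
removed = acts.pop("prelu")
remaining = list(acts.keys())

acts.clear()
size_after_clear = len(acts)

print(keys, pairs, type(removed).__name__, remaining, size_after_clear)
```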

class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x, choice, act):
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x

Each ModuleDict acts like a multiplexer, so the path to take must be given at call time:

net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')

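4. Building AlexNet

AlexNet: won the 2012 ImageNet classification task with an accuracy more than 10 percentage points higher than the runner-up, opening a new era for convolutional neural networks

AlexNet has the following features:

Uses ReLU: replaces saturating activation functions and reduces vanishing gradients
Uses LRN (Local Response Normalization): normalizes the data and reduces vanishing gradients
Dropout: makes the fully connected layers more robust and improves the network's generalization
Data Augmentation: TenCrop, color modification

Construction: uses Sequential and its built-in forward() method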

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        '''
        Named variant (the names must be passed inside an OrderedDict)
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv2': nn.Conv2d(64, 192, kernel_size=5, padding=2),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv3': nn.Conv2d(192, 384, kernel_size=3, padding=1),
            'relu3': nn.ReLU(inplace=True),
            'conv4': nn.Conv2d(384, 256, kernel_size=3, padding=1),
            'relu4': nn.ReLU(inplace=True),
            'conv5': nn.Conv2d(256, 256, kernel_size=3, padding=1),
            'relu5': nn.ReLU(inplace=True),
            'pool5': nn.MaxPool2d(kernel_size=3, stride=2),
        }))
        '''
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

torchvision/models also contains implementations of other classic networks such as GoogLeNet and ResNet.

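III. Convolution Layers

1. 1d/2d/3d convolution

Convolution operation: the kernel slides over the input signal (image), multiplying and accumulating at each position

Convolution kernel: also called a filter; it can be seen as a certain pattern or feature.

Convolution is like using a template to search the image for regions that resemble it: the more a region resembles the kernel's pattern, the higher the activation, which is how features are extracted (detail patterns such as edges, stripes, and colors)

Convolution dimension: in general, a kernel that slides along N dimensions performs an N-dimensional convolution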

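2. nn.Conv2d

nn.Conv2d(in_channels,      # number of input channels
          out_channels,     # number of output channels, equal to the number of kernels
          kernel_size,      # kernel size
          stride=1,         # stride
          padding=0,        # amount of padding
          dilation=1,       # dilation (dilated convolution)
          groups=1,         # grouped-convolution setting
          bias=True,        # bias
          padding_mode='zeros')

Function: performs 2D convolution on multiple 2D signals

Main parameters:

dilation: spacing between kernel elements (dilated convolution)

groups: number of groups for grouped convolution

Size calculation: \(H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1}{stride} + 1 \right\rfloor\)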

import os
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

torch.manual_seed(3)  # fix the random seed (stands in for the source's set_seed(3) helper)

# =================== load img ============
path_img = os.path.join("lena.png")
img = Image.open(path_img).convert('RGB')  # 0~255

# convert to tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)
img_tensor.unsqueeze_(dim=0)    # C*H*W to B*C*H*W

conv_layer = nn.Conv2d(3, 1, 3)   # input:(i, o, size) weights:(o, i , h, w)
nn.init.xavier_normal_(conv_layer.weight.data)

# calculation
img_conv = conv_layer(img_tensor)

Different kernels produce different results.

The size also changes during convolution:

Size before convolution: torch.Size([1, 3, 512, 512])

Size after convolution: torch.Size([1, 1, 510, 510])
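
The drop from 512 to 510 follows from the output-size formula given in the PyTorch documentation for nn.Conv2d. A minimal sketch in plain Python; the helper name conv2d_out_size is hypothetical and is not a PyTorch function:

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution (or max pooling)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# The example above: 512 x 512 input, 3 x 3 kernel, default stride and padding.
after_conv = conv2d_out_size(512, kernel_size=3)

# The LeNet from section I on a 32 x 32 input: conv 5 -> pool 2 -> conv 5 -> pool 2.
size = 32
size = conv2d_out_size(size, kernel_size=5)            # conv1
size = conv2d_out_size(size, kernel_size=2, stride=2)  # max pool
size = conv2d_out_size(size, kernel_size=5)            # conv2
size = conv2d_out_size(size, kernel_size=2, stride=2)  # max pool

print(after_conv, size)
```

The final value of 5 is where the 16*5*5 input size of LeNet's fc1 comes from, assuming 32 x 32 input images.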

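The Parameter of this Conv2d is a four-dimensional tensor, even though the operation is a 2D convolution. Its shape is [1, 3, 3, 3] (1 output channel, that is, one kernel; 3 input channels; kernel size 3*3)

The convolution proceeds as follows: each of the 3 input channels is convolved with its own 3*3 slice of the kernel, and the three results are summed, together with the bias, into the single output channel.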

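3. Transposed convolution

nn.ConvTranspose2d(in_channels,
                   out_channels,
                   kernel_size,
                   stride=1,
                   padding=0,
                   output_padding=0,
                   groups=1,
                   bias=True,
                   dilation=1,
                   padding_mode='zeros')

Transposed convolution, also called fractionally-strided convolution, is used to upsample images

Note: although the matrix of a transposed-convolution kernel has the transposed shape of the matrix of the corresponding convolution kernel, their values are unrelated, so a transposed convolution does not invert a convolution.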

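conv_layer = nn.ConvTranspose2d(3, 1, 3, stride=2)   # input:(i, o, size)

# Size before transposed convolution: torch.Size([1, 3, 512, 512])
# Size after transposed convolution: torch.Size([1, 1, 1025, 1025])

The image becomes larger and shows a regular grid of gaps, known as the checkerboard effect of transposed convolution: https://www.jianshu.com/p/36ff39344de5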
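
The 1025 reported for the transposed convolution can be checked with the output-size formula from the PyTorch documentation for nn.ConvTranspose2d. A minimal sketch in plain Python; the helper name conv_transpose2d_out_size is hypothetical:

```python
def conv_transpose2d_out_size(size, kernel_size, stride=1, padding=0,
                              output_padding=0, dilation=1):
    """Output length along one spatial dimension of a transposed convolution."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

# The example above: 512 x 512 input, 3 x 3 kernel, stride 2.
upsampled = conv_transpose2d_out_size(512, kernel_size=3, stride=2)
print(upsampled)
```

With stride 2 the output is roughly twice the input size, which is why this layer is used for upsampling.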

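IV. nn Layers: Pooling, Linear, Activation Functions

1. Pooling layers

Max / average

nn.MaxPool2d(kernel_size,
             stride=None,
             padding=0,
             dilation=1,             # spacing between pooling-window elements
             return_indices=False,   # also return the indices of the pooled pixels
             ceil_mode=False         # round the output size up
             )

nn.AvgPool2d(kernel_size,
             stride=None,
             padding=0,
             ceil_mode=False,
             count_include_pad=True,   # include padding values in the average
             divisor_override=None     # custom divisor, used instead of the window size
             )

Pooling operation: "collects" a signal (many values become few) and "summarizes" it, much like a pool collecting water, hence the name pooling layer

Unpooling

nn.MaxUnpool2d(kernel_size,
               stride=None,
               padding=0
               )

forward(self, input, indices, output_size=None)

Function: upsamples a 2D signal (image) by max unpooling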
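
Max unpooling needs the indices recorded by a max-pooling layer created with return_indices=True. A minimal sketch on a hand-written 4 x 4 input chosen only to make the positions easy to follow:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.tensor([[[[1., 2., 3., 4.],
                    [5., 6., 7., 8.],
                    [9., 10., 11., 12.],
                    [13., 14., 15., 16.]]]])

# pooled keeps the maximum of each 2 x 2 window; indices remembers where it came from.
pooled, indices = pool(x)

# Each maximum is written back to its original position; every other entry is zero.
restored = unpool(pooled, indices)

print(pooled)
print(restored)
```

The restored tensor has the original 4 x 4 size, but only the four maxima survive, so unpooling recovers positions rather than the full input.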

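2. Linear layer

nn.Linear(in_features,     # number of input nodes
          out_features,    # number of output nodes
          bias=True)

A linear layer, also called a fully connected layer, connects each of its neurons to every neuron in the previous layer and computes a linear combination (linear transformation) of the previous layer's outputs

Input = [1, 2, 3]  shape = (1, 3)

W_0: shape = (3, 4)

Hidden = Input * W_0 = [6, 12, 18, 24]  shape = (1, 4)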
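
The values of W_0 are not given in the text above. The sketch below assumes, purely for illustration, that every row of W_0 is [1, 2, 3, 4]; this is one matrix of shape (3, 4) that reproduces the stated result, not necessarily the one the author used:

```python
# Input of shape (1, 3), as in the example above.
inputs = [[1, 2, 3]]

# ASSUMPTION: W_0 is not specified in the source; these values are illustrative only.
w0 = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]

# Plain matrix product: hidden[i][j] = sum over k of inputs[i][k] * w0[k][j].
hidden = [
    [sum(row[k] * w0[k][j] for k in range(len(w0))) for j in range(len(w0[0]))]
    for row in inputs
]
print(hidden)
```

Note that nn.Linear stores its weight with shape (out_features, in_features) and multiplies by the transpose, so the matrix above would correspond to the transpose of the layer's weight attribute.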

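3. Activation function layers

Activation functions apply a nonlinear transformation to the features; without them, stacked linear layers would collapse into a single linear layer, so they are what gives depth its meaning in a multi-layer network

nn.Sigmoid

Formula: \(y = \frac{1}{1+e^{-x}}\)

Gradient: \(y' = y(1-y)\)

Properties:

Output lies in (0, 1), matching a probability
Derivative lies in (0, 0.25], which easily causes vanishing gradients
Output is not zero-mean, which distorts the data distribution

nn.Tanh

Formula: \(y = \frac{\sinh x}{\cosh x} = \frac{e^x-e^{-x}}{e^x+e^{-x}} = \frac{2}{1+e^{-2x}}-1\)

Gradient: \(y' = 1 - y^2\)

Properties:

Output lies in (-1, 1), and the data is zero-mean
Derivative lies in (0, 1], which easily causes vanishing gradients

nn.ReLU

Formula: \(y = \max(0, x)\)

Gradient: \(y' = \begin{cases} 1, & x > 0 \\ \text{undefined}, & x = 0 \\ 0, & x < 0 \end{cases}\)

Properties:

Output is never negative; the negative half-axis can produce dead neurons
The derivative is 1 on the positive half-axis, which mitigates vanishing gradients but can easily lead to exploding gradients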
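
The derivative ranges quoted above can be checked numerically. A minimal sketch in plain Python; the helper names are hypothetical, and taking the ReLU gradient at x = 0 to be 0 is a convention assumed here, since the derivative is undefined at that point:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    y = sigmoid(x)
    return y * (1.0 - y)            # y' = y(1 - y)

def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2  # y' = 1 - y^2

def relu_grad(x):
    return 1.0 if x > 0 else 0.0    # assumed convention: 0 at x = 0

# Sample x from -10 to 10 in steps of 0.1 (x = 0 is included).
xs = [i / 10 for i in range(-100, 101)]

max_sigmoid_grad = max(sigmoid_grad(x) for x in xs)   # reached at x = 0
max_tanh_grad = max(tanh_grad(x) for x in xs)         # reached at x = 0

# tanh can also be written as 2 / (1 + e^{-2x}) - 1, i.e. 2 * sigmoid(2x) - 1.
identity_error = max(abs(math.tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0)) for x in xs)

print(max_sigmoid_grad, max_tanh_grad, identity_error)
```

Because the sigmoid gradient never exceeds 0.25, a product of many such factors in backpropagation shrinks quickly, which is the vanishing-gradient problem mentioned above.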

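nn.LeakyReLU

negative_slope: slope of the negative half-axis

nn.PReLU

init: initial value of the learnable slope

nn.RReLU

lower: lower bound of the uniform distribution
upper: upper bound of the uniform distribution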
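
The three ReLU variants differ only in how the negative-side slope is chosen. A minimal sketch; the slope values 0.1, 0.25, 0.1, and 0.3 are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, 0.0, 3.0])

leaky = nn.LeakyReLU(negative_slope=0.1)   # fixed slope on the negative half-axis
prelu = nn.PReLU(init=0.25)                # slope is a learnable parameter, starting at 0.25
rrelu = nn.RReLU(lower=0.1, upper=0.3)     # slope sampled from U(lower, upper) in training
rrelu.eval()                               # in eval mode the slope is fixed at (lower + upper) / 2

leaky_out = leaky(x).tolist()
prelu_out = prelu(x).detach().tolist()
rrelu_out = rrelu(x).tolist()

# PReLU is the only variant here that adds a learnable parameter.
prelu_param_count = sum(p.numel() for p in prelu.parameters())

print(leaky_out, prelu_out, rrelu_out, prelu_param_count)
```

Positive inputs pass through unchanged in all three cases; only the value at -2.0 differs.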