The Love-Hate Relationship Between PyTorch and TensorFlow: Parameter Initialization

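PyTorch version: 1.6.0

TensorFlow version: 1.15.0

Parameter initialization mostly comes down to sampling from mathematical distributions, such as the normal distribution and the uniform distribution.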

1. PyTorch

(1) Custom trainable parameters

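torch.bernoulli(input, out=None) → Tensor	Draws binary random numbers (0 or 1) from a Bernoulli distribution
torch.multinomial(input, num_samples, replacement=False, out=None) → LongTensor	Returns a tensor in which each row contains num_samples indices sampled from the multinomial distribution given by the corresponding row of input
torch.normal(means, std, out=None)	Returns a tensor of random numbers drawn from separate normal distributions with the given means and standard deviations
torch.normal(mean=0.0, std, out=None)	Same as above, but all drawn elements share one mean
torch.normal(means, std=1.0, out=None)	Same as above, but all drawn elements share one standard deviation
torch.rand(*sizes, out=None) → Tensor	Returns a tensor filled with random numbers from the uniform distribution on [0, 1); the shape is given by the variadic argument sizes
torch.randn(*sizes, out=None) → Tensor	Returns a tensor filled with random numbers from the normal distribution with mean 0 and variance 1; the shape is given by the variadic argument sizes
torch.randperm(n, out=None) → LongTensor	Returns a random permutation of the integers from 0 to n - 1
In-place random sampling
torch.Tensor.bernoulli_()	In-place version of torch.bernoulli()
torch.Tensor.cauchy_()	Fills the tensor with numbers drawn from the Cauchy distribution
torch.Tensor.exponential_()	Fills the tensor with numbers drawn from the exponential distribution
torch.Tensor.geometric_()	Fills the tensor with elements drawn from the geometric distribution
torch.Tensor.log_normal_()	Fills the tensor with samples from the log-normal distribution
torch.Tensor.normal_()	In-place version of torch.normal()
torch.Tensor.random_()	Fills the tensor with numbers sampled from the discrete uniform distribution
torch.Tensor.uniform_()	Fills the tensor with numbers sampled from the continuous uniform distribution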

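Note: functions with a trailing underscore, such as normal_(), operate in place, i.e. they overwrite the data of the tensor they are called on.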
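As a minimal sketch of what "in place" means here (the tensor shape and the seed are arbitrary choices for illustration), the following compares an in-place sampler with an out-of-place one:

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable

t = torch.empty(2, 3)          # uninitialized storage
returned = t.uniform_(0, 1)    # overwrites t itself and returns the same tensor
same_object = returned is t

fresh = torch.rand(2, 3)       # out-of-place: allocates and returns a new tensor

print(same_object)             # True
print(t)
print(fresh)
```

Because the in-place call returns the tensor it modified, calls such as torch.empty(3, 3).normal_(0, 1) can be chained in a single expression.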

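There are of course also functions such as torch.zeros() and torch.ones(), together with the in-place initializers nn.init.zeros_() and nn.init.ones_().

The following examples initialize parameters using these distributions: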

a = torch.Tensor(3, 3).bernoulli_()

tensor([[1., 1., 1.],

        [0., 1., 0.],

        [0., 1., 0.]])

a = torch.Tensor(3, 3).normal_(0,1)

tensor([[ 0.7777,  0.9153, -0.1495],

        [-0.0533,  1.6500, -1.2531],

        [-0.5321,  0.1954, -1.3835]])

Next we pass it to torch.tensor() and enable gradient computation:

b = torch.tensor(a,requires_grad=True)

E:\anaconda2\envs\python36\lib\site-packages\ipykernel_launcher.py:1: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).

  """Entry point for launching an IPython kernel.

Out[7]:

tensor([[ 0.7777,  0.9153, -0.1495],

        [-0.0533,  1.6500, -1.2531],

        [-0.5321,  0.1954, -1.3835]], requires_grad=True)

This raises the warning shown above. Following its suggestion, we change the code to:

c = a.clone().detach().requires_grad_(True)

The result is the same:

tensor([[ 0.7777,  0.9153, -0.1495],

        [-0.0533,  1.6500, -1.2531],

        [-0.5321,  0.1954, -1.3835]], requires_grad=True)
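To check that a tensor built this way really takes part in gradient computation, here is a small sketch. The loss function (a sum of squares) is an assumption chosen only because its gradient, 2c, is easy to verify by hand.

```python
import torch

torch.manual_seed(0)  # arbitrary seed for repeatability

a = torch.empty(3, 3).normal_(0, 1)           # plain tensor, no gradient tracking
c = a.clone().detach().requires_grad_(True)   # trainable copy, as recommended by the warning

loss = (c ** 2).sum()   # illustrative loss; d(loss)/dc = 2 * c
loss.backward()

grad_matches = torch.allclose(c.grad, 2 * c.detach())
print(a.requires_grad, c.requires_grad, c.is_leaf)  # False True True
print(grad_matches)                                 # True
```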

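(2) Initializing layer parameters in a network

In PyTorch, the default initialization of each layer's parameters lives in that layer's reset_parameters() method.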
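As a quick illustration on a built-in layer (nn.Linear and its sizes are chosen here only as an example), calling reset_parameters() again redraws the weights with the layer's default scheme:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatability

layer = nn.Linear(4, 2)                  # the constructor already calls reset_parameters()
nn.init.constant_(layer.weight, 0.0)     # overwrite the default initialization
all_zero_before = bool((layer.weight == 0).all())

layer.reset_parameters()                 # restore the layer's default random initialization
all_zero_after = bool((layer.weight == 0).all())

print(all_zero_before, all_zero_after)   # True False
```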

import torch

import torch.nn as nn

import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self,input,hidden,classes):

        super(Net, self).__init__()

        self.input = input

        self.hidden = hidden

        self.classes = classes

        self.w0 = nn.Parameter(torch.Tensor(self.input,self.hidden))

        self.b0 = nn.Parameter(torch.Tensor(self.hidden))

        self.w1 = nn.Parameter(torch.Tensor(self.hidden,self.classes))

        self.b1 = nn.Parameter(torch.Tensor(self.classes))

        self.reset_parameters()

    def reset_parameters(self):

        nn.init.normal_(self.w0)

        nn.init.constant_(self.b0,0)

        nn.init.normal_(self.w1)

        nn.init.constant_(self.b1,0)

    def forward(self,x):

        out = torch.matmul(x,self.w0)+self.b0

        out = F.relu(out)

        out = torch.matmul(out,self.w1)+self.b1

        return out

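What nn.Parameter() does: it wraps a tensor so that it is registered as a parameter of the module, which lets the optimizer keep updating its value during training.

You can use the initialization methods in the torch.nn.init module: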
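A small sketch of the difference between an nn.Parameter and a plain tensor attribute (the Demo class and its attribute names are hypothetical, introduced only for this illustration):

```python
import torch
import torch.nn as nn

class Demo(nn.Module):  # hypothetical module used only for this illustration
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(2, 3))  # registered as a trainable parameter
        self.t = torch.zeros(2, 3)                # plain attribute, invisible to optimizers

demo = Demo()
names = [name for name, _ in demo.named_parameters()]

print(names)                   # ['w']
print(demo.w.requires_grad)    # True
print(demo.t.requires_grad)    # False
```

Only w would be handed to an optimizer built from demo.parameters(), so only w would change during training.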

w = torch.empty(2, 3)

# 1. Uniform distribution - U(a, b)

# torch.nn.init.uniform_(tensor, a=0, b=1)

nn.init.uniform_(w)

# tensor([[ 0.0578,  0.3402,  0.5034],

#         [ 0.7865,  0.7280,  0.6269]])

# 2. Normal distribution - N(mean, std)

# torch.nn.init.normal_(tensor, mean=0, std=1)

nn.init.normal_(w)

# tensor([[ 0.3326,  0.0171, -0.6745],

#        [ 0.1669,  0.1747,  0.0472]])

# 3. Constant - fixed value val

# torch.nn.init.constant_(tensor, val)

nn.init.constant_(w, 0.3)

# tensor([[ 0.3000,  0.3000,  0.3000],

#         [ 0.3000,  0.3000,  0.3000]])

# 4. Ones on the diagonal, zeros elsewhere

# torch.nn.init.eye_(tensor)

nn.init.eye_(w)

# tensor([[ 1.,  0.,  0.],

#         [ 0.,  1.,  0.]])

# 5. Dirac delta initialization, only for {3, 4, 5}-dimensional torch.Tensor

# torch.nn.init.dirac_(tensor)

w1 = torch.empty(3, 16, 5, 5)

nn.init.dirac_(w1)

# 6. xavier_uniform initialization

# torch.nn.init.xavier_uniform_(tensor, gain=1)

# From - Understanding the difficulty of training deep feedforward neural networks - Glorot & Bengio 2010

nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))

# tensor([[ 1.3374,  0.7932, -0.0891],

#         [-1.3363, -0.0206, -0.9346]])

# 7. xavier_normal initialization

# torch.nn.init.xavier_normal_(tensor, gain=1)

nn.init.xavier_normal_(w)

# tensor([[-0.1777,  0.6740,  0.1139],

#         [ 0.3018, -0.2443,  0.6824]])

# 8. kaiming_uniform initialization

# From - Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - HeKaiming 2015

# torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')

# tensor([[ 0.6426, -0.9582, -1.1783],

#         [-0.0515, -0.4975,  1.3237]])

# 9. kaiming_normal initialization

# torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')

nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')

# tensor([[ 0.2530, -0.4382,  1.5995],

#         [ 0.0544,  1.6392, -2.0752]])

# 10. Orthogonal matrix - (semi)orthogonal matrix

# From - Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe 2013

# torch.nn.init.orthogonal_(tensor, gain=1)

nn.init.orthogonal_(w)

# tensor([[ 0.5786, -0.5642, -0.5890],

#         [-0.7517, -0.0886, -0.6536]])

# 11. Sparse matrix

# Non-zero elements are drawn from a normal distribution with mean 0 and std 0.01.

# From - Deep learning via Hessian-free optimization - Martens 2010

# torch.nn.init.sparse_(tensor, sparsity, std=0.01)

nn.init.sparse_(w, sparsity=0.1)

# tensor(1.00000e-03 *

#        [[-0.3382,  1.9501, -1.7761],

#         [ 0.0000,  0.0000,  0.0000]])
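To make the Xavier and Kaiming uniform schemes in the listing above a little more concrete, the sketch below recomputes the sampling bounds by hand and checks the initialized values against them. It assumes the usual convention that for a 2-D weight fan_in is the number of columns and fan_out the number of rows; the seed and the small float tolerance are arbitrary.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatability

w = torch.empty(2, 3)                   # 2-D weight: fan_out = 2 rows, fan_in = 3 columns
fan_out, fan_in = w.shape
gain = nn.init.calculate_gain('relu')   # sqrt(2) for ReLU
tol = 1e-6                              # slack for float32 rounding

# Xavier uniform samples from U(-b, b) with b = gain * sqrt(6 / (fan_in + fan_out))
nn.init.xavier_uniform_(w, gain=gain)
xavier_bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
xavier_ok = bool((w.abs() <= xavier_bound + tol).all())

# Kaiming uniform (mode='fan_in') samples from U(-b, b) with b = gain * sqrt(3 / fan_in)
nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
kaiming_bound = gain * math.sqrt(3.0 / fan_in)
kaiming_ok = bool((w.abs() <= kaiming_bound + tol).all())

print(round(xavier_bound, 4), xavier_ok)
print(round(kaiming_bound, 4), kaiming_ok)
```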

For the parameters of PyTorch's built-in layers, we can initialize them like this:

for m in model.modules():

    if isinstance(m, (nn.Conv2d, nn.Linear)):

        nn.init.xavier_uniform_(m.weight)

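The code above iterates over every module in the model; if a module is an nn.Conv2d or an nn.Linear, it takes the weight m.weight and applies xavier_uniform initialization to it. The bias term can be accessed in the same way through m.bias. Now let's look at the parameter initialization code from the PyTorch implementation of the residual network (ResNet):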

for m in self.modules():

    if isinstance(m, nn.Conv2d):

        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):

        nn.init.constant_(m.weight, 1)

        nn.init.constant_(m.bias, 0)

This block is used inside __init__, so self here refers to the current model.

Reference:
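The two loops above can be tried on a small stand-alone model. The sketch below is an illustration under stated assumptions: the tiny nn.Sequential network and the init_weights helper are hypothetical, and it uses nn.Module.apply (a standard PyTorch method that the original code does not use) instead of an explicit loop over modules().

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatability

# Hypothetical toy model: 3x8x8 input -> conv 3x3 (no padding) -> 8x6x6 -> linear
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 6 * 6, 10),
)

def init_weights(m):
    """Initialize one module; called once per submodule by model.apply()."""
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:          # bias is optional on these layers
            nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)

model.apply(init_weights)
out = model(torch.randn(2, 3, 8, 8))
print(out.shape)  # torch.Size([2, 10])
```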

https://blog.csdn.net/ys1305/article/details/94332007

2. TensorFlow

(1) Custom parameter initialization

Create a 2x3 matrix with every element set to 0 (dtype tf.float32):

a = tf.zeros([2,3], dtype = tf.float32)

Create a 3x4 matrix with every element set to 1:

b = tf.ones([3,4])

Create a 1x10 matrix filled with 2 (the dtype is tf.int32; specifying it is optional):

c = tf.constant(2, dtype=tf.int32, shape=[1,10])

Create a 1x10 matrix whose elements follow a normal distribution with mean 20 and standard deviation 3:

d = tf.random_normal([1,10],mean = 20, stddev = 3)

All of the values above can be used to initialize variables. For example, initialize a variable named bias with a 1x2 matrix filled with 0.01:

bias = tf.Variable(tf.zeros([1,2]) + 0.01)

(2) Initializing with the *_initializer() classes

Initializing to constants

import tensorflow as tf

value = [0, 1, 2, 3, 4, 5, 6, 7]

init = tf.constant_initializer(value)

with tf.Session() as sess:

  x = tf.get_variable('x', shape=[8], initializer=init)

  x.initializer.run()

  print(x.eval())

#output:

#[ 0.  1.  2.  3.  4.  5.  6.  7.]

The tf.zeros_initializer() and tf.ones_initializer() classes initialize tensors to all zeros and all ones, respectively.

import tensorflow as tf

init_zeros=tf.zeros_initializer()

init_ones = tf.ones_initializer()

with tf.Session() as sess:

  x = tf.get_variable('x', shape=[8], initializer=init_zeros)

  y = tf.get_variable('y', shape=[8], initializer=init_ones)

  x.initializer.run()

  y.initializer.run()

  print(x.eval())

  print(y.eval())

#output:

# [ 0.  0.  0.  0.  0.  0.  0.  0.]

# [ 1.  1.  1.  1.  1.  1.  1.  1.]

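Initializing from a normal distribution

Initializing parameters from a normal distribution is the most common choice in neural networks. You can use either an ordinary normal distribution or a truncated normal distribution.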

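In TensorFlow, the tf.random_normal_initializer() class generates tensors that follow a normal distribution.

In TensorFlow, the tf.truncated_normal_initializer() class generates tensors that follow a truncated normal distribution.

mean: mean of the normal distribution, default 0
stddev: standard deviation of the normal distribution, default 1
seed: random seed; fixing the seed produces the same data on every run
dtype: data type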

import tensorflow as tf

init_random = tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

init_truncated = tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

with tf.Session() as sess:

  x = tf.get_variable('x', shape=[10], initializer=init_random)

  y = tf.get_variable('y', shape=[10], initializer=init_truncated)

  x.initializer.run()

  y.initializer.run()

  print(x.eval())

  print(y.eval())

#output:

# [-0.40236568 -0.35864913 -0.94253045 -0.40153521  0.1552504   1.16989613

#   0.43091929 -0.31410623  0.70080078 -0.9620409 ]

# [ 0.18356581 -0.06860946 -0.55245203  1.08850253 -1.13627422 -0.1006074

#   0.65564936  0.03948414  0.86558545 -0.4964745 ]
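The difference between the two initializers above is that the truncated version discards values more than two standard deviations from the mean and draws them again. The following is only a NumPy sketch of that idea, not TensorFlow's actual implementation; the function name truncated_normal and its signature are hypothetical.

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Sample a normal distribution, redrawing values beyond 2 standard deviations."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mean, stddev, size=shape)
    outside = np.abs(samples - mean) > 2 * stddev
    while outside.any():
        # Redraw only the offending entries and check them again.
        samples[outside] = rng.normal(mean, stddev, size=int(outside.sum()))
        outside = np.abs(samples - mean) > 2 * stddev
    return samples

values = truncated_normal((1000,), mean=0.0, stddev=1.0, seed=0)
print(values.min(), values.max())  # both lie within [-2, 2]
```

Cutting off the tails avoids the occasional very large initial weight that a plain normal distribution can produce.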

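Initializing from a uniform distribution

In TensorFlow, the tf.random_uniform_initializer class generates tensors that follow a uniform distribution.

minval: minimum value
maxval: maximum value
seed: random seed
dtype: data type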

import tensorflow as tf

init_uniform = tf.random_uniform_initializer(minval=0, maxval=10, seed=None, dtype=tf.float32)

with tf.Session() as sess:

  x = tf.get_variable('x', shape=[10], initializer=init_uniform)

  x.initializer.run()

  print(x.eval())

# output:

# [ 6.93343639  9.41196823  5.54009819  1.38017178  1.78720832  5.38881063

#   3.39674473  8.12443542  0.62157512  8.36026382]

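Some others:

tf.orthogonal_initializer() initializes to a random orthogonal matrix; the shape must have at least two dimensions

tf.glorot_uniform_initializer() initializes with uniformly distributed random numbers whose range depends on the number of input and output units

tf.glorot_normal_initializer() initializes with truncated-normal random numbers whose spread depends on the number of input and output units

In use: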

with tf.Session() as sess:

    init_op = tf.global_variables_initializer()

    sess.run(init_op)

Running this op initializes all the variables.
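Putting the pieces together, here is a hedged end-to-end sketch: define variables with initializers, run tf.global_variables_initializer(), then read the values. It imports tensorflow.compat.v1 on the assumption that this keeps the code in graph mode on 1.15 and on 2.x alike; the variable names, shapes, and seed are arbitrary. The expected limit uses the Glorot uniform formula sqrt(6 / (fan_in + fan_out)).

```python
import math
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # graph mode, as in TensorFlow 1.x

graph = tf.Graph()
with graph.as_default():
    # Shape [3, 2]: fan_in = 3, fan_out = 2
    w = tf.get_variable("w", shape=[3, 2],
                        initializer=tf.glorot_uniform_initializer(seed=0))
    b = tf.get_variable("b", shape=[2], initializer=tf.zeros_initializer())
    init_op = tf.global_variables_initializer()  # covers every variable in the graph

with tf.Session(graph=graph) as sess:
    sess.run(init_op)
    w_val, b_val = sess.run([w, b])

limit = math.sqrt(6.0 / (3 + 2))  # Glorot uniform bound for this shape
print(w_val)
print(b_val)
```

Using a dedicated tf.Graph() also avoids the "variable already exists" error that appears when several of the snippets above are run in the same default graph.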

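Addendum: the names of the two functions already hint at the difference. tf.Variable always defines a new variable, whereas tf.get_variable looks a variable up by name in the current variable scope and, depending on the scope's reuse setting, either returns the existing variable or creates a new one.

For the detailed differences, see: https://blog.csdn.net/kevindree/article/details/86936476

Reference: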
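A minimal sketch of that difference, again written against tensorflow.compat.v1 as an assumption so that it runs in graph mode; the scope name "layer" and the variable names are made up for illustration:

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # graph mode, as in TensorFlow 1.x

graph = tf.Graph()
with graph.as_default():
    # tf.Variable always creates a new variable; a clashing name gets a suffix.
    v1 = tf.Variable(tf.zeros([2]), name="v")
    v2 = tf.Variable(tf.zeros([2]), name="v")

    # tf.get_variable creates the variable the first time ...
    with tf.variable_scope("layer"):
        w1 = tf.get_variable("w", shape=[2], initializer=tf.zeros_initializer())
    # ... and returns the existing one when the scope is re-entered with reuse=True.
    with tf.variable_scope("layer", reuse=True):
        w2 = tf.get_variable("w")

print(v1.name, v2.name)   # two distinct variables
print(w1.name, w2.name)   # the same variable both times
print(w1 is w2)           # True
```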

https://blog.csdn.net/dcrmg/article/details/80034075
