[Converge] Weight Initialiser
From: http://www.cnblogs.com/denny402/p/6932956.html
[, ] fully connected
w = tf.Variable(tf.truncated_normal([img_pixel_input, layersize], mean=0.0, stddev=1.0, dtype=tf.float32))
b = tf.Variable(tf.truncated_normal([layersize], mean=0.0, stddev=1.0, dtype=tf.float32))
Epoch , Training Loss: 2.10244838786, Test accuracy: 0.514423076923, time: .95s, total time: .8s
Epoch , Training Loss: 1.86659669154, Test accuracy: 0.640424679487, time: .54s, total time: .12s
Epoch , Training Loss: 1.80024383674, Test accuracy: 0.680989583333, time: .49s, total time: .47s
Epoch , Training Loss: 1.77303568244, Test accuracy: 0.699318910256, time: .53s, total time: .63s
Epoch , Training Loss: 1.75938568276, Test accuracy: 0.712740384615, time: .4s, total time: .81s
Epoch , Training Loss: 1.74897368638, Test accuracy: 0.718449519231, time: .57s, total time: .27s
Epoch , Training Loss: 1.7434025914, Test accuracy: 0.722355769231, time: .37s, total time: .52s
Epoch , Training Loss: 1.71330288407, Test accuracy: 0.792668269231, time: .37s, total time: .75s
Epoch , Training Loss: 1.66116618999, Test accuracy: 0.850560897436, time: .51s, total time: .08s
Epoch , Training Loss: 1.600759656, Test accuracy: 0.88030849359, time: .46s, total time: .32s
Epoch , Training Loss: 1.58312522976, Test accuracy: 0.892327724359, time: .52s, total time: .63s
Epoch , Training Loss: 1.5736670608, Test accuracy: 0.896534455128, time: .54s, total time: .01s
Epoch , Training Loss: 1.56778478539, Test accuracy: 0.905749198718, time: .47s, total time: .37s
Epoch , Training Loss: 1.56342586715, Test accuracy: 0.905548878205, time: .52s, total time: .71s
Epoch , Training Loss: 1.55950221926, Test accuracy: 0.906049679487, time: .54s, total time: .06s
Epoch , Training Loss: 1.55725609423, Test accuracy: 0.910356570513, time: .49s, total time: .45s
Epoch , Training Loss: 1.55490833146, Test accuracy: 0.911959134615, time: .58s, total time: .89s
Epoch , Training Loss: 1.55294992346, Test accuracy: 0.913561698718, time: .56s, total time: .26s
Epoch , Training Loss: 1.55085181106, Test accuracy: 0.916967147436, time: .51s, total time: .63s
Epoch , Training Loss: 1.54926108397, Test accuracy: 0.911858974359, time: .52s, total time: .0s
Total training time: .0s
w = tf.Variable(tf.truncated_normal([img_pixel_input, layersize], mean=0.01, stddev=1.0, dtype=tf.float32))
b = tf.Variable(tf.truncated_normal([layersize], mean=0.01, stddev=1.0, dtype=tf.float32))
Epoch , Training Loss: 2.1900485101, Test accuracy: 0.443008814103, time: .84s, total time: .75s
Epoch , Training Loss: 1.93756918807, Test accuracy: 0.599659455128, time: .48s, total time: .1s
Epoch , Training Loss: 1.84595911986, Test accuracy: 0.653145032051, time: .47s, total time: .35s
Epoch , Training Loss: 1.8073041603, Test accuracy: 0.682291666667, time: .49s, total time: .63s
Epoch , Training Loss: 1.78734811036, Test accuracy: 0.688601762821, time: .43s, total time: .86s
Epoch , Training Loss: 1.7739427098, Test accuracy: 0.700520833333, time: .43s, total time: .02s
Epoch , Training Loss: 1.76551306776, Test accuracy: 0.711738782051, time: .34s, total time: .08s
Epoch , Training Loss: 1.74105782025, Test accuracy: 0.794771634615, time: .47s, total time: .33s
Epoch , Training Loss: 1.67201814229, Test accuracy: 0.808894230769, time: .53s, total time: .68s
Epoch , Training Loss: 1.66241001194, Test accuracy: 0.811698717949, time: .49s, total time: .0s
Epoch , Training Loss: 1.65713534489, Test accuracy: 0.814202724359, time: .54s, total time: .35s
Epoch , Training Loss: 1.65359901187, Test accuracy: 0.820713141026, time: .58s, total time: .73s
Epoch , Training Loss: 1.6501801603, Test accuracy: 0.820012019231, time: .49s, total time: .08s
Epoch , Training Loss: 1.64807084891, Test accuracy: 0.821915064103, time: .5s, total time: .41s
Epoch , Training Loss: 1.64611155364, Test accuracy: 0.821314102564, time: .54s, total time: .79s
Epoch , Training Loss: 1.62634825317, Test accuracy: 0.899539262821, time: .51s, total time: .05s
Epoch , Training Loss: 1.56398414065, Test accuracy: 0.909755608974, time: .41s, total time: .26s
Epoch , Training Loss: 1.55725724714, Test accuracy: 0.912459935897, time: .51s, total time: .57s
Epoch , Training Loss: 1.55478919553, Test accuracy: 0.91796875, time: .55s, total time: .95s
Epoch , Training Loss: 1.55242318568, Test accuracy: 0.917367788462, time: .5s, total time: .25s
Total training time: .25s
w = tf.Variable(tf.truncated_normal([img_pixel_input, layersize], mean=0.01, stddev=5.0, dtype=tf.float32))
b = tf.Variable(tf.truncated_normal([layersize], mean=0.01, stddev=1.0, dtype=tf.float32))
Epoch , Training Loss: 2.39008372369, Test accuracy: 0.0950520833333, time: .94s, total time: .65s
Epoch , Training Loss: 2.33227054167, Test accuracy: 0.153245192308, time: .54s, total time: .96s
Epoch , Training Loss: 2.28677356104, Test accuracy: 0.186498397436, time: .42s, total time: .25s
Epoch , Training Loss: 2.23217486891, Test accuracy: 0.269831730769, time: .38s, total time: .4s
Epoch , Training Loss: 2.13864973875, Test accuracy: 0.351061698718, time: .47s, total time: .65s
Epoch , Training Loss: 2.07637035874, Test accuracy: 0.401041666667, time: .58s, total time: .06s
Epoch , Training Loss: 2.04344919623, Test accuracy: 0.426582532051, time: .46s, total time: .29s
Epoch , Training Loss: 2.02300423842, Test accuracy: 0.44140625, time: .52s, total time: .58s
Epoch , Training Loss: 2.00804452852, Test accuracy: 0.455428685897, time: .45s, total time: .83s
Epoch , Training Loss: 1.99567352781, Test accuracy: 0.468549679487, time: .51s, total time: .19s
Epoch , Training Loss: 1.98683612969, Test accuracy: 0.476462339744, time: .59s, total time: .59s
Epoch , Training Loss: 1.980189987, Test accuracy: 0.485677083333, time: .57s, total time: .98s
Epoch , Training Loss: 1.97373542863, Test accuracy: 0.491185897436, time: .52s, total time: .29s
Epoch , Training Loss: 1.967556376, Test accuracy: 0.491887019231, time: .61s, total time: .7s
Epoch , Training Loss: 1.96045698958, Test accuracy: 0.497395833333, time: .49s, total time: .97s
Epoch , Training Loss: 1.95221617978, Test accuracy: 0.517528044872, time: .49s, total time: .18s
Epoch , Training Loss: 1.93845896371, Test accuracy: 0.521534455128, time: .46s, total time: .35s
Epoch , Training Loss: 1.92538965999, Test accuracy: 0.539963942308, time: .43s, total time: .5s
Epoch , Training Loss: 1.91551751801, Test accuracy: 0.546173878205, time: .43s, total time: .77s
Epoch , Training Loss: 1.90569505908, Test accuracy: 0.555989583333, time: .47s, total time: .05s
Total training time: .05s
[784, 10]
Small variance is the preferred choice! The comparison shows the results are much better.
w = tf.Variable(tf.truncated_normal([img_pixel_input, layersize], mean=0.0, stddev=0.01, dtype=tf.float32))
b = tf.Variable(tf.truncated_normal([layersize ], mean=0.0, stddev=1.00, dtype=tf.float32))
Epoch , Training Loss: 1.75807115331, Test accuracy: 0.895833333333, time: .84s, total time: .66s
Epoch , Training Loss: 1.60653405506, Test accuracy: 0.909855769231, time: .56s, total time: .02s
Epoch , Training Loss: 1.58358776375, Test accuracy: 0.913661858974, time: .55s, total time: .37s
Epoch , Training Loss: 1.57199550759, Test accuracy: 0.918669871795, time: .53s, total time: .69s
Epoch , Training Loss: 1.56478386464, Test accuracy: 0.921173878205, time: .49s, total time: .04s
Epoch , Training Loss: 1.55968606111, Test accuracy: 0.920773237179, time: .47s, total time: .36s
Epoch , Training Loss: 1.55553424692, Test accuracy: 0.923177083333, time: .47s, total time: .72s
Epoch , Training Loss: 1.55266008566, Test accuracy: 0.926181891026, time: .5s, total time: .94s
Epoch , Training Loss: 1.54992543289, Test accuracy: 0.92578125, time: .62s, total time: .36s
Epoch , Training Loss: 1.54779823315, Test accuracy: 0.928886217949, time: .59s, total time: .78s
Epoch , Training Loss: 1.5458223992, Test accuracy: 0.928084935897, time: .59s, total time: .26s
Epoch , Training Loss: 1.5444233951, Test accuracy: 0.925681089744, time: .61s, total time: .69s
Epoch , Training Loss: 1.54265678151, Test accuracy: 0.928886217949, time: .28s, total time: .67s
Epoch , Training Loss: 1.54120515999, Test accuracy: 0.929186698718, time: .33s, total time: .76s
Epoch , Training Loss: 1.54076098256, Test accuracy: 0.930889423077, time: .4s, total time: .99s
Epoch , Training Loss: 1.53926875958, Test accuracy: 0.928685897436, time: .54s, total time: .34s
Epoch , Training Loss: 1.53855588386, Test accuracy: 0.931490384615, time: .35s, total time: .48s
Epoch , Training Loss: 1.53713625878, Test accuracy: 0.932091346154, time: .5s, total time: .76s
Epoch , Training Loss: 1.53662548226, Test accuracy: 0.932892628205, time: .53s, total time: .14s
Epoch , Training Loss: 1.53609782221, Test accuracy: 0.930989583333, time: .49s, total time: .45s
Total training time: .45s
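A rough back-of-the-envelope check of why the small stddev converges better. This is a plain-Python sketch, not part of the original experiment; it assumes 784 inputs drawn uniformly from [0, 1], matching the [784, 10] layer above. The spread of a pre-activation z = Σ wᵢxᵢ scales with sqrt(n) * stddev, so stddev=1.0 pushes the softmax deep into saturation while stddev=0.01 keeps it in a trainable range:

```python
import math
import random

random.seed(0)
n_in = 784  # number of input pixels, as in the [784, 10] layer above

def preactivation_spread(stddev, trials=500):
    # Empirical std of z = sum_i w_i * x_i with w ~ N(0, stddev^2), x ~ U(0, 1).
    # Analytically Var(z) = n_in * stddev^2 * E[x^2] = n_in * stddev^2 / 3.
    zs = []
    for _ in range(trials):
        z = sum(random.gauss(0.0, stddev) * random.random() for _ in range(n_in))
        zs.append(z)
    mean = sum(zs) / trials
    return math.sqrt(sum((z - mean) ** 2 for z in zs) / trials)

print(preactivation_spread(1.0))   # roughly sqrt(784 / 3) ~ 16
print(preactivation_spread(0.01))  # roughly 0.16
```

With stddev=1.0 the logits swing by tens of units, which is exactly the regime where the first logs above crawl for many epochs before recovering.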
The most important things in a CNN are its parameters, W and b. The ultimate goal of training a CNN is to obtain the best parameters, the ones that minimize the objective function. Initializing the parameters well matters just as much, which is why fine-tuning receives so much attention. So which initialization methods does TF provide, and can we define our own?
All initializers are defined in tensorflow/python/ops/init_ops.py.
1. tf.constant_initializer()
Can also be abbreviated as tf.Constant().
Initializes to a constant. This is very useful; bias terms are typically initialized with it.
Two initializers derive from it:
- tf.zeros_initializer(), also abbreviated tf.Zeros()
- tf.ones_initializer(), also abbreviated tf.Ones()
Example: in a convolutional layer, initializing the bias b to 0 can be written in several ways:
conv1 = tf.layers.conv2d(batch_images,
                         filters=64,
                         kernel_size=7,
                         strides=2,
                         activation=tf.nn.relu,
                         kernel_initializer=tf.TruncatedNormal(stddev=0.01),
                         bias_initializer=tf.Constant(0))
or:
bias_initializer=tf.constant_initializer(0)
or:
bias_initializer=tf.zeros_initializer()
or:
bias_initializer=tf.Zeros()
Example: how do you initialize W as a Laplacian kernel?
value = [1, 1, 1, 1, -8, 1, 1, 1, 1]
init = tf.constant_initializer(value)
W = tf.get_variable('W', shape=[3, 3], initializer=init)
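A quick NumPy sanity check (an illustration added here, not from the original post) of why this particular kernel is a Laplacian: its entries sum to zero, so it responds with 0 on flat regions and fires only where the intensity changes:

```python
import numpy as np

# The 3x3 Laplacian kernel from the constant_initializer example above
W = np.array([1, 1, 1, 1, -8, 1, 1, 1, 1], dtype=np.float32).reshape(3, 3)

# Entries sum to zero, so a flat (constant) 3x3 patch yields zero response:
flat = np.full((3, 3), 7.0, dtype=np.float32)
print(float((W * flat).sum()))  # 0.0

# A vertical edge (left two columns 0, right column 1) yields a nonzero response:
edge = np.array([[0, 0, 1]] * 3, dtype=np.float32)
print(float((W * edge).sum()))  # 3.0
```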
2. tf.truncated_normal_initializer()
Can be abbreviated as tf.TruncatedNormal().
Generates random numbers from a truncated normal distribution; this seems to be the most widely used initializer in TF.
It takes four parameters (mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32), specifying the mean, standard deviation, random seed, and dtype of the random numbers; usually only stddev needs to be set.
Example:
conv1 = tf.layers.conv2d(batch_images,
                         filters=64,
                         kernel_size=7,
                         strides=2,
                         activation=tf.nn.relu,
                         kernel_initializer=tf.TruncatedNormal(stddev=0.01),
                         bias_initializer=tf.Constant(0))
or:
conv1 = tf.layers.conv2d(batch_images,
                         filters=64,
                         kernel_size=7,
                         strides=2,
                         activation=tf.nn.relu,
                         kernel_initializer=tf.truncated_normal_initializer(stddev=0.01),
                         bias_initializer=tf.zeros_initializer())
3. tf.random_normal_initializer()
Can be abbreviated as tf.RandomNormal().
Generates normally distributed (not truncated) random numbers; the parameters are the same as those of truncated_normal_initializer.
4. tf.random_uniform_initializer()
Can be abbreviated as tf.RandomUniform().
Generates uniformly distributed random numbers. It takes four parameters (minval=0, maxval=None, seed=None, dtype=dtypes.float32), specifying the minimum, maximum, random seed, and dtype.
5. tf.uniform_unit_scaling_initializer()
Can be abbreviated as tf.UniformUnitScaling().
Much like the uniform initializer, except the minimum and maximum are not specified directly but computed. The parameters are (factor=1.0, seed=None, dtype=dtypes.float32), and
max_val = math.sqrt(3 / input_size) * factor
where input_size is the dimensionality of the input: for an input x and the operation x * W, input_size = W.shape[0].
The sampling interval is [-max_val, max_val].
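Plugging the layer size used in the experiments above into the formula (a hand calculation mirroring the stated formula, not a call into TF):

```python
import math

def uniform_unit_scaling_bound(input_size, factor=1.0):
    # max_val = sqrt(3 / input_size) * factor, per the formula above;
    # weights are then sampled from U(-max_val, max_val)
    return math.sqrt(3.0 / input_size) * factor

# For the [784, 10] layer above, input_size = W.shape[0] = 784:
print(round(uniform_unit_scaling_bound(784), 4))  # 0.0619
```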
6. tf.variance_scaling_initializer()
Can be abbreviated as tf.VarianceScaling().
The parameters are (scale=1.0, mode="fan_in", distribution="normal", seed=None, dtype=dtypes.float32):
scale: scaling factor (a positive float)
mode: one of "fan_in", "fan_out", "fan_avg"; used when computing the standard deviation stddev.
distribution: the distribution type, either "normal" or "uniform".
When distribution="normal", random numbers are drawn from a truncated normal distribution with stddev = sqrt(scale / n), where n depends on mode:
if mode = "fan_in", n is the number of input units;
if mode = "fan_out", n is the number of output units;
if mode = "fan_avg", n is the average of the input and output unit counts.
When distribution="uniform", random numbers are drawn from a uniform distribution over [-limit, limit], where
limit = sqrt(3 * scale / n)
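The stddev/limit rules above are easy to check by hand. This sketch recomputes them for the [784, 10] layer used in the experiments (it mirrors the documented formulas, not the library internals):

```python
import math

def variance_scaling_stddev(scale, mode, fan_in, fan_out):
    # n is chosen by mode, then stddev = sqrt(scale / n) (the "normal" case above)
    n = {"fan_in": fan_in,
         "fan_out": fan_out,
         "fan_avg": (fan_in + fan_out) / 2.0}[mode]
    return math.sqrt(scale / n)

fan_in, fan_out = 784, 10
print(round(variance_scaling_stddev(1.0, "fan_in", fan_in, fan_out), 4))   # 0.0357
print(round(variance_scaling_stddev(1.0, "fan_avg", fan_in, fan_out), 4))  # 0.0502
# The "uniform" case instead samples U(-limit, limit) with limit = sqrt(3 * scale / n):
print(round(math.sqrt(3 * 1.0 / fan_in), 4))                               # 0.0619
```

Note that with mode="fan_in" and scale=1.0 this lands near the hand-tuned stddev=0.01 regime that worked well in the logs above, which is the point of the method.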
7. tf.orthogonal_initializer()
Can be abbreviated as tf.Orthogonal().
Generates a random orthogonal matrix.
When the parameter to initialize is 2-D, the orthogonal matrix is obtained by taking the SVD of a matrix of uniformly distributed random numbers.
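The SVD construction can be sketched in NumPy (an illustration of the idea, not TF's actual implementation): take a random matrix, keep an orthogonal factor from its SVD, and verify that QᵀQ = I:

```python
import numpy as np

rng = np.random.default_rng(0)

# Start from a uniformly distributed random matrix, as described above
A = rng.uniform(-1.0, 1.0, size=(4, 4))

# SVD gives A = U diag(s) V^T with U and V^T orthogonal; U @ V^T is orthogonal too
U, _, Vt = np.linalg.svd(A)
Q = U @ Vt

# Q^T Q should be (numerically) the identity
print(np.allclose(Q.T @ Q, np.eye(4)))  # True
```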
8. tf.glorot_uniform_initializer()
Also known as the Xavier uniform initializer; it draws from a uniform distribution.
If the uniform interval is [-limit, limit], then
limit = sqrt(6 / (fan_in + fan_out))
where fan_in and fan_out are the numbers of input and output units, respectively.
9. tf.glorot_normal_initializer()
Also known as the Xavier normal initializer; it draws from a truncated normal distribution with
stddev = sqrt(2 / (fan_in + fan_out))
where fan_in and fan_out are the numbers of input and output units, respectively.
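For the [784, 10] layer from the experiments, the two Xavier variants work out as follows (a hand calculation of the formulas above):

```python
import math

fan_in, fan_out = 784, 10  # the [784, 10] layer used in the experiments above

# glorot_uniform: sample from U(-limit, limit)
limit = math.sqrt(6.0 / (fan_in + fan_out))
# glorot_normal: sample from a truncated normal with this stddev
stddev = math.sqrt(2.0 / (fan_in + fan_out))

print(round(limit, 4))   # 0.0869
print(round(stddev, 4))  # 0.0502
```

Both land in the same small-variance range that the manual stddev=0.01 experiment above converged with, which is why these are common defaults.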
Visualizing the weights for debugging (worth doing to reproduce the experimental results)
[Converge] Training Neural Networks
Reproduction:
in progress...