Last week we looked at how the classic CNN AlexNet performs on image classification. In 2014, two years after AlexNet, the University of Oxford proposed the VGG network, which took second place in the classification task of ILSVRC 2014 (first place went to GoogLeNet, proposed in the same year). In the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition", the authors show how shrinking the convolution kernels makes it possible to build much deeper networks.


VGG Network Architecture

  VGGNet comes from the Visual Geometry Group at Oxford. Its main contribution at ILSVRC 2014 was to show that increasing network depth can, to a certain extent, improve final performance. As the figure below shows, the paper improves accuracy simply by stacking more and more layers; it looks a little brute-force, with few clever tricks, but it genuinely works, and many pretrained backbones are VGG models (mainly VGG-16 and VGG-19). Compared with other architectures, VGG has a very large parameter space, so training a VGG model usually takes much longer, but the publicly released pretrained models make it very convenient to use. The configurations from the paper are as follows:

Figure 1: VGG network configurations

  Columns D and E in the figure are VGG-16 and VGG-19, with roughly 138M and 144M parameters respectively; they are the two best-performing configurations in the paper. The VGG architecture can be viewed as a deeper version of AlexNet. VGG also works very well as a backbone for object detection (e.g. Faster R-CNN), because this plain, sequential structure preserves the local spatial information of the image relatively well (unlike GoogLeNet, whose Inception modules may scramble spatial information).
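  To see where the 138M figure for VGG-16 comes from, here is a quick back-of-the-envelope count (my own sketch, weights only, biases ignored), assuming the standard 224×224 ImageNet configuration:

 # Rough weight count for VGG-16 (224x224 input, 1000 classes), biases ignored.
 convs = [(3, 64), (64, 64),                    # block 1
          (64, 128), (128, 128),                # block 2
          (128, 256), (256, 256), (256, 256),   # block 3
          (256, 512), (512, 512), (512, 512),   # block 4
          (512, 512), (512, 512), (512, 512)]   # block 5
 conv_params = sum(3 * 3 * c_in * c_out for c_in, c_out in convs)   # ~14.7M
 fc_params = 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000         # ~123.6M
 print((conv_params + fc_params) / 1e6)  # ~138.3 -> the ~138M quoted above

  Almost 90% of those weights sit in the three fully connected layers, which is a big part of why VGG is so heavy to train.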

  Let's take a closer look at the VGG-16 architecture:

Figure 2: VGG-16 network architecture

  As the figure shows, every convolutional layer uses small 3×3 kernels, and these small kernels are arranged into convolutional sequences. Put simply, the input image is convolved with a 3×3 kernel, then convolved with another 3×3 kernel, and so on: small kernels are applied to the image several times in a row.

  In AlexNet we started with a large 11×11 kernel, so why switch to small 3×3 kernels here, and why stack several of them consecutively? When VGG appeared it went against the design principle of LeNet, which held that large kernels (with weight sharing) could capture similar structures in an image. AlexNet likewise used large 11×11 kernels in its shallow layers and tried to avoid 1×1 kernels there. The trick of VGG is that a stack of several 3×3 convolutions can imitate the local receptive field of a much larger kernel. This idea of chaining small kernels was later picked up by GoogLeNet, ResNet, and others.

  The results in Figure 1 also show that VGG extracts high-level features with stacks of 3×3 convolutions. With a larger kernel, both the parameter count and the computation grow rapidly: a 3×3 kernel has only 9 weights per input/output channel pair, while a 7×7 kernel has 49. Without a mechanism to regularize or constrain such a large number of parameters, training a network with larger kernels becomes very difficult.

  VGG takes the view that large kernels waste computation: smaller kernels mean fewer parameters and lower computational cost. Even though the deeper stack takes longer to train, overall both the parameter count and the inference cost are reduced.
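  To make the trade-off concrete, here is a small sketch (my own illustration, not code from the paper) comparing the weight count of a single 7×7 convolution with three stacked 3×3 convolutions covering the same 7×7 receptive field, assuming C input and C output channels:

 # One 7x7 conv vs. three stacked 3x3 convs over the same 7x7 receptive field.
 def conv_weights(k, c_in, c_out):
     return k * k * c_in * c_out

 C = 256
 big = conv_weights(7, C, C)         # 3,211,264 weights for one 7x7 layer
 small = 3 * conv_weights(3, C, C)   # 1,769,472 weights for three 3x3 layers
 print(small / float(big))           # ~0.55: roughly half the parameters,
                                     # plus two extra ReLU non-linearities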


Advantages of VGG

  Compared with AlexNet:

  • Similarities

    • The overall structure is divided into five stages;
    • Apart from the softmax layer, the last few layers are fully connected;
    • The five stages are connected by max pooling.
  • Differences

    • Small 3×3 kernels replace the large 7×7 kernels, and the network is built much deeper;
    • LRN is dropped, because it consumes a lot of computation for little benefit;
    • More feature maps are used, so more features can be extracted and more feature combinations become possible.

Implementing VGG with PaddlePaddle

  1. Network structure

 #coding:utf-8
 '''
 Created by huxiaoman 2017.12.12
 vggnet.py: CIFAR-10 classification with a VGG-style network
 '''
 import paddle.v2 as paddle


 def vgg(input):
     def conv_block(ipt, num_filter, groups, dropouts, num_channels=None):
         # A group of 3x3 conv + batch norm layers followed by 2x2 max pooling
         return paddle.networks.img_conv_group(
             input=ipt,
             num_channels=num_channels,
             pool_size=2,
             pool_stride=2,
             conv_num_filter=[num_filter] * groups,
             conv_filter_size=3,
             conv_act=paddle.activation.Relu(),
             conv_with_batchnorm=True,
             conv_batchnorm_drop_rate=dropouts,
             pool_type=paddle.pooling.Max())

     # Five conv blocks (64, 128, 256, 512, 512 filters), as in VGG-16
     conv1 = conv_block(input, 64, 2, [0.3, 0], 3)
     conv2 = conv_block(conv1, 128, 2, [0.4, 0])
     conv3 = conv_block(conv2, 256, 3, [0.4, 0.4, 0])
     conv4 = conv_block(conv3, 512, 3, [0.4, 0.4, 0])
     conv5 = conv_block(conv4, 512, 3, [0.4, 0.4, 0])

     # Classifier head: dropout + two 512-d fully connected layers
     drop = paddle.layer.dropout(input=conv5, dropout_rate=0.5)
     fc1 = paddle.layer.fc(input=drop, size=512, act=paddle.activation.Linear())
     bn = paddle.layer.batch_norm(
         input=fc1,
         act=paddle.activation.Relu(),
         layer_attr=paddle.attr.Extra(drop_rate=0.5))
     fc2 = paddle.layer.fc(input=bn, size=512, act=paddle.activation.Linear())
     return fc2

  2. Training the model

 #coding:utf-8
 '''
 Created by huxiaoman 2017.12.12
 train_vgg.py: train VGG-16 on the CIFAR-10 dataset
 '''
 import sys, os
 import paddle.v2 as paddle
 from vggnet import vgg

 with_gpu = os.getenv('WITH_GPU', '') != ''


 def main():
     datadim = 3 * 32 * 32
     classdim = 10

     # PaddlePaddle init
     paddle.init(use_gpu=with_gpu, trainer_count=8)

     image = paddle.layer.data(
         name="image", type=paddle.data_type.dense_vector(datadim))
     net = vgg(image)
     out = paddle.layer.fc(
         input=net, size=classdim, act=paddle.activation.Softmax())

     lbl = paddle.layer.data(
         name="label", type=paddle.data_type.integer_value(classdim))
     cost = paddle.layer.classification_cost(input=out, label=lbl)

     # Create parameters
     parameters = paddle.parameters.create(cost)

     # Create optimizer
     momentum_optimizer = paddle.optimizer.Momentum(
         momentum=0.9,
         regularization=paddle.optimizer.L2Regularization(rate=0.0002 * 128),
         learning_rate=0.1 / 128.0,
         learning_rate_decay_a=0.1,
         learning_rate_decay_b=50000 * 100,
         learning_rate_schedule='discexp')

     # End batch and end pass event handler
     def event_handler(event):
         if isinstance(event, paddle.event.EndIteration):
             if event.batch_id % 100 == 0:
                 print "\nPass %d, Batch %d, Cost %f, %s" % (
                     event.pass_id, event.batch_id, event.cost, event.metrics)
             else:
                 sys.stdout.write('.')
                 sys.stdout.flush()
         if isinstance(event, paddle.event.EndPass):
             # save parameters
             with open('params_pass_%d.tar' % event.pass_id, 'w') as f:
                 parameters.to_tar(f)

             result = trainer.test(
                 reader=paddle.batch(
                     paddle.dataset.cifar.test10(), batch_size=128),
                 feeding={'image': 0,
                          'label': 1})
             print "\nTest with Pass %d, %s" % (event.pass_id, result.metrics)

     # Create trainer
     trainer = paddle.trainer.SGD(
         cost=cost, parameters=parameters, update_equation=momentum_optimizer)

     # Save the inference topology to protobuf.
     inference_topology = paddle.topology.Topology(layers=out)
     with open("inference_topology.pkl", 'wb') as f:
         inference_topology.serialize_for_inference(f)

     trainer.train(
         reader=paddle.batch(
             paddle.reader.shuffle(
                 paddle.dataset.cifar.train10(), buf_size=50000),
             batch_size=128),
         num_passes=200,
         event_handler=event_handler,
         feeding={'image': 0,
                  'label': 1})

     # inference
     from PIL import Image
     import numpy as np

     def load_image(file):
         im = Image.open(file)
         im = im.resize((32, 32), Image.ANTIALIAS)
         im = np.array(im).astype(np.float32)
         im = im.transpose((2, 0, 1))  # CHW
         im = im[(2, 1, 0), :, :]  # BGR
         im = im.flatten()
         im = im / 255.0
         return im

     test_data = []
     cur_dir = os.path.dirname(os.path.realpath(__file__))
     test_data.append((load_image(cur_dir + '/image/dog.png'), ))

     probs = paddle.infer(
         output_layer=out, parameters=parameters, input=test_data)
     lab = np.argsort(-probs)  # probs and lab are the results of one batch data
     print "Label of image/dog.png is: %d" % lab[0][0]


 if __name__ == '__main__':
     main()

  3. Training results

 nohup: ignoring input
I1127 09:36:58.313799 13026 Util.cpp:166] commandline: --use_gpu=True --trainer_count=7
[INFO 2017-11-27 09:37:04,477 layers.py:2539] output for __conv_0__: c = 64, h = 32, w = 32, size = 65536
[INFO 2017-11-27 09:37:04,478 layers.py:3062] output for __batch_norm_0__: c = 64, h = 32, w = 32, size = 65536
[INFO 2017-11-27 09:37:04,479 layers.py:2539] output for __conv_1__: c = 64, h = 32, w = 32, size = 65536
[INFO 2017-11-27 09:37:04,480 layers.py:3062] output for __batch_norm_1__: c = 64, h = 32, w = 32, size = 65536
[INFO 2017-11-27 09:37:04,480 layers.py:2667] output for __pool_0__: c = 64, h = 16, w = 16, size = 16384
[INFO 2017-11-27 09:37:04,481 layers.py:2539] output for __conv_2__: c = 128, h = 16, w = 16, size = 32768
[INFO 2017-11-27 09:37:04,482 layers.py:3062] output for __batch_norm_2__: c = 128, h = 16, w = 16, size = 32768
[INFO 2017-11-27 09:37:04,483 layers.py:2539] output for __conv_3__: c = 128, h = 16, w = 16, size = 32768
[INFO 2017-11-27 09:37:04,484 layers.py:3062] output for __batch_norm_3__: c = 128, h = 16, w = 16, size = 32768
[INFO 2017-11-27 09:37:04,485 layers.py:2667] output for __pool_1__: c = 128, h = 8, w = 8, size = 8192
[INFO 2017-11-27 09:37:04,485 layers.py:2539] output for __conv_4__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,486 layers.py:3062] output for __batch_norm_4__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,487 layers.py:2539] output for __conv_5__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,488 layers.py:3062] output for __batch_norm_5__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,489 layers.py:2539] output for __conv_6__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,490 layers.py:3062] output for __batch_norm_6__: c = 256, h = 8, w = 8, size = 16384
[INFO 2017-11-27 09:37:04,490 layers.py:2667] output for __pool_2__: c = 256, h = 4, w = 4, size = 4096
[INFO 2017-11-27 09:37:04,491 layers.py:2539] output for __conv_7__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,492 layers.py:3062] output for __batch_norm_7__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,493 layers.py:2539] output for __conv_8__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,494 layers.py:3062] output for __batch_norm_8__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,495 layers.py:2539] output for __conv_9__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,495 layers.py:3062] output for __batch_norm_9__: c = 512, h = 4, w = 4, size = 8192
[INFO 2017-11-27 09:37:04,496 layers.py:2667] output for __pool_3__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,497 layers.py:2539] output for __conv_10__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,498 layers.py:3062] output for __batch_norm_10__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,499 layers.py:2539] output for __conv_11__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,499 layers.py:3062] output for __batch_norm_11__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,502 layers.py:2539] output for __conv_12__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,502 layers.py:3062] output for __batch_norm_12__: c = 512, h = 2, w = 2, size = 2048
[INFO 2017-11-27 09:37:04,503 layers.py:2667] output for __pool_4__: c = 512, h = 1, w = 1, size = 512
I1127 09:37:04.563228 13026 MultiGradientMachine.cpp:99] numLogicalDevices=1 numThreads=7 numDevices=8
I1127 09:37:04.822993 13026 GradientMachine.cpp:85] Initing parameters..
I1127 09:37:05.728123 13026 GradientMachine.cpp:92] Init parameters done.
Pass 0, Batch 0, Cost 2.407296, {'classification_error_evaluator': 0.8828125}
...................................................................................................
Pass 0, Batch 100, Cost 1.994910, {'classification_error_evaluator': 0.84375}
...................................................................................................
Pass 0, Batch 200, Cost 2.199248, {'classification_error_evaluator': 0.8671875}
...................................................................................................
Pass 0, Batch 300, Cost 1.982006, {'classification_error_evaluator': 0.8125}
..........................................................................................
Test with Pass 0, {'classification_error_evaluator': 0.8999999761581421}
......
Pass 199, Batch 0, Cost 0.012132, {'classification_error_evaluator': 0.0}
...................................................................................................
Pass 199, Batch 100, Cost 0.021121, {'classification_error_evaluator': 0.0078125}
...................................................................................................
Pass 199, Batch 200, Cost 0.068369, {'classification_error_evaluator': 0.0078125}
...................................................................................................
Pass 199, Batch 300, Cost 0.015805, {'classification_error_evaluator': 0.0}
..........................................................................................
I1128 01:57:44.727157 13026 MultiGradientMachine.cpp:99] numLogicalDevices=1 numThreads=7 numDevices=8
Test with Pass 199, {'classification_error_evaluator': 0.10890000313520432}
Label of image/dog.png is: 5

  Looking at the training results: with 7 worker threads on 8 Tesla K80 GPUs, 200 passes took 16h21min. Compared with the few hours needed to train LeNet and AlexNet earlier in this series that is a lot of time, but the result is much better: the accuracy reaches 89.11%, higher than both LeNet and AlexNet on the same hardware with the same number of passes.


Implementing VGG with TensorFlow

  1. Network structure

 def inference_op(input_op, keep_prob):
     p = []
     # Block 1: conv1_1 - conv1_2 - pool1
     conv1_1 = conv_op(input_op, name='conv1_1', kh=3, kw=3,
                       n_out=64, dh=1, dw=1, p=p)
     conv1_2 = conv_op(conv1_1, name='conv1_2', kh=3, kw=3,
                       n_out=64, dh=1, dw=1, p=p)
     pool1 = mpool_op(conv1_2, name='pool1', kh=2, kw=2, dw=2, dh=2)

     # Block 2: conv2_1 - conv2_2 - pool2
     conv2_1 = conv_op(pool1, name='conv2_1', kh=3, kw=3,
                       n_out=128, dh=1, dw=1, p=p)
     conv2_2 = conv_op(conv2_1, name='conv2_2', kh=3, kw=3,
                       n_out=128, dh=1, dw=1, p=p)
     pool2 = mpool_op(conv2_2, name='pool2', kh=2, kw=2, dw=2, dh=2)

     # Block 3: conv3_1 - conv3_2 - conv3_3 - pool3
     conv3_1 = conv_op(pool2, name='conv3_1', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     conv3_2 = conv_op(conv3_1, name='conv3_2', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     conv3_3 = conv_op(conv3_2, name='conv3_3', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     pool3 = mpool_op(conv3_3, name='pool3', kh=2, kw=2, dw=2, dh=2)

     # Block 4: conv4_1 - conv4_2 - conv4_3 - pool4
     conv4_1 = conv_op(pool3, name='conv4_1', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv4_2 = conv_op(conv4_1, name='conv4_2', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv4_3 = conv_op(conv4_2, name='conv4_3', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     pool4 = mpool_op(conv4_3, name='pool4', kh=2, kw=2, dw=2, dh=2)

     # Block 5: conv5_1 - conv5_2 - conv5_3 - pool5
     conv5_1 = conv_op(pool4, name='conv5_1', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv5_2 = conv_op(conv5_1, name='conv5_2', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv5_3 = conv_op(conv5_2, name='conv5_3', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     pool5 = mpool_op(conv5_3, name='pool5', kh=2, kw=2, dw=2, dh=2)

     # Flatten pool5 (e.g. [7, 7, 512] for a 224x224 input) into a vector
     shp = pool5.get_shape()
     flattened_shape = shp[1].value * shp[2].value * shp[3].value
     resh1 = tf.reshape(pool5, [-1, flattened_shape], name='resh1')

     # Fully connected layer 1, with dropout against overfitting
     fc1 = fc_op(resh1, name='fc1', n_out=2048, p=p)
     fc1_drop = tf.nn.dropout(fc1, keep_prob, name='fc1_drop')

     # Fully connected layer 2, with dropout against overfitting
     fc2 = fc_op(fc1_drop, name='fc2', n_out=2048, p=p)
     fc2_drop = tf.nn.dropout(fc2, keep_prob, name='fc2_drop')

     # Fully connected layer 3, followed by softmax for class probabilities
     fc3 = fc_op(fc2_drop, name='fc3', n_out=10, p=p)
     softmax = tf.nn.softmax(fc3)
     predictions = tf.argmax(softmax, 1)
     return predictions, softmax, fc3, p

  2. Training script

 # -*- coding: utf-8 -*-
 """
 Created by huxiaoman 2017.12.12
 vgg_tf.py: train a TensorFlow version of VGG-16 and classify the CIFAR-10 dataset
 """
 from datetime import datetime
 import math
 import time
 import tensorflow as tf
 import cifar10

 batch_size = 128
 num_batches = 200


 # Create and initialize a convolutional layer
 # input_op : input tensor
 # name     : name of this conv layer, used with tf.name_scope()
 # kh, kw   : kernel height and width
 # n_out    : number of output channels
 # dh, dw   : stride height and width
 # p        : parameter list collecting all VGG parameters
 # The kernel weights are initialized with the Xavier method
 def conv_op(input_op, name, kh, kw, n_out, dh, dw, p):
     n_in = input_op.get_shape()[-1].value  # number of input channels
     with tf.name_scope(name) as scope:
         kernel = tf.get_variable(scope + 'w',
                                  shape=[kh, kw, n_in, n_out], dtype=tf.float32,
                                  initializer=tf.contrib.layers.xavier_initializer_conv2d())
         # Convolution
         conv = tf.nn.conv2d(input_op, kernel, (1, dh, dw, 1), padding='SAME')
         bias_init_val = tf.constant(0.0, shape=[n_out], dtype=tf.float32)
         biases = tf.Variable(bias_init_val, trainable=True, name='b')
         z = tf.nn.bias_add(conv, biases)
         activation = tf.nn.relu(z, name=scope)
         p += [kernel, biases]
         return activation


 # Create and initialize a fully connected layer
 # input_op : input tensor
 # name     : name of this fully connected layer
 # n_out    : number of output units
 # p        : parameter list
 # The weights are initialized with the Xavier method
 def fc_op(input_op, name, n_out, p):
     n_in = input_op.get_shape()[-1].value
     with tf.name_scope(name) as scope:
         kernel = tf.get_variable(scope + 'w',
                                  shape=[n_in, n_out], dtype=tf.float32,
                                  initializer=tf.contrib.layers.xavier_initializer())
         biases = tf.Variable(tf.constant(0.1, shape=[n_out],
                                          dtype=tf.float32), name='b')
         activation = tf.nn.relu_layer(input_op, kernel, biases, name=scope)
         p += [kernel, biases]
         return activation


 # Create a max pooling layer
 # input_op : input tensor
 # name     : name of this pooling layer
 # kh, kw   : pooling window height and width
 # dh, dw   : stride height and width
 def mpool_op(input_op, name, kh, kw, dh, dw):
     return tf.nn.max_pool(input_op, ksize=[1, kh, kw, 1],
                           strides=[1, dh, dw, 1], padding='SAME', name=name)


 # --------------- Build VGG-16 ------------------
 def inference_op(input_op, keep_prob):
     p = []
     # Block 1: conv1_1 - conv1_2 - pool1
     conv1_1 = conv_op(input_op, name='conv1_1', kh=3, kw=3,
                       n_out=64, dh=1, dw=1, p=p)
     conv1_2 = conv_op(conv1_1, name='conv1_2', kh=3, kw=3,
                       n_out=64, dh=1, dw=1, p=p)
     pool1 = mpool_op(conv1_2, name='pool1', kh=2, kw=2, dw=2, dh=2)

     # Block 2: conv2_1 - conv2_2 - pool2
     conv2_1 = conv_op(pool1, name='conv2_1', kh=3, kw=3,
                       n_out=128, dh=1, dw=1, p=p)
     conv2_2 = conv_op(conv2_1, name='conv2_2', kh=3, kw=3,
                       n_out=128, dh=1, dw=1, p=p)
     pool2 = mpool_op(conv2_2, name='pool2', kh=2, kw=2, dw=2, dh=2)

     # Block 3: conv3_1 - conv3_2 - conv3_3 - pool3
     conv3_1 = conv_op(pool2, name='conv3_1', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     conv3_2 = conv_op(conv3_1, name='conv3_2', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     conv3_3 = conv_op(conv3_2, name='conv3_3', kh=3, kw=3,
                       n_out=256, dh=1, dw=1, p=p)
     pool3 = mpool_op(conv3_3, name='pool3', kh=2, kw=2, dw=2, dh=2)

     # Block 4: conv4_1 - conv4_2 - conv4_3 - pool4
     conv4_1 = conv_op(pool3, name='conv4_1', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv4_2 = conv_op(conv4_1, name='conv4_2', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv4_3 = conv_op(conv4_2, name='conv4_3', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     pool4 = mpool_op(conv4_3, name='pool4', kh=2, kw=2, dw=2, dh=2)

     # Block 5: conv5_1 - conv5_2 - conv5_3 - pool5
     conv5_1 = conv_op(pool4, name='conv5_1', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv5_2 = conv_op(conv5_1, name='conv5_2', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     conv5_3 = conv_op(conv5_2, name='conv5_3', kh=3, kw=3,
                       n_out=512, dh=1, dw=1, p=p)
     pool5 = mpool_op(conv5_3, name='pool5', kh=2, kw=2, dw=2, dh=2)

     # Flatten pool5 (e.g. [7, 7, 512] for a 224x224 input) into a vector
     shp = pool5.get_shape()
     flattened_shape = shp[1].value * shp[2].value * shp[3].value
     resh1 = tf.reshape(pool5, [-1, flattened_shape], name='resh1')

     # Fully connected layer 1, with dropout against overfitting
     fc1 = fc_op(resh1, name='fc1', n_out=2048, p=p)
     fc1_drop = tf.nn.dropout(fc1, keep_prob, name='fc1_drop')

     # Fully connected layer 2, with dropout against overfitting
     fc2 = fc_op(fc1_drop, name='fc2', n_out=2048, p=p)
     fc2_drop = tf.nn.dropout(fc2, keep_prob, name='fc2_drop')

     # Fully connected layer 3, followed by softmax for class probabilities
     fc3 = fc_op(fc2_drop, name='fc3', n_out=10, p=p)
     softmax = tf.nn.softmax(fc3)
     predictions = tf.argmax(softmax, 1)
     return predictions, softmax, fc3, p


 # Benchmark helper: time num_batches runs of `target`
 def time_tensorflow_run(session, target, feed, info_string):
     num_steps_burn_in = 10
     total_duration = 0.0
     total_duration_squared = 0.0

     for i in range(num_batches + num_steps_burn_in):
         start_time = time.time()
         _ = session.run(target, feed_dict=feed)
         duration = time.time() - start_time
         if i >= num_steps_burn_in:
             if not i % 10:
                 print('%s: step %d, duration = %.3f' %
                       (datetime.now(), i - num_steps_burn_in, duration))
             total_duration += duration
             total_duration_squared += duration * duration

     mean_dur = total_duration / num_batches
     var_dur = total_duration_squared / num_batches - mean_dur * mean_dur
     std_dur = math.sqrt(var_dur)
     print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
           (datetime.now(), info_string, num_batches, mean_dur, std_dur))


 def train_vgg16():
     with tf.Graph().as_default():
         image_size = 224  # input image size
         # To smoke-test the graph, random inputs can be used instead:
         # images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3],
         #                                       dtype=tf.float32, stddev=1e-1))
         with tf.device('/cpu:0'):
             images, labels = cifar10.distorted_inputs()
         keep_prob = tf.placeholder(tf.float32)
         prediction, softmax, fc8, p = inference_op(images, keep_prob)
         init = tf.global_variables_initializer()
         sess = tf.Session()
         sess.run(init)
         time_tensorflow_run(sess, prediction, {keep_prob: 1.0}, "Forward")
         # Simulate the training process
         objective = tf.nn.l2_loss(fc8)     # use an L2 loss as a stand-in objective
         grad = tf.gradients(objective, p)  # gradients of the loss w.r.t. all parameters
         time_tensorflow_run(sess, grad, {keep_prob: 0.5}, "Forward-backward")


 if __name__ == '__main__':
     train_vgg16()

  Of course, we can also use tf.slim to simplify the network definition:

 import tensorflow as tf
 slim = tf.contrib.slim


 def vgg16(inputs):
     with slim.arg_scope([slim.conv2d, slim.fully_connected],
                         activation_fn=tf.nn.relu,
                         weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                         weights_regularizer=slim.l2_regularizer(0.0005)):
         net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
         net = slim.max_pool2d(net, [2, 2], scope='pool1')
         net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
         net = slim.max_pool2d(net, [2, 2], scope='pool2')
         net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
         net = slim.max_pool2d(net, [2, 2], scope='pool3')
         net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
         net = slim.max_pool2d(net, [2, 2], scope='pool4')
         net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
         net = slim.max_pool2d(net, [2, 2], scope='pool5')
         net = slim.flatten(net, scope='flatten5')  # flatten before the fc layers
         net = slim.fully_connected(net, 4096, scope='fc6')
         net = slim.dropout(net, 0.5, scope='dropout6')
         net = slim.fully_connected(net, 4096, scope='fc7')
         net = slim.dropout(net, 0.5, scope='dropout7')
         net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
     return net

  Comparing the results: with the same hardware and environment and 200 passes, the TensorFlow version reaches 89.18% accuracy in 18h12min. Against the PaddlePaddle run the accuracy is about the same while training is a bit slower. Preprocessing the data first, converting it to TFRecord files and feeding it through a multi-threaded input pipeline, should make training considerably faster.
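  As a rough sketch of that idea (my own example, not from the original code; it assumes the CIFAR-10 images have already been written to TFRecord files with an 'image' raw-bytes field and an int64 'label' field), a TF 1.x multi-threaded input pipeline could look like this:

 import tensorflow as tf

 def cifar_tfrecord_batch(filenames, batch_size=128):
     # A queue of input files; reader threads pull file names from it.
     filename_queue = tf.train.string_input_producer(filenames)
     reader = tf.TFRecordReader()
     _, serialized = reader.read(filename_queue)
     features = tf.parse_single_example(
         serialized,
         features={'image': tf.FixedLenFeature([], tf.string),
                   'label': tf.FixedLenFeature([], tf.int64)})
     image = tf.decode_raw(features['image'], tf.uint8)
     image = tf.reshape(image, [32, 32, 3])
     image = tf.cast(image, tf.float32) / 255.0
     label = tf.cast(features['label'], tf.int32)
     # shuffle_batch starts several threads that keep an example queue filled,
     # so preprocessing overlaps with training.
     images, labels = tf.train.shuffle_batch(
         [image, label], batch_size=batch_size, num_threads=4,
         capacity=10000, min_after_dequeue=5000)
     return images, labels

  With the queue runners started via tf.train.start_queue_runners, batches produced this way could stand in for cifar10.distorted_inputs() in train_vgg16().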


Summary

  Based on the analysis in the paper and the experimental results, a few takeaways:

  1. The LRN layer consumes a lot of computation for little gain, so it can be dropped.

  2. Large kernels can learn larger spatial patterns but need far more parameters; small kernels learn more limited spatial patterns but need far fewer parameters, and stacking several layers of them can work even better.

  3. The deeper the network, the better the results, but vanishing gradients must be avoided; ReLU activations, batch normalization, and similar techniques help to some extent.

  4. For the same number of iterations, small kernels plus a deep network beat large kernels plus a shallow network, which is worth keeping in mind when designing our own networks. The former may take longer to train, but it can converge faster and reach higher accuracy.

PS: To make it easier to catch my updates, I have set up a WeChat public account. From now on articles will be published to both the account and the blog, so you will be notified in time, and you can also leave questions on the account and I will see and answer them promptly.

You can scan the QR code below or simply search for the account CharlotteDataMining. Thanks for following ^_^

This article is also published at: https://mp.weixin.qq.com/s?__biz=MzI0OTQwMTA5Ng==&mid=2247483677&idx=1&sn=9402a0532bc6330f83e58c7e18f51b93&chksm=e9935b7adee4d26cd69de6c89b25be994735094ef420befd1d275f97821819ba9528f13e079a#rd

References:

1. https://arxiv.org/pdf/1409.1556.pdf
