Building a Simple Convolutional Neural Network (CNN)

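A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond only to the units within a local receptive field, which makes it very effective for processing large images. It is otherwise very similar to an ordinary neural network: both are built from neurons with learnable weights and biases. Each neuron receives some inputs and computes a dot product, the network outputs a score for each class, and most of the computational techniques used in ordinary neural networks still apply.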

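A convolutional neural network usually contains the following kinds of layers:

Convolutional layer: each convolutional layer consists of a number of convolution units (filters), whose parameters are optimized by backpropagation. The purpose of the convolution operation is to extract different features from the input. The first convolutional layer may only extract low-level features such as edges, lines and corners; deeper layers iteratively build more complex features from these low-level ones.
Rectified linear unit layer (ReLU layer): the activation function of this layer is the rectified linear unit (ReLU), f(x) = max(0, x).
Pooling layer: the features produced by a convolutional layer usually have very high dimensionality. Pooling splits the feature map into regions and takes the maximum or the average of each region, giving a new, lower-dimensional feature map.
Dropout: when training ConvNets, we usually drop a random subset of neuron activations at each training step (dropout is switched off at inference time), which helps to prevent overfitting to some extent.
Fully connected layer: combines all the local features into global features, which are used to compute the final score for each class.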
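
To make the ReLU and pooling descriptions above concrete, here is a minimal NumPy sketch. The function names `relu` and `max_pool_2x2` and the sample feature map are illustrative assumptions, not part of the original post, and the pooling here only handles a single 2-D map with even height and width.

```python
import numpy as np

def relu(x):
    """Element-wise f(x) = max(0, x)."""
    return np.maximum(0, x)

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on a 2-D array with even height and width."""
    h, w = feature_map.shape
    # Split into (h/2, 2, w/2, 2) blocks, then take the max inside each 2x2 block
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fmap = np.array([[ 1., -2.,  3.,  0.],
                 [-1.,  5., -3.,  2.],
                 [ 0.,  1., -4., -1.],
                 [ 2., -6.,  7.,  1.]])
activated = relu(fmap)            # negative values become 0
pooled = max_pool_2x2(activated)  # 4x4 map reduced to 2x2
print(pooled)                     # [[5. 3.] [2. 7.]]
```

The 4x4 map shrinks to 2x2, which is the dimensionality reduction the pooling layer is meant to provide.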

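Now for the code. Today I will use a ConvNet for a very simple image classification task: classifying the images in the CIFAR-10 dataset, which contains airplanes, cats, dogs and other objects.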

First, we get the dataset. The code here downloads it directly (you can also download it yourself from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz).

 from urllib.request import urlretrieve
 from os.path import isfile, isdir
 from tqdm import tqdm
 import tarfile

 cifar10_dataset_folder_path = 'cifar-10-batches-py'
 # Local file name for the downloaded archive (not defined in the original listing)
 tar_gz_path = 'cifar-10-python.tar.gz'

 class DLProgress(tqdm):
     """tqdm progress bar that can be driven by urlretrieve's reporthook."""
     last_block = 0

     def hook(self, block_num=1, block_size=1, total_size=None):
         self.total = total_size
         self.update((block_num - self.last_block) * block_size)
         self.last_block = block_num

 # Download the archive only if it is not already present
 if not isfile(tar_gz_path):
     with DLProgress(unit='B', unit_scale=True, miniters=1, desc='CIFAR-10 Dataset') as pbar:
         urlretrieve(
             'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz',
             tar_gz_path,
             pbar.hook)

 # Extract it only if the data folder does not exist yet
 if not isdir(cifar10_dataset_folder_path):
     with tarfile.open(tar_gz_path) as tar:
         tar.extractall()  # the with block closes the archive on exit
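
The post does not show how the extracted files are read. As a hedged sketch: the Python version of CIFAR-10 stores each training batch as a pickled dictionary whose `data` entry has one row of 3072 values per image (1024 red, then 1024 green, then 1024 blue). The function names `rows_to_images` and `load_cifar10_batch` below are assumptions made for illustration, not the author's helper code.

```python
import pickle
import numpy as np

def rows_to_images(data):
    """Convert CIFAR-10 rows of 3072 values into images of shape (N, 32, 32, 3)."""
    data = np.asarray(data)
    # Rows are channel-major: reshape to (N, 3, 32, 32), then move channels last
    return data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

def load_cifar10_batch(folder_path, batch_id):
    """Load one training batch file (data_batch_1 ... data_batch_5)."""
    with open('{}/data_batch_{}'.format(folder_path, batch_id), mode='rb') as f:
        # The files were pickled under Python 2, hence the explicit encoding
        batch = pickle.load(f, encoding='latin1')
    return rows_to_images(batch['data']), batch['labels']
```

A call such as `load_cifar10_batch('cifar-10-batches-py', 1)` would then return the images in the (32, 32, 3) layout that the rest of the code expects.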

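Once the data is loaded, we need to preprocess the images. The pixel values currently lie between 0 and 255, and we rescale them to the range 0 to 1, which makes the later computations easier.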

 def normalize(x):
     """
     Normalize a list of sample image data in the range of 0 to 1
     : x: List of image data.  The image shape is (32, 32, 3)
     : return: Numpy array of normalize data
     """
     a = 0
     b = 1
     grayscale_min = 0
     grayscale_max = 255
     return a + (((x - grayscale_min) * (b - a))/(grayscale_max - grayscale_min))
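
With a = 0, b = 1, a minimum of 0 and a maximum of 255, the min-max formula above reduces to x / 255. The sketch below shows the same scaling on three sample pixel values; `min_max_scale` is a hypothetical name used only for this illustration, and it converts the input to float32 first, which the original function does not do.

```python
import numpy as np

def min_max_scale(x, a=0.0, b=1.0, x_min=0.0, x_max=255.0):
    """Linearly map values from [x_min, x_max] to [a, b]."""
    x = np.asarray(x, dtype=np.float32)
    return a + (x - x_min) * (b - a) / (x_max - x_min)

pixels = np.array([0, 51, 255], dtype=np.uint8)
scaled = min_max_scale(pixels)
print(scaled)  # approximately [0.  0.2 1. ]
```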

Because CIFAR-10 contains 10 different classes of images, we use one-hot encoding to represent the image labels.

 import numpy as np

 def one_hot_encode(x):
     """
     One hot encode a list of sample labels. Return a one-hot encoded vector for each label.
     : x: List of sample Labels
     : return: Numpy array of one-hot encoded labels
     """
     # Lookup table: label -> one-hot vector of length 10
     d = {0:[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
          1:[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
          2:[0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
          3:[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
          4:[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
          5:[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
          6:[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
          7:[0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
          8:[0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
          9:[0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}
     map_list = []
     for item in x:
         map_list.append(d[item])
     target = np.array(map_list)
     return target
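
The same encoding can be written more compactly by indexing into an identity matrix. This is an alternative sketch, not the author's code; `one_hot_encode_eye` and its `n_classes` parameter are names introduced here for illustration.

```python
import numpy as np

def one_hot_encode_eye(labels, n_classes=10):
    """Row i of the identity matrix is the one-hot vector for label i."""
    return np.eye(n_classes, dtype=np.int64)[np.asarray(labels)]

encoded = one_hot_encode_eye([0, 3, 9])
print(encoded.shape)  # (3, 10)
```

Each row contains a single 1 at the position of its label, exactly like the lookup table above.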

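Now we start building our ConvNet. First we need placeholders to hold the training images, the one-hot encoded training labels, and the keep probability used for dropout. (The code uses the TensorFlow 1.x graph API: tf.placeholder, tf.Session and so on.)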

 import tensorflow as tf

 def neural_net_image_input(image_shape):
     """
     Return a Tensor for a batch of image input
     : image_shape: Shape of the images
     : return: Tensor for image input.
     """
     x = tf.placeholder(tf.float32,[None, image_shape[0], image_shape[1],image_shape[2]],'x')
     return x

 def neural_net_label_input(n_classes):
     """
     Return a Tensor for a batch of label input
     : n_classes: Number of classes
     : return: Tensor for label input.
     """
     y = tf.placeholder(tf.float32,[None, n_classes],'y')
     return y

 def neural_net_keep_prob_input():
     """
     Return a Tensor for keep probability
     : return: Tensor for keep probability.
     """
     keep_prob = tf.placeholder(tf.float32,None,'keep_prob')
     return keep_prob

Next we build the core of a ConvNet: a convolutional layer followed by a pooling layer (here we use max pooling).

 def conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides):
     """
     Apply convolution then max pooling to x_tensor
     :param x_tensor: TensorFlow Tensor
     :param conv_num_outputs: Number of outputs for the convolutional layer
     :param conv_ksize: kernel size 2-D Tuple for the convolutional layer
     :param conv_strides: Stride 2-D Tuple for convolution
     :param pool_ksize: kernel size 2-D Tuple for pool
     :param pool_strides: Stride 2-D Tuple for pool
     : return: A tensor that represents convolution and max pooling of x_tensor
     """
     ## Weights and Bias
     # Filter shape: [height, width, input channels, output channels]
     weight = tf.Variable(tf.truncated_normal([conv_ksize[0],conv_ksize[1],
                                               x_tensor.get_shape().as_list()[-1],conv_num_outputs],stddev=0.1))
     bias = tf.Variable(tf.zeros(conv_num_outputs))
     ## Apply Convolution
     conv_layer = tf.nn.conv2d(x_tensor,weight,strides = [1,conv_strides[0],conv_strides[1],1], padding='SAME')
     ## Add Bias
     conv_layer = tf.nn.bias_add(conv_layer,bias)
     ## Apply Relu
     conv_layer = tf.nn.relu(conv_layer)
     ## Apply Max Pooling
     return tf.nn.max_pool(conv_layer,
                           ksize=[1,pool_ksize[0],pool_ksize[1],1],
                           strides=[1,pool_strides[0],pool_strides[1],1],
                           padding='SAME')
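
With padding='SAME', TensorFlow's output size along each spatial dimension is ceil(input size / stride), independent of the kernel size. The sketch below applies that rule to a 32x32 CIFAR-10 image passing through three conv + max-pool blocks that all use stride 2, which is the configuration used in conv_net in this post. The helper names are assumptions introduced for this illustration.

```python
import math

def same_output_size(size, stride):
    """Spatial output size of a 'SAME'-padded convolution or pooling op."""
    return math.ceil(size / stride)

def conv_pool_output_size(size, conv_stride, pool_stride):
    """Size after one convolution followed by one pooling op, both 'SAME'."""
    return same_output_size(same_output_size(size, conv_stride), pool_stride)

size = 32  # CIFAR-10 images are 32x32
sizes = []
for _ in range(3):  # three conv + max-pool blocks, all with stride 2
    size = conv_pool_output_size(size, conv_stride=2, pool_stride=2)
    sizes.append(size)
print(sizes)  # [8, 2, 1]
```

So the feature map is already 1x1 after the third block, and with 256 output channels the flattened vector has 256 entries.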

Implement a flatten layer that changes x_tensor from a 4-D tensor to a 2-D tensor. The output should have the shape (Batch Size, Flattened Image Size).

 def flatten(x_tensor):
     """
     Flatten x_tensor to (Batch Size, Flattened Image Size)
     : x_tensor: A tensor of size (Batch Size, ...), where ... are the image dimensions.
     : return: A tensor of size (Batch Size, Flattened Image Size).
     """
     # Get the shape of tensor
     shape = x_tensor.get_shape().as_list()
     # Compute the dim for image
     dim = np.prod(shape[1:])
     # reshape the tensor
     return tf.reshape(x_tensor, [-1,dim])
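
The same reshaping can be seen on a plain NumPy array, without a TensorFlow graph. This is only an analogue of the flatten function above; the array shape is made up for the example.

```python
import numpy as np

batch = np.arange(4 * 2 * 2 * 3).reshape(4, 2, 2, 3)  # (batch, height, width, channels)
flat = batch.reshape(batch.shape[0], -1)               # keep axis 0, merge the rest
print(flat.shape)  # (4, 12)
```

Each row of `flat` holds all 2 * 2 * 3 = 12 values of one sample, in the original memory order.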

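As the last steps of the network we need a fully connected layer and an output layer, which outputs a 1×10 result: one score (logit) per class, which softmax turns into the probabilities of the 10 classes.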

 def fully_conn(x_tensor, num_outputs):
     """
     Apply a fully connected layer to x_tensor using weight and bias
     : x_tensor: A 2-D tensor where the first dimension is batch size.
     : num_outputs: The number of output that the new tensor should be.
     : return: A 2-D tensor where the second dimension is num_outputs.
     """
     weight = tf.Variable(tf.truncated_normal([x_tensor.get_shape().as_list()[-1], num_outputs],stddev=0.1))
     bias = tf.Variable(tf.zeros([num_outputs]))
     fc = tf.reshape(x_tensor,[-1, weight.get_shape().as_list()[0]])
     fc = tf.add(tf.matmul(fc,weight), bias)
     fc = tf.nn.relu(fc)
     return fc

 def output(x_tensor, num_outputs):
     """
     Apply a output layer to x_tensor using weight and bias
     : x_tensor: A 2-D tensor where the first dimension is batch size.
     : num_outputs: The number of output that the new tensor should be.
     : return: A 2-D tensor where the second dimension is num_outputs.
     """
     weight_out = tf.Variable(tf.truncated_normal([x_tensor.get_shape().as_list()[-1],num_outputs],stddev=0.1))
     bias_out = tf.Variable(tf.zeros([num_outputs]))
     out = tf.reshape(x_tensor, [-1, weight_out.get_shape().as_list()[0]])
     out = tf.add(tf.matmul(out,weight_out),bias_out)
     return out
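
Note that the output layer applies no activation: it returns raw scores (logits), and the conversion to class probabilities happens in the softmax step of the loss. A minimal NumPy sketch of that conversion, with a hypothetical `softmax` function and made-up logits, could look like this:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

probs = softmax([[1.0, 2.0, 3.0],
                 [0.0, 0.0, 0.0]])
print(probs.sum(axis=1))  # each row sums to 1
```

Softmax preserves the ordering of the scores, so the predicted class is the same whether it is read from the logits or from the probabilities.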

With all the basic building blocks done, we can now assemble our network.

 def conv_net(x, keep_prob):
     """
     Create a convolutional neural network model
     : x: Placeholder tensor that holds image data.
     : keep_prob: Placeholder tensor that hold dropout keep probability.
     : return: Tensor that represents logits
     """
     # Three conv + max-pool blocks:
     #   conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides)
     conv1 = conv2d_maxpool(x, 32,(5,5),(2,2),(4,4),(2,2))
     conv2 = conv2d_maxpool(conv1, 128, (5,5),(2,2),(2,2),(2,2))
     conv3 = conv2d_maxpool(conv2, 256, (5,5),(2,2),(2,2),(2,2))
     #   flatten(x_tensor)
     flatten_layer = flatten(conv3)
     #   fully_conn(x_tensor, num_outputs)
     fc = fully_conn(flatten_layer, 1024)
     # NOTE: keep_prob is accepted but never used in this listing, so no dropout
     # is actually applied. To apply it, one could add here:
     #   fc = tf.nn.dropout(fc, keep_prob)
     #   output(x_tensor, num_outputs), with num_outputs set to the number of classes
     output_layer = output(fc, 10)
     return output_layer

 ##############################
 ## Build the Neural Network ##
 ##############################

 # Remove previous weights, bias, inputs, etc..
 tf.reset_default_graph()

 # Inputs
 x = neural_net_image_input((32, 32, 3))
 y = neural_net_label_input(10)
 keep_prob = neural_net_keep_prob_input()

 # Model
 logits = conv_net(x, keep_prob)

 # Name logits Tensor, so that it can be loaded from disk after training
 logits = tf.identity(logits, name='logits')

 # Loss and Optimizer
 cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
 optimizer = tf.train.AdamOptimizer().minimize(cost)

 # Accuracy
 correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
 accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32), name='accuracy')
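
The cost and accuracy defined above can be reproduced on small arrays to see what they measure. The following NumPy sketch is an illustration under the assumption of one-hot labels; the function names and sample values are made up here and are not part of the original code.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Per-example cross-entropy between softmax(logits) and one-hot labels."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)
    # log-softmax computed directly, which is numerically stable
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(np.asarray(one_hot_labels) * log_probs).sum(axis=1)

def accuracy(logits, one_hot_labels):
    """Fraction of rows whose highest logit matches the labelled class."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0]])
mean_loss = softmax_cross_entropy(logits, labels).mean()  # analogue of `cost`
acc = accuracy(logits, labels)                            # 0.5: one of two correct
```

For a network that outputs identical logits for all 10 classes, the loss per example is ln(10), roughly 2.30, which is a useful reference value at the start of training.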

Once the network is built, we can start feeding in the data and training the model.

Here I set the hyperparameters more or less arbitrarily.

 epochs = 30
 batch_size = 256
 keep_probability = 0.5
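
A keep probability of 0.5 means that, during training, each activation is kept with probability 0.5 and the surviving ones are scaled up by 1 / keep_prob so that the expected value is unchanged; this is how tf.nn.dropout behaves. At evaluation time the keep probability is set to 1, which leaves the activations untouched. A NumPy sketch of this behaviour, with a hypothetical `dropout` function, might be:

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob, rescale the rest."""
    activations = np.asarray(activations, dtype=np.float64)
    mask = rng.random(activations.shape) < keep_prob
    return np.where(mask, activations / keep_prob, 0.0)

rng = np.random.default_rng(0)
hidden = np.ones((2, 8))
train_out = dropout(hidden, keep_prob=0.5, rng=rng)  # entries are 0.0 or 2.0
eval_out = dropout(hidden, keep_prob=1.0, rng=rng)   # unchanged
```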

We also want to monitor training as it runs, so we print the loss and the accuracy on the validation set as we go.

 def print_stats(session, feature_batch, label_batch, cost, accuracy):
     """
     Print information about loss and validation accuracy
     : session: Current TensorFlow session
     : feature_batch: Batch of Numpy image data
     : label_batch: Batch of Numpy label data
     : cost: TensorFlow cost function
     : accuracy: TensorFlow accuracy function
     """
     # Loss on the current training batch (dropout disabled: keep_prob = 1)
     loss = session.run(cost, feed_dict = {
         x:feature_batch,
         y:label_batch,
         keep_prob:1.
     })
     # valid_features and valid_labels are assumed to hold the preprocessed
     # validation set, loaded beforehand (the loading code is not shown in this post)
     valid_acc = session.run(accuracy,feed_dict = {
         x:valid_features,
         y:valid_labels,
         keep_prob:1.
     })
     print('Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                 loss,
                 valid_acc))

Training the model

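 save_model_path = './image_classification'

 # NOTE: `helper` is the author's data-loading module and `train_neural_network`
 # is a training-step function; neither is shown in the original post. The
 # definition below is a minimal assumed version that runs one optimizer step.
 def train_neural_network(session, optimizer, keep_probability, feature_batch, label_batch):
     session.run(optimizer, feed_dict={
         x: feature_batch,
         y: label_batch,
         keep_prob: keep_probability
     })

 print('Training...')
 with tf.Session() as sess:
     # Initializing the variables
     sess.run(tf.global_variables_initializer())

     # Training cycle
     for epoch in range(epochs):
         # Loop over all batches (CIFAR-10 ships its training set as 5 batch files)
         n_batches = 5
         for batch_i in range(1, n_batches + 1):
             for batch_features, batch_labels in helper.load_preprocess_training_batch(batch_i, batch_size):
                 train_neural_network(sess, optimizer, keep_probability, batch_features, batch_labels)
             print('Epoch {:>2}, CIFAR-10 Batch {}:  '.format(epoch + 1, batch_i), end='')
             print_stats(sess, batch_features, batch_labels, cost, accuracy)

     # Save Model
     saver = tf.train.Saver()
     save_path = saver.save(sess, save_model_path)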
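
The inner loop above iterates over mini-batches produced by helper.load_preprocess_training_batch, whose implementation is not shown in the post. Presumably it yields (features, labels) pairs of at most batch_size samples each. A generator along these lines would do that; the name `batch_features_labels` and the toy data are assumptions for illustration only.

```python
def batch_features_labels(features, labels, batch_size):
    """Yield consecutive (features, labels) slices of at most batch_size items."""
    for start in range(0, len(features), batch_size):
        end = min(start + batch_size, len(features))
        yield features[start:end], labels[start:end]

features = list(range(10))
labels = [i % 2 for i in range(10)]
batch_sizes = [len(f) for f, _ in batch_features_labels(features, labels, 4)]
print(batch_sizes)  # [4, 4, 2]
```

The last batch is smaller when the number of samples is not a multiple of the batch size, which is why the placeholders use None for the batch dimension.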

Here is the validation accuracy at the end of my training run:

Epoch 29, CIFAR-10 Batch 4:  Loss:     0.0139 Validation Accuracy: 0.625600
Epoch 29, CIFAR-10 Batch 5:  Loss:     0.0090 Validation Accuracy: 0.631000
Epoch 30, CIFAR-10 Batch 1:  Loss:     0.0138 Validation Accuracy: 0.638800
Epoch 30, CIFAR-10 Batch 2:  Loss:     0.0192 Validation Accuracy: 0.627400
Epoch 30, CIFAR-10 Batch 3:  Loss:     0.0055 Validation Accuracy: 0.633400
Epoch 30, CIFAR-10 Batch 4:  Loss:     0.0114 Validation Accuracy: 0.641800
Epoch 30, CIFAR-10 Batch 5:  Loss:     0.0050 Validation Accuracy: 0.647400

Not bad: about 64% on the validation set, well above 50%, whereas random guessing would only get 10%.

Of course, the model can be improved further, for example by choosing better hyperparameters or by adding other techniques.

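This page collects the results that people have achieved on this dataset: http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130

The best result listed is already 96.53%, so take a look at how the experts did it.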
