[Machine Learning] Image classification with a CNN, plus model conversion

These are just notes for now; I'll tidy them up later.

Training script:

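# Note: this script uses the TensorFlow 1.x API (tf.placeholder, tf.Session, tf.contrib)
from skimage import io, transform  # skimage's io and transform (image warping and resizing) modules
import glob  # file wildcard matching
import os  # file and directory handling
import tensorflow as tf
import numpy as np  # multi-dimensional array handling
import time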

# Dataset location
path = 'E:/tensor_data/powerpoint/test_database/'
# Where the model is saved
model_path = 'E:/tensor_data/powerpoint/model/fc_model.ckpt'
# Resize every image to 100*100
w = 100
h = 100
c = 3
print("Start reading images and preprocessing data")
# Read images + preprocess

def read_img(path):
    # os.listdir(path) returns the names of the files and folders inside path
    # os.path.isdir(path) checks whether path is a directory
    # b = [x+x for x in list1 if x+x<15 ]  list comprehension: loop over list1 and append x+x to b whenever the condition holds
    print(os.listdir(path))
    '''for x in os.listdir(path):
        if os.path.isdir(path+x):
           print(x)'''
    cate = [path + x for x in os.listdir(path) if os.path.isdir(path + x)]
    print("Dataset location: " + path)
    imgs = []
    labels = []
    for idx, folder in enumerate(cate):
        # glob.glob(s+'*.py') builds a file list from a wildcard search in a directory
        for im in glob.glob(folder + '/*.jpg'):
            # Print the name of the image being read
            print('reading the images:%s' % (im))
            # io.imread(im) reads a single RGB image; skimage.io.imread(fname, as_grey=True) reads it as grayscale
            # (the keyword is spelled as_gray in newer scikit-image releases)
            img = io.imread(im)
            # skimage.transform.resize(image, output_shape) changes the image size
            img = transform.resize(img, (w, h))
            # Append the image data to the imgs list
            imgs.append(img)
            # Append the image's label to labels, at the same index as in imgs
            labels.append(idx)
    # Convert the images and labels to numpy ndarrays (N-dimensional arrays)
    return np.asarray(imgs, np.float32), np.asarray(labels, np.int32)

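# Call read_img to get the image and label arrays
data, label = read_img(path)
# Shuffle
# The first dimension of data is the number of images
num_example = data.shape[0]
# Index sequence 0..num_example-1 with step 1
arr = np.arange(num_example)
# Shuffle the indices in place
np.random.shuffle(arr)
# Reorder data and labels with the shuffled indices
data = data[arr]
label = label[arr]
# Split everything into a training set and a validation set
ratio = 0.8
s = int(num_example * ratio)  # np.int was removed in NumPy 1.24; the built-in int does the same job here
x_train = data[:s]
y_train = label[:s]
x_val = data[s:]
y_val = label[s:]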

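# -----------------Build the network----------------------
# The CNN has 7 layers: the first four are convolutional layers and the last three are fully connected layers; each convolutional layer consists of convolution, activation and pooling
# Placeholders define the shape and dtype of the inputs
x = tf.placeholder(tf.float32, shape=[None, w, h, c], name='x')
y_ = tf.placeholder(tf.int32, shape=[None, ], name='y_')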

def inference(input_tensor, train, regularizer):
    # -----------------------Layer 1----------------------------
    with tf.variable_scope('layer1-conv1'):
        # conv1_weights is a saveable variable: 5x5 kernels, 3 input channels (RGB), 32 filters
        conv1_weights = tf.get_variable("weight", [5, 5, 3, 32],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        # conv1_biases: one bias per filter, 32 in total
        conv1_biases = tf.get_variable("bias", [32], initializer=tf.constant_initializer(0.0))
        # Convolution: tf.nn.conv2d is TensorFlow's built-in 2-D convolution, input_tensor is the input data,
        # conv1_weights holds the kernels, strides=[1, 1, 1, 1] moves the window one pixel at a time,
        # and padding='SAME' zero-pads so that the output keeps the input's height and width
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        # Activation: TensorFlow's relu
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))
    with tf.name_scope("layer2-pool1"):
        # Max pooling: ksize=[1, 2, 2, 1] with strides=[1, 2, 2, 1] takes the maximum of each 2x2 window,
        # halving the height and width; padding="VALID" adds no padding
        pool1 = tf.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")

    # -----------------------Layer 2----------------------------
    with tf.variable_scope("layer3-conv2"):
        # Same as above with different parameters, set to match the previous output and the number of channels
        conv2_weights = tf.get_variable("weight", [5, 5, 32, 64],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable("bias", [64], initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))
    with tf.name_scope("layer4-pool2"):
        pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    # -----------------------Layer 3----------------------------
    # Same as above with different parameters, set to match the previous output and the number of channels
    with tf.variable_scope("layer5-conv3"):
        conv3_weights = tf.get_variable("weight", [3, 3, 64, 128],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv3_biases = tf.get_variable("bias", [128], initializer=tf.constant_initializer(0.0))
        conv3 = tf.nn.conv2d(pool2, conv3_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu3 = tf.nn.relu(tf.nn.bias_add(conv3, conv3_biases))
    with tf.name_scope("layer6-pool3"):
        pool3 = tf.nn.max_pool(relu3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
    # -----------------------Layer 4----------------------------
    # Same as above with different parameters, set to match the previous output and the number of channels
    with tf.variable_scope("layer7-conv4"):
        conv4_weights = tf.get_variable("weight", [3, 3, 128, 128],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv4_biases = tf.get_variable("bias", [128], initializer=tf.constant_initializer(0.0))
        conv4 = tf.nn.conv2d(pool3, conv4_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu4 = tf.nn.relu(tf.nn.bias_add(conv4, conv4_biases))
    with tf.name_scope("layer8-pool4"):
        pool4 = tf.nn.max_pool(relu4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
        # Four 2x2 poolings shrink each side 100 -> 50 -> 25 -> 12 -> 6, with 128 channels
        nodes = 6 * 6 * 128
        # Flatten the feature maps into one vector per image
        reshaped = tf.reshape(pool4, [-1, nodes])

    # -----------------------Layer 5---------------------------
    with tf.variable_scope('layer9-fc1'):
        # Parameters of the fully connected layer, with 1024 hidden units
        fc1_weights = tf.get_variable("weight", [nodes, 1024],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc1_weights))  # L2 penalty on the weight matrix
        fc1_biases = tf.get_variable("bias", [1024], initializer=tf.constant_initializer(0.1))
        # relu as the activation function
        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
        # Dropout (keep probability 0.5) to reduce overfitting; only added when train is True
        if train:
            fc1 = tf.nn.dropout(fc1, 0.5)
    # -----------------------Layer 6----------------------------
    with tf.variable_scope('layer10-fc2'):
        # Same as above with different sizes: 1024 inputs, 512 hidden units
        fc2_weights = tf.get_variable("weight", [1024, 512],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [512], initializer=tf.constant_initializer(0.1))
        fc2 = tf.nn.relu(tf.matmul(fc1, fc2_weights) + fc2_biases)
        if train:
            fc2 = tf.nn.dropout(fc2, 0.5)
    # -----------------------Layer 7----------------------------
    with tf.variable_scope('layer11-fc3'):
        # Same as above with different sizes: 512 inputs, 5 outputs (one per class)
        fc3_weights = tf.get_variable("weight", [512, 5],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc3_weights))
        fc3_biases = tf.get_variable("bias", [5], initializer=tf.constant_initializer(0.1))
        # matmul is matrix multiplication; the op is named "output" so it can be found by name later
        logit = tf.add(tf.matmul(fc2, fc3_weights), fc3_biases, name="output")
    # Return the final result
    return logit

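# ---------------------------End of network---------------------------
# L2 regularization factor of 0.0001
regularizer = tf.contrib.layers.l2_regularizer(0.0001)
# Build the network defined above; train=False means no dropout ops are added to the graph
logits = inference(x, False, regularizer)
# (Small trick) multiply logits by 1 and name the result logits_eval, so the output tensor can be fetched by name when the model is loaded later
b = tf.constant(value=1, dtype=tf.float32)
logits_eval = tf.multiply(logits, b, name='logits_eval')  # b is 1
# Loss function, the quantity minimized during training: the lower the loss, the better the fit.
# sparse_softmax_cross_entropy_with_logits returns one loss per example, so it is averaged over the batch.
# The L2 terms collected in 'losses' are not added to the loss here.
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y_))
# Overall learning rate of 0.001
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
# Prediction accuracy
correct_prediction = tf.equal(tf.cast(tf.argmax(logits, 1), tf.int32), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))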

# Generator that yields the data batch by batch (a final partial batch is dropped)

def minibatches(inputs=None, targets=None, batch_size=None, shuffle=False):

    assert len(inputs) == len(targets)

    if shuffle:

        indices = np.arange(len(inputs))

        np.random.shuffle(indices)

    for start_idx in range(0, len(inputs) - batch_size + 1, batch_size):

        if shuffle:

            excerpt = indices[start_idx:start_idx + batch_size]

        else:

            excerpt = slice(start_idx, start_idx + batch_size)

        yield inputs[excerpt], targets[excerpt]

# Train and validate; n_epoch can be set higher
# Number of epochs
n_epoch = 20
# Number of images fed in per step
batch_size = 64
saver = tf.train.Saver(max_to_keep=4)  # max_to_keep=4 keeps only the 4 most recent checkpoints

with tf.Session() as sess:
    # Initialize all variables
    sess.run(tf.global_variables_initializer())
    # Training loop; everything used here was defined above
    for epoch in range(n_epoch):
        start_time = time.time()
        # Training set
        train_loss, train_acc, n_batch = 0, 0, 0
        for x_train_a, y_train_a in minibatches(x_train, y_train, batch_size, shuffle=True):
            _, err, ac = sess.run([train_op, loss, acc], feed_dict={x: x_train_a, y_: y_train_a})
            train_loss += err
            train_acc += ac
            n_batch += 1
            # Running averages over the batches seen so far in this epoch
            print("   train loss: %f" % (np.sum(train_loss) / n_batch))
            print("   train acc: %f" % (np.sum(train_acc) / n_batch))
        # Validation set
        val_loss, val_acc, n_batch = 0, 0, 0
        for x_val_a, y_val_a in minibatches(x_val, y_val, batch_size, shuffle=False):
            err, ac = sess.run([loss, acc], feed_dict={x: x_val_a, y_: y_val_a})
            val_loss += err
            val_acc += ac
            n_batch += 1
            print("   validation loss: %f" % (np.sum(val_loss) / n_batch))
            print("   validation acc: %f" % (np.sum(val_acc) / n_batch))
        # Save the model and its parameters every second epoch
        if epoch % 2 == 0:
            saver.save(sess, model_path, global_step=epoch)
            print(sess.graph.name_scope)
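The value nodes = 6 * 6 * 128 in the training script comes from the four conv + pool blocks applied to the 100x100 input. A minimal sketch of that arithmetic in plain Python, assuming the usual TensorFlow size rules for 'SAME' and 'VALID' padding; conv_same, pool_valid and flattened_nodes are hypothetical helper names used only for illustration:

```python
def conv_same(size, stride=1):
    # 'SAME' padding: output size is ceil(input / stride)
    return -(-size // stride)


def pool_valid(size, ksize=2, stride=2):
    # 'VALID' padding: output size is floor((input - ksize) / stride) + 1
    return (size - ksize) // stride + 1


def flattened_nodes(side=100, blocks=4, channels=128):
    """Track the side length through `blocks` conv+pool blocks."""
    sizes = [side]
    for _ in range(blocks):
        side = pool_valid(conv_same(side))
        sizes.append(side)
    return sizes, side * side * channels


sizes, nodes = flattened_nodes()
print(sizes)  # side length after each block
print(nodes)  # length of the flattened vector fed to the first fully connected layer
```

If w, h, the number of blocks, or the last filter count changes, nodes in the script has to change with it, otherwise tf.reshape will fail or silently regroup the data.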

Test script:

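from skimage import io, transform
import tensorflow as tf
import numpy as np
import os  # file and directory handling
import glob  # file wildcard matching
# This script runs a simple prediction on a handful of images (5 here); to predict on more data, read it the same way as in cnn.py
path = 'E:/tensor_data/powerpoint/test_powerpoint/'
# Maps each class index to its class name
flower_dict = {0: 'other', 1: 'document', 2: 'slide', 3: 'blackboard', 4: 'class that should never occur'}
w = 100
h = 100
c = 3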

# Read images + preprocess
def read_img(path):
    # os.listdir(path) returns the names of the files and folders inside path
    # os.path.isdir(path) checks whether path is a directory
    # b = [x+x for x in list1 if x+x<15 ]  list comprehension: loop over list1 and append x+x to b whenever the condition holds
    cate = [path + x for x in os.listdir(path) if os.path.isdir(path + x)]
    imgs = []
    for idx, folder in enumerate(cate):
        # glob.glob(s+'*.py') builds a file list from a wildcard search in a directory
        for im in glob.glob(folder + '/*.jpg'):
            # Print the name of the image being read
            print('reading the images:%s' % (im))
            # io.imread(im) reads a single RGB image; skimage.io.imread(fname, as_grey=True) reads it as grayscale
            img = io.imread(im)
            # skimage.transform.resize(image, output_shape) changes the image size
            img = transform.resize(img, (w, h))
            # Append the image data to the imgs list
            imgs.append(img)
            # No labels are collected at prediction time
        # labels.append(idx)
    # Convert the images to a numpy ndarray (N-dimensional array)
    return np.asarray(imgs, np.float32)

# Call read_img to get the image array
data = read_img(path)

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('E:/tensor_data/powerpoint/model/fc_model.ckpt-18.meta')
    saver.restore(sess, tf.train.latest_checkpoint('E:/tensor_data/powerpoint/model/'))
    # sess is the current session; the saved values are loaded into it
    graph = tf.get_default_graph()
    x = graph.get_tensor_by_name("x:0")
    # All images read above are fed in as a single batch
    feed_dict = {x: data}
    # Fetch the output tensor by the name it was given in the training script
    logits = graph.get_tensor_by_name("logits_eval:0")
    classification_result = sess.run(logits, feed_dict)
    # Print the prediction matrix
    print(classification_result)
    # Index of the largest value in each row of the prediction matrix
    # (np.argmax avoids adding a new tf.argmax op to the graph on every call)
    output = np.argmax(classification_result, axis=1)
    print(output)
    # Look up the class for each index in the dictionary
    for i in range(len(output)):
        print("Image", i + 1, "prediction: " + flower_dict[output[i]])
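The indices in flower_dict only mean something if they match the labels assigned during training, and those come from enumerate over os.listdir, whose order Python does not guarantee. A minimal sketch of one way to pin the mapping down, assuming the class folders are sorted by name; this is a swapped-in convention rather than what the scripts above do, and class_index_map plus the folder names in the demo are hypothetical:

```python
import os
import tempfile


def class_index_map(root):
    """Map integer labels to class sub-folder names, in sorted order."""
    folders = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {idx: name for idx, name in enumerate(folders)}


# Demo on a throwaway directory with made-up class folders
with tempfile.TemporaryDirectory() as root:
    for name in ("slide", "blackboard", "document"):
        os.mkdir(os.path.join(root, name))
    # A stray file at the top level is ignored, as in read_img
    open(os.path.join(root, "notes.txt"), "w").close()
    demo_map = class_index_map(root)

print(demo_map)
```

For this to help, the training script would need to build its labels from the same sorted list, so that the dictionary used for prediction can be generated instead of typed by hand.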

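The model produced here is in ckpt format. The reference CNN code did not name its input and output nodes, so I named the output directly in layer 11 of the source (name="output").

Netron is a quick way to view the model structure and find the input and output node names.

You can also print names from code. Note that the snippet below lists the variables stored in the checkpoint (the weights and biases), not every node in the graph: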

import tensorflow as tf
from tensorflow.python import pywrap_tensorflow  # provides NewCheckpointReader (TensorFlow 1.x)

checkpoint_path = 'E:/tensor_data/powerpoint/model/fc_model.ckpt-18'
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()
for key in var_to_shape_map:
    print('tensor_name: ', key)
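The path fc_model.ckpt-18 used above follows from the training settings: a checkpoint is saved on every even epoch with global_step=epoch, and max_to_keep=4 keeps only the newest four. A minimal sketch that replays this bookkeeping in plain Python, assuming the saver simply drops the oldest checkpoint once the limit is reached; kept_checkpoints is a hypothetical helper, not a TensorFlow function:

```python
from collections import deque


def kept_checkpoints(n_epoch=20, save_every=2, max_to_keep=4,
                     prefix="fc_model.ckpt"):
    """Return the checkpoint names that remain after training."""
    kept = deque(maxlen=max_to_keep)  # the oldest entry falls out automatically
    for epoch in range(n_epoch):
        if epoch % save_every == 0:
            # saver.save(..., global_step=epoch) appends "-<epoch>" to the path
            kept.append("%s-%d" % (prefix, epoch))
    return list(kept)


remaining = kept_checkpoints()
print(remaining)
```

With n_epoch = 20 the last saved step is 18, which is why the later scripts point at fc_model.ckpt-18. Changing n_epoch or the save interval changes that suffix.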

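Once you have the output node name, a script can convert the ckpt model to pb format.

The first argument is the ckpt model path, the second is the output path for the pb model, and the third is the output node.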

import tensorflow as tf

def read_graph_from_ckpt(ckpt_path, out_pb_path, output_name):
    # Load the network structure from the .meta file
    saver = tf.train.import_meta_graph(ckpt_path + '.meta', clear_devices=True)
    graph = tf.get_default_graph()
    with tf.Session(graph=graph) as sess:
        sess.run(tf.global_variables_initializer())
        # Load the parameters from the ckpt
        saver.restore(sess, ckpt_path)
        output_tf = graph.get_tensor_by_name(output_name)
        # Freeze the graph: convert the variables to constants
        pb_graph = tf.graph_util.convert_variables_to_constants(sess, graph.as_graph_def(), [output_tf.op.name])
        # Save
        with tf.gfile.FastGFile(out_pb_path, mode='wb') as f:
            f.write(pb_graph.SerializeToString())

read_graph_from_ckpt('E:/tensor_data/powerpoint/model/fc_model.ckpt-18', 'E:/tensor_data/powerpoint/model/idcard_seg.pb', 'layer11-fc3/output:0')
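The call above passes 'layer11-fc3/output:0', a tensor name, while the freezing step uses output_tf.op.name, which is the same string without the ':0'. A minimal sketch of that naming convention in plain Python, assuming the usual TensorFlow 1.x form "op_name:output_index"; split_tensor_name is a hypothetical helper for illustration only:

```python
def split_tensor_name(tensor_name):
    """Split 'op_name:index' into (op_name, index); a bare op name gets index 0."""
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep:
        # No colon at all: treat the whole string as the op name
        return tensor_name, 0
    return op_name, int(index)


print(split_tensor_name("layer11-fc3/output:0"))
print(split_tensor_name("x"))
```

get_tensor_by_name expects the form with the index, whereas lists of output node names (for freezing or for converters) take the plain op name.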

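With the pb model in hand, the structure is much clearer when viewed in Netron.

Since I trained this model for use on a phone, the pb model still has to be converted to tflite format.

The official documentation already provides a Python interface for the conversion, so I just used it.

input is the input node and output is the output node; look them up in Netron.

The generated tflite file is written to the current working directory (the project root if you run the script from there).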

import tensorflow as tf

graph_def_file = "E:/tensor_data/powerpoint/model/idcard_seg.pb"

input_arrays = ["x"]

output_arrays = ["layer11-fc3/output"]

converter = tf.lite.TFLiteConverter.from_frozen_graph(

  graph_def_file, input_arrays, output_arrays)

tflite_model = converter.convert()

# Write the converted model; the with block makes sure the file is closed
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)
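One thing that may matter once the tflite model runs on the phone: as far as I can tell, skimage's transform.resize with default arguments returns floating-point values scaled to [0, 1] for 8-bit images, so the network above was trained on inputs in that range. A minimal sketch of the matching scaling in plain Python; scale_pixels is a hypothetical helper, and a real app would do the equivalent in its own language:

```python
def scale_pixels(pixels):
    """Convert 8-bit channel values (0..255) to floats in [0, 1]."""
    return [value / 255.0 for value in pixels]


rgb = [0, 128, 255]  # one made-up pixel
scaled = scale_pixels(rgb)
print(scaled)
```

If the app feeds raw 0..255 values instead, the predictions will probably not match what the test script produces on the same image.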

By the way, I haven't tested the model's accuracy at all yet, emmm, let's just try it and see!
