Training Caffe on Your Own Dataset
This guide assumes Caffe has already been built, including the pycaffe Python interface.
1 Data Preparation
First, prepare the training and validation sets. Here we use two classes of data, placed in folders 0 and 1 (naming the folders after the class labels makes annotation trivial: the folder name itself is the label). Training set: /data/train/0 and /data/train/1; validation set: /data/val/0 and /data/val/1.
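The directory layout should look like the sketch below (the image filenames are just placeholders):

data/
├── train/
│   ├── 0/    # class-0 training images
│   └── 1/    # class-1 training images
└── val/
    ├── 0/    # class-0 validation images
    └── 1/    # class-1 validation images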
Once the data is in place, create txt files that list each image file together with its label.
(1) Create train.txt for the training set
import os

f = open(r'train.txt', "w")
path = os.getcwd() + '/data/train/'
for filename in os.listdir(path):            # one subfolder per class: 0, 1
    count = 0
    for file in os.listdir(path + filename):
        count = count + 1
        ff = '/' + filename + "/" + file + " " + filename + "\n"
        f.write(ff)                          # line format: /class_folder/file label
    print '{} class: {}'.format(filename, count)
f.close()
(2) Create val.txt for the validation set
import os

f = open(r'val.txt', "w")
path = os.getcwd() + '/data/val/'
for filename in os.listdir(path):            # one subfolder per class: 0, 1
    count = 0
    for file in os.listdir(path + filename):
        count = count + 1
        ff = '/' + filename + "/" + file + " " + filename + "\n"
        f.write(ff)                          # line format: /class_folder/file label
    print '{} class: {}'.format(filename, count)
f.close()
Note: each line in the txt file has the form /class_folder/filename, then a space (it must be a space, not a tab), then the label.
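For example, with the layout above, train.txt ends up looking like this (the filenames are illustrative):

/0/img_0001.jpg 0
/0/img_0002.jpg 0
/1/img_0001.jpg 1
/1/img_0002.jpg 1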
2 Creating the LMDB Files
Create createlmdb.sh, which uses convert_imageset (shipped with Caffe under build/tools) to build the LMDB files. The main things to get right are the locations of the image data and of the txt files generated in the previous step, plus the RESIZE dimensions: here the images are resized to 227×227 to match the network input used during training and testing later. Everything else is just a matter of paths.
#!/usr/bin/env sh

CAFFE_ROOT=/home/caf/object/caffe
TOOLS=$CAFFE_ROOT/build/tools
TRAIN_DATA_ROOT=/home/caf/wk/learn/data/train
VAL_DATA_ROOT=/home/caf/wk/learn/data/val
DATA=/home/caf/wk/learn/data
EXAMPLE=/home/caf/wk/learn/data/lmdb

# Set RESIZE=true to resize the images to 227x227. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=227
  RESIZE_WIDTH=227
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_face_48.sh to the path" \
       "where the face_48 training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_face_48.sh to the path" \
       "where the face_48 validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
  --resize_height=$RESIZE_HEIGHT \
  --resize_width=$RESIZE_WIDTH \
  --shuffle \
  $TRAIN_DATA_ROOT \
  $DATA/train.txt \
  $EXAMPLE/face_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
  --resize_height=$RESIZE_HEIGHT \
  --resize_width=$RESIZE_WIDTH \
  --shuffle \
  $VAL_DATA_ROOT \
  $DATA/val.txt \
  $EXAMPLE/face_val_lmdb

echo "Done."
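To make sure the conversion worked, here is a minimal sketch that opens the new LMDB and decodes its first record (it assumes the lmdb Python package is installed alongside pycaffe):

import lmdb
import sys
sys.path.insert(0, '/home/caf/object/caffe/python')
import caffe

env = lmdb.open('/home/caf/wk/learn/data/lmdb/face_train_lmdb', readonly=True)
with env.begin() as txn:
    key, value = txn.cursor().iternext().next()   # first (key, value) pair
    datum = caffe.proto.caffe_pb2.Datum()         # convert_imageset stores Datum protos
    datum.ParseFromString(value)
    print key, 'label:', datum.label, 'shape:', (datum.channels, datum.height, datum.width)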
3 Defining the Network
Caffe takes network definitions as prototxt files, and the prototxt syntax is documented in detail elsewhere. This experiment uses AlexNet, saved as train_val.prototxt. The numeric settings below are the standard Caffe AlexNet values; the only changes are fc8's num_output, which is 2 for our two-class task, and the batch sizes (256 train / 50 test), which can be reduced for small datasets.
name: "AlexNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
data_param {
source: "/home/caf/wk/learn/data/lmdb/face_train_lmdb"
batch_size:
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
data_param {
source: "/home/caf/wk/learn/data/lmdb/face_val_lmdb"
batch_size:
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
kernel_size:
stride:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value:
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size:
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size:
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value:
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value:
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc8"
top: "prob"
}
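As a quick sanity check, the sketch below loads the freshly written train_val.prototxt through pycaffe and prints each blob's shape (it assumes the LMDBs from step 2 already exist, since the data layers open them):

import sys
sys.path.insert(0, '/home/caf/object/caffe/python')
import caffe

net = caffe.Net('train_val.prototxt', caffe.TEST)   # TEST phase reads face_val_lmdb
for name, blob in net.blobs.items():
    print name, blob.data.shape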
Next, create the solver file slover.prototxt, which defines the training hyperparameters: total iterations, how often to snapshot the model, the learning rate, and so on. net points at the network defined above; the same network is used for both training and testing. The small iteration counts below are example values suited to a toy two-class dataset; with snapshot: 100 and max_iter: 100, training writes model/_iter_100.caffemodel, which is used later.
net: "train_val.prototxt"
test_iter:
test_interval:
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize:
display:
max_iter:
momentum: 0.9
weight_decay: 0.005
solver_mode: GPU
snapshot:
snapshot_prefix: "model/"
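A useful rule of thumb: test_iter multiplied by the TEST data layer's batch_size should cover the whole validation set. A small sketch to compute it, assuming the layout from step 1 (test_batch_size must match the batch_size you set in the TEST data layer):

import os

path = os.getcwd() + '/data/val/'
num_val = sum(len(os.listdir(path + d)) for d in os.listdir(path))
test_batch_size = 50   # must match batch_size in the TEST data layer
print 'suggested test_iter:', (num_val + test_batch_size - 1) // test_batch_size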
4 Training the Model
Create train.sh and train on the GPU; CPU training would be far too slow!
#!/usr/bin/env sh
CAFFE_ROOT=/home/caf/object/caffe
SLOVER_ROOT=/home/caf/wk/learn
$CAFFE_ROOT/build/tools/caffe train --solver=$SLOVER_ROOT/slover.prototxt --gpu=0
Training writes caffemodel files into the model folder; these are what you load later for image classification and other tasks.
5 Testing
Create deploy.prototxt for testing. It is the same network as the training one, except that a network used purely for classification no longer needs the training-specific parts, so a separate model file is defined; test images are classified through this model.
deploy.prototxt differs from train_val.prototxt in three ways:
(1) The input no longer comes from LMDB, and there is no train/test split; the input layer's type is Input, and its declared dimensions must match the training data (227×227), otherwise Caffe reports an error;
(2) weight_filler and bias_filler are removed; those parameters already live in the caffemodel, which initializes the network.
(3) The final Accuracy and loss layers are removed and replaced with a Softmax layer, which outputs the probability of each class.
name: "AlexNet"
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: dim: dim: dim: } }
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
kernel_size:
stride:
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size:
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size:
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
convolution_param {
num_output:
pad:
kernel_size:
group:
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size:
stride:
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
param {
lr_mult:
decay_mult:
}
param {
lr_mult:
decay_mult:
}
inner_product_param {
num_output:
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc8"
top: "prob"
}
The Python code for classification uses Caffe's Python interface; the main things to set are the paths to your trained weights, the model definition, and the mean file.
import numpy as np
import matplotlib.pyplot as plt
import sys

caffe_root = "/home/caf/object/caffe/"
sys.path.insert(0, caffe_root + 'python')
import caffe

caffe.set_device(0)
caffe.set_mode_gpu()
model_def = 'deploy.prototxt'
model_weights = 'model/_iter_100.caffemodel'
net = caffe.Net(model_def,      # network architecture
                model_weights,  # learned weights
                caffe.TEST)     # inference mode
# ImageNet mean, averaged over pixels to get per-channel BGR means
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)
#print 'mean-subtracted values:', zip('BGR', mu)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
transformer.set_mean('data', mu)                 # subtract per-channel mean
transformer.set_raw_scale('data', 255)           # [0,1] -> [0,255]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
net.blobs['data'].reshape(1, 3, 227, 227)        # batch of one 227x227 image
image = caffe.io.load_image('test.jpg')
transformed_image = transformer.preprocess('data', image)
#plt.imshow(image)
#plt.show()
net.blobs['data'].data[...] = transformed_image
output = net.forward()
output_prob = output['prob']
print output_prob
print 'predicted class is:', output_prob.argmax()
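Since the classes were labeled with the folder names, the argmax index maps straight back to a folder; a small follow-on sketch:

classes = ['0', '1']   # the class-folder names from step 1
print 'predicted folder:', classes[output_prob.argmax()]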
Problems Encountered
(1) The label file must use spaces, not tabs; with tabs the data files will not be found.
(2) CUDA problems: an error mentioning something like cudaSuccess means the GPU is out of memory and needs to be freed. Use nvidia-smi to see which process is occupying too much GPU memory, then terminate it with kill -9 PID.
(3) Depending on the Caffe version, layers are defined with either layer or layers. With layer, the type is a quoted string; with layers, the type is unquoted and written in all capital letters.
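For example, the same ReLU layer in the two formats:

layer {            # new format: type is a quoted string
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layers {           # old format: type is an unquoted all-caps enum
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}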