Caffe's layer_factory
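While timing the individual layers of a neural network, I ran into a very strange result: switching between Caffe's own GPU implementation and the cuDNN implementation changed convolution performance dramatically, yet pooling performance barely moved. After finding time to read the code, I traced the cause to the layer_factory. This post covers the following:
1. The factory pattern
2. layer_factory in detail
3. The pitfall in layer_factory
4. Impact analysis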
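1. The factory pattern
The factory pattern is a design pattern for situations where the code cannot know in advance which class it will need to instantiate, and where the system should not depend on the details of how product classes are created, composed and represented. Its weakness is extensibility: in its simple form every new product means changing the factory, so it fits best in projects whose set of products rarely grows.
The factory pattern has three roles:
Factory: produces a concrete product according to some logic
Abstract product: the parent of the concrete products, usually implemented as an interface in Java or an abstract class in C++
Concrete product: the product instance that is actually created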
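The three roles can be sketched in a few lines of C++. This is only an illustration: Op, CpuOp, GpuOp and MakeOp are hypothetical names and are not part of Caffe.

```cpp
#include <memory>
#include <string>

// Abstract product: the interface every concrete operator implements.
class Op {
 public:
  virtual ~Op() {}
  virtual std::string Backend() const = 0;
};

// Concrete products (hypothetical names, for illustration only).
class CpuOp : public Op {
 public:
  std::string Backend() const override { return "cpu"; }
};

class GpuOp : public Op {
 public:
  std::string Backend() const override { return "gpu"; }
};

// Factory: picks the concrete product at run time.
// Returns a null pointer for an unknown backend name.
std::unique_ptr<Op> MakeOp(const std::string& backend) {
  if (backend == "cpu") return std::unique_ptr<Op>(new CpuOp());
  if (backend == "gpu") return std::unique_ptr<Op>(new GpuOp());
  return std::unique_ptr<Op>();
}
```

The caller only ever sees the Op interface. Supporting a third backend means adding a class and one more branch in MakeOp, which is the extensibility cost mentioned above.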
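2. layer_factory in detail
Caffe 1.0 has three families of operators: the CPU version, Caffe's own CUDA version, and the cuDNN version. The layer_factory files are responsible for assembling Caffe's operators. Here the factory pattern means that, at run time, the version of each operator is chosen according to the user's settings.
The following is adapted from http://zhuanlan.zhihu.com/hacker-and-painter/20456649
layer_factory.hpp is the header file of layer_factory: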
- /**
- * @brief A layer factory that allows one to register layers.
- * During runtime, registered layers could be called by passing a LayerParameter
- * protobuffer to the CreateLayer function:
- *
- * LayerRegistry<Dtype>::CreateLayer(param);
- *
- * There are two ways to register a layer. Assuming that we have a layer like:
- *
- * template <typename Dtype>
- * class MyAwesomeLayer : public Layer<Dtype> {
- * // your implementations
- * };
- *
- * and its type is its C++ class name, but without the "Layer" at the end
- * ("MyAwesomeLayer" -> "MyAwesome").
- *
- * If the layer is going to be created simply by its constructor, in your c++
- * file, add the following line:
- *
- * REGISTER_LAYER_CLASS(MyAwesome);
- *
- * Or, if the layer is going to be created by another creator function, in the
- * format of:
- *
- * template <typename Dtype>
- * Layer<Dtype*> GetMyAwesomeLayer(const LayerParameter& param) {
- * // your implementation
- * }
- *
- * (for example, when your layer has multiple backends, see GetConvolutionLayer
- * for a use case), then you can register the creator function instead, like
- *
- * REGISTER_LAYER_CREATOR(MyAwesome, GetMyAwesomeLayer)
- *
- * Note that each layer type should only be registered once.
- */
- #ifndef CAFFE_LAYER_FACTORY_H_
- #define CAFFE_LAYER_FACTORY_H_
- #include <map>
- #include <string>
- #include "caffe/common.hpp"
- #include "caffe/proto/caffe.pb.h"
- namespace caffe {
- template <typename Dtype>
- class Layer;
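- // LayerRegistry simply stores each layer type string together with its creator in a map, so that layers can be created flexibly by name. Its main job is registration.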
- template <typename Dtype>
- class LayerRegistry {
- public:
- // Creator is a function pointer type; the function returns a shared_ptr to Layer<Dtype>
- typedef shared_ptr<Layer<Dtype> > (*Creator)(const LayerParameter&);
- // CreatorRegistry maps a type string to its Creator
- typedef std::map<string, Creator> CreatorRegistry;
- static CreatorRegistry& Registry() {
- static CreatorRegistry* g_registry_ = new CreatorRegistry();
- return *g_registry_;
- }
- // Adds a creator.
- // Inserts the type string and its creator function pointer into the table
- static void AddCreator(const string& type, Creator creator) {
- CreatorRegistry& registry = Registry();
- CHECK_EQ(registry.count(type), 0)
- << "Layer type " << type << " already registered.";
- registry[type] = creator;
- }
- // Get a layer using a LayerParameter.
- // Creates a layer from the type given in the LayerParameter
- static shared_ptr<Layer<Dtype> > CreateLayer(const LayerParameter& param) {
- LOG(INFO) << "Creating layer " << param.name();
- // Read the type string from the parameter
- const string& type = param.type();
- // Check that a Creator is registered for this type
- CreatorRegistry& registry = Registry();
- CHECK_EQ(registry.count(type), 1) << "Unknown layer type: " << type
- << " (known types: " << LayerTypeList() << ")";
- // Call the Creator function registered for this layer type
- return registry[type](param);
- }
- private:
- // Layer registry should never be instantiated - everything is done with its
- // static variables.
- // Instantiation is disallowed because the class has only static members, hence the private constructor
- LayerRegistry() {}
- // Returns the list of registered layer types
- static string LayerTypeList() {
- // Get the registry
- CreatorRegistry& registry = Registry();
- string layer_types;
- // Walk the registry and append each type name to the layer_types string
- for (typename CreatorRegistry::iterator iter = registry.begin();
- iter != registry.end(); ++iter) {
- if (iter != registry.begin()) {
- layer_types += ", ";
- }
- layer_types += iter->first;
- }
- return layer_types;
- }
- };
- // LayerRegisterer
- // Helper whose constructor registers a layer;
- // it is used by the macros below
- template <typename Dtype>
- class LayerRegisterer {
- public:
- // Constructor of the layer registerer
- LayerRegisterer(const string& type,
- shared_ptr<Layer<Dtype> > (*creator)(const LayerParameter&)) {
- // LOG(INFO) << "Registering layer type: " << type;
- // Delegates to LayerRegistry::AddCreator to add the creator to the registry
- LayerRegistry<Dtype>::AddCreator(type, creator);
- }
- };
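- // For convenience the authors also provide macros for registering your own layer classes.
- // This one defines two static LayerRegisterer objects, g_creator_f_<type> and g_creator_d_<type> (float and double), which register creator<Dtype>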
- #define REGISTER_LAYER_CREATOR(type, creator) \
- static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>); \
- static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>) \
- /* Registers a layer class through its constructor; type is the class name without "Layer".
- For example, with type = Bias the macro generates roughly the following.
- A creator function that calls your class's constructor and returns the new instance:
- shared_ptr<Layer<Dtype> > Creator_BiasLayer(const LayerParameter& param)
- A static LayerRegisterer<float> named g_creator_f_Bias, which binds the type string "Bias" to the float creator in the registry:
- static LayerRegisterer<float> g_creator_f_Bias("Bias", Creator_BiasLayer<float>)
- A static LayerRegisterer<double> named g_creator_d_Bias, which does the same for double:
- static LayerRegisterer<double> g_creator_d_Bias("Bias", Creator_BiasLayer<double>)
- */
- #define REGISTER_LAYER_CLASS(type) \
- template <typename Dtype> \
- shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
- { \
- return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param)); \
- } \
- REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)
- } // namespace caffe
- #endif // CAFFE_LAYER_FACTORY_H_
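The self-registration trick in this header can be reproduced in a stripped-down form. The sketch below is not Caffe code: it drops the Dtype template and the LayerParameter argument, and Registry, Registerer, REGISTER_DEMO_LAYER and the demo ReLULayer are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Simplified stand-in for caffe::Layer<Dtype>.
struct Layer {
  virtual ~Layer() {}
  virtual std::string Type() const = 0;
};

class Registry {
 public:
  typedef std::shared_ptr<Layer> (*Creator)();

  // Function-local static: built on first use, so it already exists when
  // another static object's constructor calls Add().
  static std::map<std::string, Creator>& Table() {
    static std::map<std::string, Creator> table;
    return table;
  }

  // Each type may be registered only once.
  static void Add(const std::string& type, Creator creator) {
    assert(Table().count(type) == 0 && "layer type already registered");
    Table()[type] = creator;
  }

  // Looks up the creator by name; returns a null pointer if none exists.
  static std::shared_ptr<Layer> Create(const std::string& type) {
    std::map<std::string, Creator>::const_iterator it = Table().find(type);
    return it == Table().end() ? std::shared_ptr<Layer>() : it->second();
  }
};

// A Registerer defined at namespace scope runs Add() before main() starts.
struct Registerer {
  Registerer(const std::string& type, Registry::Creator creator) {
    Registry::Add(type, creator);
  }
};

// Generates a creator function plus a static Registerer for <type>Layer.
#define REGISTER_DEMO_LAYER(type)                       \
  std::shared_ptr<Layer> Create_##type##Layer() {       \
    return std::shared_ptr<Layer>(new type##Layer());   \
  }                                                     \
  static Registerer g_registerer_##type(#type, Create_##type##Layer)

struct ReLULayer : Layer {
  std::string Type() const override { return "ReLU"; }
};
REGISTER_DEMO_LAYER(ReLU);
```

With this in place, Registry::Create("ReLU") returns a ReLULayer without the registry ever naming that class, which is the same decoupling that LayerRegistry<Dtype>::CreateLayer provides.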
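With the header covered, here is the implementation (this copy differs somewhat from the 1.0 release, but not in any way that matters here).
layer_factory.cpp: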
- // Make sure we include Python.h before any system header
- // to avoid _POSIX_C_SOURCE redefinition
- #ifdef WITH_PYTHON_LAYER
- #include <boost/python.hpp>
- #endif
- #include <string>
- #include "caffe/layer.hpp"
- #include "caffe/layer_factory.hpp"
- #include "caffe/proto/caffe.pb.h"
- #include "caffe/vision_layers.hpp"
- #ifdef WITH_PYTHON_LAYER
- #include "caffe/python_layer.hpp"
- #endif
- namespace caffe {
- // A function that returns a convolution layer instance
- // Get convolution layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetConvolutionLayer(
- const LayerParameter& param) {
- // Read from the parameter which engine to compute with: CUDNN, CAFFE or DEFAULT
- // engine is an enum, as can be seen in caffe.proto
- ConvolutionParameter_Engine engine = param.convolution_param().engine();
- if (engine == ConvolutionParameter_Engine_DEFAULT) {
- engine = ConvolutionParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = ConvolutionParameter_Engine_CUDNN;
- #endif
- }
- if (engine == ConvolutionParameter_Engine_CAFFE) {
- // Construct Caffe's own convolution layer directly
- return shared_ptr<Layer<Dtype> >(new ConvolutionLayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == ConvolutionParameter_Engine_CUDNN) {
- // Construct the cuDNN convolution layer
- return shared_ptr<Layer<Dtype> >(new CuDNNConvolutionLayer<Dtype>(param));
- #endif
- } else { // anything else is an error
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- // Register the convolution layer: the type name is Convolution and the creator is GetConvolutionLayer
- REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);
- // Creator for the pooling layer; same logic as for convolution
- // Get pooling layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetPoolingLayer(const LayerParameter& param) {
- PoolingParameter_Engine engine = param.pooling_param().engine();
- if (engine == PoolingParameter_Engine_DEFAULT) {
- engine = PoolingParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = PoolingParameter_Engine_CUDNN;
- #endif
- }
- if (engine == PoolingParameter_Engine_CAFFE) {
- return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == PoolingParameter_Engine_CUDNN) {
- PoolingParameter p_param = param.pooling_param();
- if (p_param.pad() || p_param.pad_h() || p_param.pad_w() ||
- param.top_size() > 1) {
- LOG(INFO) << "CUDNN does not support padding or multiple tops. "
- << "Using Caffe's own pooling layer.";
- return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
- }
- return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
- #endif
- } else {
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- // Register the pooling layer
- REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);
- // Register the ReLU layer
- // Get relu layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetReLULayer(const LayerParameter& param) {
- ReLUParameter_Engine engine = param.relu_param().engine();
- if (engine == ReLUParameter_Engine_DEFAULT) {
- engine = ReLUParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = ReLUParameter_Engine_CUDNN;
- #endif
- }
- if (engine == ReLUParameter_Engine_CAFFE) {
- return shared_ptr<Layer<Dtype> >(new ReLULayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == ReLUParameter_Engine_CUDNN) {
- return shared_ptr<Layer<Dtype> >(new CuDNNReLULayer<Dtype>(param));
- #endif
- } else {
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- REGISTER_LAYER_CREATOR(ReLU, GetReLULayer);
- // Register the sigmoid layer
- // Get sigmoid layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetSigmoidLayer(const LayerParameter& param) {
- SigmoidParameter_Engine engine = param.sigmoid_param().engine();
- if (engine == SigmoidParameter_Engine_DEFAULT) {
- engine = SigmoidParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = SigmoidParameter_Engine_CUDNN;
- #endif
- }
- if (engine == SigmoidParameter_Engine_CAFFE) {
- return shared_ptr<Layer<Dtype> >(new SigmoidLayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == SigmoidParameter_Engine_CUDNN) {
- return shared_ptr<Layer<Dtype> >(new CuDNNSigmoidLayer<Dtype>(param));
- #endif
- } else {
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- REGISTER_LAYER_CREATOR(Sigmoid, GetSigmoidLayer);
- // Register the softmax layer
- // Get softmax layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetSoftmaxLayer(const LayerParameter& param) {
- SoftmaxParameter_Engine engine = param.softmax_param().engine();
- if (engine == SoftmaxParameter_Engine_DEFAULT) {
- engine = SoftmaxParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = SoftmaxParameter_Engine_CUDNN;
- #endif
- }
- if (engine == SoftmaxParameter_Engine_CAFFE) {
- return shared_ptr<Layer<Dtype> >(new SoftmaxLayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == SoftmaxParameter_Engine_CUDNN) {
- return shared_ptr<Layer<Dtype> >(new CuDNNSoftmaxLayer<Dtype>(param));
- #endif
- } else {
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);
- // Register the tanh layer
- // Get tanh layer according to engine.
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetTanHLayer(const LayerParameter& param) {
- TanHParameter_Engine engine = param.tanh_param().engine();
- if (engine == TanHParameter_Engine_DEFAULT) {
- engine = TanHParameter_Engine_CAFFE;
- #ifdef USE_CUDNN
- engine = TanHParameter_Engine_CUDNN;
- #endif
- }
- if (engine == TanHParameter_Engine_CAFFE) {
- return shared_ptr<Layer<Dtype> >(new TanHLayer<Dtype>(param));
- #ifdef USE_CUDNN
- } else if (engine == TanHParameter_Engine_CUDNN) {
- return shared_ptr<Layer<Dtype> >(new CuDNNTanHLayer<Dtype>(param));
- #endif
- } else {
- LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
- }
- }
- REGISTER_LAYER_CREATOR(TanH, GetTanHLayer);
- // Register the Python layer
- #ifdef WITH_PYTHON_LAYER
- template <typename Dtype>
- shared_ptr<Layer<Dtype> > GetPythonLayer(const LayerParameter& param) {
- Py_Initialize();
- try {
- bp::object module = bp::import(param.python_param().module().c_str());
- bp::object layer = module.attr(param.python_param().layer().c_str())(param);
- return bp::extract<shared_ptr<PythonLayer<Dtype> > >(layer)();
- } catch (bp::error_already_set) {
- PyErr_Print();
- throw;
- }
- }
- REGISTER_LAYER_CREATOR(Python, GetPythonLayer);
- #endif
- // Layers that use their constructor as their default creator should be
- // registered in their corresponding cpp files. Do not register them here.
- } // namespace caffe
3. The pitfall in layer_factory
In the current code, the creator for the pooling layer contains this:
- // CuDNN assumes layers are not being modified in place, thus
- // breaking our index tracking for updates in some cases in Caffe.
- // Until there is a workaround in Caffe (index management) or
- // cuDNN, use Caffe layer to max pooling, or don't use in place
- // layers after max pooling layers
- if (param.pooling_param().pool() == PoolingParameter_PoolMethod_MAX) {
- return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
- } else {
- return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
- }
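As a direct consequence, whenever you use max pooling you always get Caffe's own .cu implementation and can never reach the cuDNN one. That explains why our max pooling timings never changed.
4. Impact analysis
But why did the Caffe authors not use cuDNN's max pooling? Looking it up in the NVIDIA cuDNN User Manual, we find: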
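To make the fallback easy to see, the selection logic can be condensed into one pure function. This is a sketch rather than Caffe code: the enums and SelectPoolingImpl are hypothetical, the cudnn_built flag stands in for the USE_CUDNN compile-time macro, and the padding and multiple-tops check is left out.

```cpp
#include <string>

// Hypothetical stand-ins for PoolingParameter_Engine and
// PoolingParameter_PoolMethod.
enum Engine { kDefault, kCaffe, kCudnn };
enum PoolMethod { kMax, kAve };

// Returns the name of the class the factory would instantiate.
std::string SelectPoolingImpl(Engine engine, PoolMethod method,
                              bool cudnn_built) {
  // DEFAULT resolves to cuDNN when it is compiled in, otherwise to Caffe.
  if (engine == kDefault) engine = cudnn_built ? kCudnn : kCaffe;
  if (engine == kCaffe) return "PoolingLayer";
  // From here on engine == kCudnn.
  if (!cudnn_built) return "error: unknown engine";
  // The pitfall: max pooling is routed back to Caffe's own layer.
  if (method == kMax) return "PoolingLayer";
  return "CuDNNPoolingLayer";
}
```

Under these assumptions, asking for the cuDNN engine with max pooling still yields PoolingLayer, while average pooling yields CuDNNPoolingLayer.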
4.144. cudnnPoolingForward
- cudnnStatus_t cudnnPoolingForward(
- cudnnHandle_t handle,
- const cudnnPoolingDescriptor_t poolingDesc,
- const void *alpha,
- const cudnnTensorDescriptor_t xDesc,
- const void *x,
- const void *beta,
- const cudnnTensorDescriptor_t yDesc,
- void *y)
This function computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.
Parameters
- handle: Input. Handle to a previously created cuDNN context.
- poolingDesc: Input. Handle to a previously initialized pooling descriptor.
- alpha, beta: Input. Pointers to scaling factors (in host memory) used to blend the computation result with prior value in the output layer as follows: dstValue = alpha[0]*result + beta[0]*priorDstValue.
- xDesc: Input. Handle to the previously initialized input tensor descriptor. Must be of type FLOAT, or DOUBLE, or HALF, or INT8. See cudnnDataType_t.
- x: Input. Data pointer to GPU memory associated with the tensor descriptor xDesc.
- yDesc: Input. Handle to the previously initialized output tensor descriptor. Must be of type FLOAT, or DOUBLE, or HALF, or INT8. See cudnnDataType_t.
- y: Output. Data pointer to GPU memory associated with the output tensor descriptor yDesc.
The possible error values returned by this function and their meanings are listed below.
Returns
- CUDNN_STATUS_SUCCESS: The function launched successfully.
- CUDNN_STATUS_BAD_PARAM: At least one of the following conditions is met: the dimensions n, c of the input tensor and output tensors differ; the datatype of the input tensor and output tensors differs.
- CUDNN_STATUS_NOT_SUPPORTED: The function does not support the provided configuration, for example when the wStride of the input tensor or output tensor is not 1.
- CUDNN_STATUS_EXECUTION_FAILED: The function failed to launch on the GPU.
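What is odd here is that the call accepts only two data buffers, the input x and the output y, so there is no way to update the mask. I do not quite understand the cuDNN designers' reasoning, but as things stand it looks like cuDNN's PoolingForward cannot be used here for now if correctness is to be preserved.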
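To illustrate what the mask is for, here is a minimal CPU sketch of 1-D max pooling that records, for each output, the index of the input that won. It is an illustration under simplifying assumptions (non-overlapping windows, stride equal to the window size), not Caffe's or cuDNN's implementation, and MaxPoolForward1D and MaxPoolBackward1D are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Forward pass: writes the window maxima to y and, for each output
// element, the index of the winning input element to mask.
void MaxPoolForward1D(const std::vector<float>& x, std::size_t window,
                      std::vector<float>* y, std::vector<std::size_t>* mask) {
  y->clear();
  mask->clear();
  if (window == 0) return;  // nothing sensible to pool over
  for (std::size_t start = 0; start + window <= x.size(); start += window) {
    std::size_t best = start;
    for (std::size_t i = start + 1; i < start + window; ++i) {
      if (x[i] > x[best]) best = i;
    }
    y->push_back(x[best]);
    mask->push_back(best);
  }
}

// Backward pass: routes each output gradient to the recorded input index.
std::vector<float> MaxPoolBackward1D(const std::vector<float>& dy,
                                     const std::vector<std::size_t>& mask,
                                     std::size_t input_size) {
  std::vector<float> dx(input_size, 0.0f);
  for (std::size_t j = 0; j < dy.size(); ++j) dx[mask[j]] += dy[j];
  return dx;
}
```

In this sketch the backward pass never reads x or y at all, only the mask. A forward call that exposes just x and y gives the caller no such index buffer to fill.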