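Notes on Reading Caffe's Convolution Layer Code

How the convolution is implemented:

im2col turns the image into a matrix, so the convolution becomes a matrix multiplication.
The multiplication itself is done by calling GEMM.
The two figures below are ones I found on Zhihu and am "borrowing" here; they really are good and help with understanding.

Parameter breakdown

Configuration parameters (read from the configuration file):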

kernel_h_ pad_h_ hole_h_ stride_h_

kernel_w_ pad_w_ hole_w_ stride_w_

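is_1x1_: true when the kernel is 1x1 with stride 1 and zero padding (upstream Caffe checks pad == 0 rather than 1; a hole value has no effect on a 1x1 kernel). When it is true, im2col is skipped.
Parameters related to the input (taken from bottom):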

num_

channels_

height_

width_
Parameters related to the convolution kernel (the first two come from the configuration file):

num_output_

group_

this->blobs_[0].reset(new Blob<Dtype>(num_output_, channels_ / group_, kernel_h_, kernel_w_));  // weights

this->blobs_[1].reset(new Blob<Dtype>(1, 1, 1, num_output_));  // bias

this->param_propagate_down_
Parameters related to the output (computed):

const int kernel_h_eff = kernel_h_ + (kernel_h_ - 1) * (hole_h_ - 1);

const int kernel_w_eff = kernel_w_ + (kernel_w_ - 1) * (hole_w_ - 1);

height_out_ = (height_ + 2 * pad_h_ - kernel_h_eff) / stride_h_ + 1;

width_out_ = (width_ + 2 * pad_w_ - kernel_w_eff) / stride_w_ + 1;
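
As a quick check of the formulas above, here is a minimal sketch that computes the effective kernel extent and the output length along one axis. The function names EffectiveKernel and ConvOutSize are made up for this illustration and do not exist in Caffe.

```cpp
// Hypothetical helpers for illustration only; not part of Caffe.

// Extent covered by a kernel whose taps are spaced `hole` pixels apart.
// With hole == 1 this is just the kernel size.
int EffectiveKernel(int kernel, int hole) {
  return kernel + (kernel - 1) * (hole - 1);
}

// Output length along one spatial axis. Integer division rounds down,
// exactly as in the height_out_ / width_out_ expressions above.
int ConvOutSize(int in, int kernel, int pad, int stride, int hole) {
  return (in + 2 * pad - EffectiveKernel(kernel, hole)) / stride + 1;
}
```

For example, a 3x3 kernel with hole 2 covers 5 pixels, so pad 2 and stride 1 keep a 224-pixel axis at 224.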
Parameters related to the matrix multiplication (computed):

M_ = num_output_ / group_;

K_ = channels_ * kernel_h_ * kernel_w_ / group_;

N_ = height_out_ * width_out_;

col_buffer_.Reshape(1, channels_*kernel_h_*kernel_w_, height_out_, width_out_);  // used only when is_1x1_ is false

bias_multiplier_.Reshape(1, 1, 1, N_);  // filled with ones

Input size: (num_, channels_, height_, width_)

Output size: (num_, num_output_, height_out_, width_out_)
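
To make the GEMM shapes concrete, the sketch below computes M_, K_ and N_ for one group. The struct GemmDims and the function ComputeGemmDims are hypothetical names. The three offsets are my assumption about how weight_offset, col_offset and top_offset (used in the GEMM call under Function 2) are defined, namely as the size of one group's block in each buffer; these notes do not show their definitions.

```cpp
// Hypothetical illustration; names are not taken from Caffe.
struct GemmDims {
  int M, K, N;        // per-group GEMM shape: (M x K) * (K x N) = (M x N)
  int weight_offset;  // assumed: elements per group in the weight blob
  int col_offset;     // assumed: elements per group in col_buff
  int top_offset;     // assumed: elements per group in one output image
};

GemmDims ComputeGemmDims(int num_output, int channels, int kernel_h,
                         int kernel_w, int height_out, int width_out,
                         int group) {
  GemmDims d;
  d.M = num_output / group;                      // output channels per group
  d.K = channels * kernel_h * kernel_w / group;  // unrolled patch length per group
  d.N = height_out * width_out;                  // number of output positions
  d.weight_offset = d.M * d.K;
  d.col_offset = d.K * d.N;
  d.top_offset = d.M * d.N;
  return d;
}
```

With num_output 256, 96 input channels, a 5x5 kernel, a 27x27 output and group 2, each group multiplies a 128 x 1200 matrix by a 1200 x 729 matrix.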

Key functions

Function 1:

im2col_cpu(bottom_data + bottom[i]->offset(n),

1, channels_, height_, width_,

kernel_h_, kernel_w_, pad_h_, pad_w_,

stride_h_, stride_w_, hole_h_, hole_w_,

col_buff);

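Purpose: according to the configuration parameters, expand one input feature map of size (1, channels_, height_, width_) into a matrix of size (1, channels_*kernel_h_*kernel_w_, height_out_, width_out_).

How it works:

Internally there are two sets of indices.

One indexes the input image: c_im (channels), h_im (height), w_im (width).

The other indexes the output col_buff: c (channels_col), h (height_col), w (width_col).

The loop variables run over the dimensions of the output col_buff, and for each output position the corresponding position in the input image is computed (col2im works the same way as im2col, with the two coordinate systems swapped). The index arithmetic, pulled out of the loops, is listed here; read it alongside the source and it is easy to follow: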

    const int kernel_h_eff = kernel_h + (kernel_h - 1) * (hole_h - 1);

    const int kernel_w_eff = kernel_w + (kernel_w - 1) * (hole_w - 1);

    int height_col = (height + 2 * pad_h - kernel_h_eff) / stride_h + 1;

    int width_col = (width + 2 * pad_w - kernel_w_eff) / stride_w + 1;

    int channels_col = channels * kernel_h * kernel_w;

    int w_offset = (c % kernel_w)  * hole_w;

    int h_offset = ((c / kernel_w) % kernel_h) * hole_h;

    int c_im = c / kernel_w / kernel_h;

    const int h_im = h * stride_h + h_offset - pad_h;

    const int w_im = w * stride_w + w_offset - pad_w;
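
Putting the index arithmetic above back into its loops gives roughly the following. This is a stand-alone sketch and not Caffe's exact code: the name Im2ColSketch and the use of std::vector are my own, and I assume positions that fall into the padding are written as zero.

```cpp
#include <vector>

// Hypothetical stand-alone im2col for a single image stored as (channels, height, width).
// Returns a buffer laid out as (channels_col, height_col, width_col).
std::vector<float> Im2ColSketch(const std::vector<float>& im,
                                int channels, int height, int width,
                                int kernel_h, int kernel_w,
                                int pad_h, int pad_w,
                                int stride_h, int stride_w,
                                int hole_h, int hole_w) {
  const int kernel_h_eff = kernel_h + (kernel_h - 1) * (hole_h - 1);
  const int kernel_w_eff = kernel_w + (kernel_w - 1) * (hole_w - 1);
  const int height_col = (height + 2 * pad_h - kernel_h_eff) / stride_h + 1;
  const int width_col = (width + 2 * pad_w - kernel_w_eff) / stride_w + 1;
  const int channels_col = channels * kernel_h * kernel_w;

  std::vector<float> col(channels_col * height_col * width_col, 0.f);
  for (int c = 0; c < channels_col; ++c) {
    // Decompose the col row index into (input channel, kernel row, kernel column).
    const int w_offset = (c % kernel_w) * hole_w;
    const int h_offset = ((c / kernel_w) % kernel_h) * hole_h;
    const int c_im = c / kernel_w / kernel_h;
    for (int h = 0; h < height_col; ++h) {
      for (int w = 0; w < width_col; ++w) {
        const int h_im = h * stride_h + h_offset - pad_h;
        const int w_im = w * stride_w + w_offset - pad_w;
        // Assumption: taps that land in the padding stay zero.
        if (h_im >= 0 && h_im < height && w_im >= 0 && w_im < width) {
          col[(c * height_col + h) * width_col + w] =
              im[(c_im * height + h_im) * width + w_im];
        }
      }
    }
  }
  return col;
}
```

For a single-channel 3x3 image holding 1 to 9 and a 2x2 kernel with stride 1, no padding and hole 1, the result has 4 rows of 4 values; the first row is 1, 2, 4, 5, which is the top-left tap of every 2x2 window.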

Function 2:

caffe_cpu_gemm(CblasNoTrans, CblasNoTrans, M_, N_, K_,

(Dtype)1., weight + weight_offset * g, col_buff + col_offset * g,

(Dtype)0., top_data + top[i]->offset(n) + top_offset * g);

Purpose:

Treat the (num_output_/group_, channels_/group_, kernel_h_, kernel_w_) kernel as a (num_output_/group_, channels_*kernel_h_*kernel_w_/group_) matrix A, of size M_ x K_.

Treat the (1, channels_*kernel_h_*kernel_w_, height_out_, width_out_) col_buff as group_ matrices B of size (channels_*kernel_h_*kernel_w_/group_, height_out_*width_out_), i.e. K_ x N_.

Multiplying the two and then adding the bias term gives the convolution result.

About caffe_cpu_gemm:

For float it is a thin wrapper around cblas_sgemm.
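
The multiply-then-add-bias step can be sketched for one image and one group as follows. GemmSketch is a naive stand-in for the BLAS call, and ConvByGemmSketch is a made-up name. Adding the bias as a second GEMM against a row of ones is my understanding of what bias_multiplier_ is for, so treat that part as an assumption.

```cpp
#include <vector>

// Naive row-major GEMM: C = alpha * A (M x K) * B (K x N) + beta * C (M x N).
// Like BLAS, C is not read when beta is zero.
void GemmSketch(int M, int N, int K, float alpha, const float* A,
                const float* B, float beta, float* C) {
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k) acc += A[m * K + k] * B[k * N + n];
      const float prev = (beta == 0.f) ? 0.f : beta * C[m * N + n];
      C[m * N + n] = alpha * acc + prev;
    }
  }
}

// Hypothetical helper: top = weight (M x K) * col (K x N) + bias (M x 1) * ones (1 x N).
std::vector<float> ConvByGemmSketch(const std::vector<float>& weight,
                                    const std::vector<float>& col,
                                    const std::vector<float>& bias,
                                    int M, int N, int K) {
  std::vector<float> top(M * N);
  // First GEMM overwrites top (beta = 0), as in the call shown above.
  GemmSketch(M, N, K, 1.f, weight.data(), col.data(), 0.f, top.data());
  // Second GEMM accumulates (beta = 1): each output channel's bias is
  // broadcast across all N output positions via the vector of ones.
  std::vector<float> ones(N, 1.f);
  GemmSketch(M, N, 1, 1.f, bias.data(), ones.data(), 1.f, top.data());
  return top;
}
```

With a single all-ones 2x2 filter, bias 0.5, and the col matrix of a 3x3 image holding 1 to 9, the four outputs are the window sums plus 0.5: 12.5, 16.5, 24.5, 28.5.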

void cblas_sgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA,

const enum CBLAS_TRANSPOSE TransB, const int M, const int N,

const int K, const float alpha, const float *A,

const int lda, const float *B, const int ldb,

const float beta, float *C, const int ldc)

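The result is:

C = alpha*op( A )*op( B ) + beta*C

const enum CBLAS_ORDER Order: the storage layout. CBLAS functions take both one- and two-dimensional data as flat one-dimensional arrays, so they have to be told whether the data is row-major or column-major. C arrays are row-major; Fortran arrays are column-major.

If you work in row-major order, pass CblasRowMajor; for column-major order, pass CblasColMajor.

const int M: rows of op(A) and of C

const int N: columns of op(B) and of C

const int K: columns of op(A) and rows of op(B)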
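
The remaining arguments (TransA, TransB, lda, ldb, ldc) are easiest to see in a reference loop. The sketch below mimics the row-major semantics of cblas_sgemm without calling BLAS; the enum Op and the function RowMajorGemmRef are hypothetical stand-ins written for this note, and real BLAS implementations are of course far more optimized.

```cpp
// Hypothetical stand-in for CBLAS_TRANSPOSE.
enum class Op { NoTrans, Trans };

// Reference semantics of a row-major sgemm:
//   C = alpha * op(A) * op(B) + beta * C
// op(A) is M x K, op(B) is K x N, C is M x N.
// lda / ldb / ldc are the strides between consecutive rows of the
// arrays as they are stored (before any transpose is applied).
void RowMajorGemmRef(Op trans_a, Op trans_b, int M, int N, int K,
                     float alpha, const float* A, int lda,
                     const float* B, int ldb,
                     float beta, float* C, int ldc) {
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k) {
        const float a = (trans_a == Op::NoTrans) ? A[m * lda + k] : A[k * lda + m];
        const float b = (trans_b == Op::NoTrans) ? B[k * ldb + n] : B[n * ldb + k];
        acc += a * b;
      }
      // C is not read when beta is zero.
      const float prev = (beta == 0.f) ? 0.f : beta * C[m * ldc + n];
      C[m * ldc + n] = alpha * acc + prev;
    }
  }
}
```

For densely packed row-major data with no transposes, lda is K, ldb is N and ldc is N. If B is instead stored transposed (N x K), pass Op::Trans with ldb equal to K and the result is the same.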
