Paper notes: MobileNets (Efficient Convolutional Neural Networks for Mobile Vision Applications)

Paper: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

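MobileNet, proposed by Google, is built around a cheaper way of computing convolutions (depthwise separable convolution) that aims to speed up the convolution itself.
To shrink the model further, the paper also proposes two fairly brute-force reduction methods.

(1) Cut channels directly. The paper thins every layer by the same factor rather than choosing which channels to keep, which feels too crude to me: cut too many and accuracy is bound to suffer.

(2) Reduce the resolution of the input image, i.e. feed the network smaller inputs.

These notes focus on the new convolution. If compression is the goal, I would look for a different approach.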

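1. Full convolution vs. depthwise separable convolution

1.1 Full convolution

M is the number of input channels, N is the number of output channels, and Dk is the kernel size.
Every output channel depends on all of the input channels: a single output channel is the sum of M 2-D convolutions, one Dk x Dk kernel per input channel.
This is exactly where the two differ! In the depthwise step of a depthwise separable convolution, each output channel depends on only one input channel.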

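1.2 Depthwise separable convolution

With M input channels, the depthwise step also produces M output channels. Each output channel is the result of one kernel convolved with one input channel, so it no longer depends on all of the inputs. This per-channel split is what "depthwise" refers to (separation along the depth, i.e. across channels); "separable" means the full convolution is factored into this depthwise step plus a 1x1 step.
The depthwise output has only M channels, though, and we want N. This is where the 1x1 convolution comes in. It is a full convolution, so each of its output channels depends on all of its input channels, as a weighted sum of them. The 1x1 convolution therefore does the combining.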
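The two steps can be sketched in plain Python on nested lists. This is only an illustration under simplifying assumptions (stride 1, no padding, no bias, no batch norm or ReLU between the steps); the function names and the toy numbers are made up for this note and are not taken from the paper or from any of the implementations listed later.

```python
def conv2d_single(channel, kernel):
    """Stride-1, no-padding 2-D correlation of one channel with one kernel."""
    k = len(kernel)
    out_h = len(channel) - k + 1
    out_w = len(channel[0]) - k + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def depthwise(x, dw_kernels):
    """Depthwise step: channel m of x is filtered only by dw_kernels[m]."""
    assert len(x) == len(dw_kernels)
    return [conv2d_single(ch, k) for ch, k in zip(x, dw_kernels)]


def pointwise(x, pw_weights):
    """1x1 step: output channel n is a weighted sum of all input channels."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wt * x[m][i][j] for m, wt in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in pw_weights
    ]


def depthwise_separable(x, dw_kernels, pw_weights):
    return pointwise(depthwise(x, dw_kernels), pw_weights)


# Toy input: M = 2 channels of size 3x3, Dk = 2, N = 3 output channels.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_kernels = [
    [[1, 0], [0, 1]],  # applied to channel 0 only
    [[1, 1], [1, 1]],  # applied to channel 1 only
]
pw_weights = [[1, 0], [0, 1], [1, 1]]  # N rows, M weights per row

y = depthwise_separable(x, dw_kernels, pw_weights)
print(len(y), len(y[0]), len(y[0][0]))  # channels, height, width
```

The depthwise step keeps the channel count at M = 2, and only the 1x1 step changes it to N = 3. The third output channel, with weights [1, 1], is simply the sum of the two depthwise outputs.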

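2. Computation cost comparison

Once the difference between the two is clear, the difference in computation cost is easy to work out.

Full convolution:

$$D_k \cdot D_k \cdot M \cdot N \cdot D_f \cdot D_f$$

Dk is the kernel size, M is the number of input channels (the number of input feature maps), and N is the number of output channels. Df is the feature map size, i.e. its width and height. The formula confirms what was said above: every output channel depends on all of the input channels.

Depthwise separable convolution:

$$D_k \cdot D_k \cdot M \cdot D_f \cdot D_f + M \cdot N \cdot D_f \cdot D_f$$

The left term of the sum is the cost of the depthwise step: the output has M channels, and each output channel depends on only one input channel.
The right term is the 1x1 pointwise convolution: each output channel depends on all M inputs (a weighted sum of the M inputs).

Reduction in computation:

$$\frac{D_k \cdot D_k \cdot M \cdot D_f \cdot D_f + M \cdot N \cdot D_f \cdot D_f}{D_k \cdot D_k \cdot M \cdot N \cdot D_f \cdot D_f} = \frac{1}{N} + \frac{1}{D_k^2}$$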
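The cost comparison is easy to check numerically. The sketch below counts multiply-adds for one layer using the symbols defined above; the function names and the layer size (Dk = 3, M = N = 512, Df = 14) are arbitrary choices for illustration, not figures quoted from the paper.

```python
def full_conv_cost(Dk, M, N, Df):
    """Multiply-adds of a full convolution: every output sees all M inputs."""
    return Dk * Dk * M * N * Df * Df


def depthwise_separable_cost(Dk, M, N, Df):
    """Multiply-adds of the depthwise step plus the 1x1 pointwise step."""
    depthwise = Dk * Dk * M * Df * Df
    pointwise = M * N * Df * Df
    return depthwise + pointwise


Dk, M, N, Df = 3, 512, 512, 14  # arbitrary example layer
full = full_conv_cost(Dk, M, N, Df)
sep = depthwise_separable_cost(Dk, M, N, Df)
ratio = sep / full

print(full, sep)
print(ratio, 1 / N + 1 / Dk ** 2)  # the two values should agree
```

With a 3x3 kernel and a large N, the 1/N term is negligible and the ratio is close to 1/9, so almost all of the remaining cost sits in the 1x1 pointwise convolution.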

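3. Model compression

With the width multiplier α:

$$D_k \cdot D_k \cdot \alpha M \cdot D_f \cdot D_f + \alpha M \cdot \alpha N \cdot D_f \cdot D_f$$

The formula shows the channel counts being scaled directly: both the M input channels and the N output channels of a layer are multiplied by α. This is a uniform reduction of every layer, not a random sampling of channels.

With the resolution multiplier ρ added:

$$D_k \cdot D_k \cdot \alpha M \cdot \rho D_f \cdot \rho D_f + \alpha M \cdot \alpha N \cdot \rho D_f \cdot \rho D_f$$

Here the channels are reduced and, in addition, the resolution of the input image (and therefore of every feature map) is reduced.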
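To get a feel for how much the two multipliers save, the cost formula can be evaluated directly. This is a rough sketch: the function name is made up, the layer size is an arbitrary example, and truncating with int() is an assumption of this note (real implementations may round channel counts differently).

```python
def mobilenet_layer_cost(Dk, M, N, Df, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer with both multipliers."""
    m = int(alpha * M)   # thinned input channels (truncation is an assumption)
    n = int(alpha * N)   # thinned output channels
    df = int(rho * Df)   # reduced feature map size
    return Dk * Dk * m * df * df + m * n * df * df


base = mobilenet_layer_cost(3, 512, 512, 14)
thin = mobilenet_layer_cost(3, 512, 512, 14, alpha=0.5)
small = mobilenet_layer_cost(3, 512, 512, 14, alpha=0.5, rho=0.5)

print(base, thin, small)
print(base / thin, thin / small)
```

The dominant pointwise term scales with α squared, so halving the width cuts the cost by nearly 4x. Every term scales with ρ squared, so halving the resolution cuts it by exactly 4x in this example.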

4. Experiments

4.1 Parameter count comparison
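For a single layer, the parameter counts of the two kinds of convolution can be compared in a few lines. This is only a sketch: the function names are made up, biases and batch-norm parameters are ignored, and the layer size is an arbitrary example rather than a row from the paper's tables.

```python
def full_conv_params(Dk, M, N):
    """Weights of a full convolution: one Dk x Dk x M kernel per output channel."""
    return Dk * Dk * M * N


def depthwise_separable_params(Dk, M, N):
    """Weights of M depthwise kernels plus an M x N matrix of 1x1 weights."""
    return Dk * Dk * M + M * N


Dk, M, N = 3, 256, 256  # arbitrary example layer
full_p = full_conv_params(Dk, M, N)
sep_p = depthwise_separable_params(Dk, M, N)
print(full_p, sep_p, full_p / sep_p)
```

The parameter ratio is again 1/N + 1/Dk^2, the same as for the computation cost, because the feature map size Df does not appear in either parameter count.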

4.2 Experimental results

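5. Implementations

TensorFlow implementation: https://github.com/tensorflow/models/blob/master/slim/nets/mobilenet_v1.md
Caffe implementation (a trick): https://github.com/shicai/MobileNet-Caffe
(It implements the depthwise operation through Caffe's group parameter. Because of how this is implemented and because cuda/cudnn support for it is poor, training is very slow. Forward inference on a CPU reportedly takes about 70% of the time of GoogLeNet. This figure comes from a blog post; I have not verified it myself.)
PyTorch implementation: https://github.com/marvis/pytorch-mobilenet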
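The group trick mentioned in the Caffe note works because a grouped convolution restricts each output channel to the input channels of its own group. The sketch below models only that channel wiring; the function name is hypothetical and nothing here calls Caffe or any other framework.

```python
def grouped_connectivity(in_channels, out_channels, groups):
    """For each output channel, list the input channels a grouped conv reads."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    in_per_group = in_channels // groups
    out_per_group = out_channels // groups
    table = []
    for out_ch in range(out_channels):
        g = out_ch // out_per_group          # group this output belongs to
        start = g * in_per_group             # first input channel of that group
        table.append(list(range(start, start + in_per_group)))
    return table


print(grouped_connectivity(4, 4, 1))  # groups = 1: full convolution
print(grouped_connectivity(4, 4, 2))  # groups = 2: two independent halves
print(grouped_connectivity(4, 4, 4))  # groups = M: depthwise
```

When the number of groups equals the number of input and output channels, every output channel reads exactly one input channel, which is the depthwise step described in section 1.2.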

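6. Summary

In my practical experience, this new way of computing convolutions gives a fairly noticeable speedup with little impact on accuracy. As for the two reduction methods described in the paper, I think they are best used with caution.
There are already plenty of pruning methods available, so there is no need to compress a model in such a brute-force way.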
