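Multi-character handwriting recognition with an SDNN (space displacement neural network)
Recognizing a single handwritten character is easy to implement once you have seen the MNIST convolutional neural network example. But how do you recognize several characters at once, as in the figure below?
What LeCun uses is an SDNN (space displacement neural network). What on earth is that?
After some searching, it turns out to be a sliding window + image pyramid + NMS. The 2015 Yahoo paper "Multi-view Face Detection Using Deep Convolutional Neural Networks" uses the same approach.
Reference page: https://www.quora.com/What-is-a-space-displacement-neural-network-SDNN
Below are the answers from two people in the know: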
A neural network that is slided as a detector across all the possible locations in the image. You have a network with an input layer of size NxN pixels, Then, you have an image with size MxM pixels, with M>N. The objects that you want to detect are somewhere in the image but you do not know where. Thus, you sweep your neural network all over the image. At the first position, in the top-left corner, you have certain classification scores for the objects that you want to detect, and you update your score map at that position. Then, you apply your NN on a position shifted of 1 or few pixels horizontally, and you update the score map for that position as well. This process continue until all the image is processed and all the score map completed.
The score map represents a detection map of your objects. A mechanism of non-maxima suppression have to be implemented in order to avoid multiple matches of the same object.
It avoids you to use segmentation. However, also in this case there is not free lunch. For making it scale invariant, you need to create a scale space of your input image. This requires to perform a number of classification on the order of ten thousands for few scale in 1MP image. Even if you can reuse a great part of the computation for convolutional layers for nearby classifications, you have to recompute the fully connected layers all the time, making the process painfully slow.
That is why people started to research in object proposal techniques. Maybe one day enough computational power may let us not think about these problems.
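Paraphrased:
A neural network is slid across every possible location in the image, like a detector. Suppose the network takes an N×N-pixel input and the image is M×M pixels, with M > N. The objects you want to detect are somewhere in the image, but you do not know where, so you sweep the network over the whole image. At the first position, the top-left corner, the network produces classification scores for the objects of interest, and you record them in the score map at that position. You then shift the network by one or a few pixels horizontally and fill in the score map at the new position. This continues until the whole image has been processed and the score map is complete.
The score map serves as a detection map of the objects. Non-maximum suppression (NMS) is needed so that the same object is not matched multiple times.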
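The scan-then-suppress procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: `mean_brightness` is a toy stand-in for a trained network, and the window size, stride, threshold, and suppression radius are made-up values, not anything taken from LeCun's SDNN or the Yahoo paper.

```python
import numpy as np

def score_map(image, window, stride, score_fn):
    """Apply score_fn to every window x window patch, moving by `stride` pixels."""
    h, w = image.shape
    rows = (h - window) // stride + 1
    cols = (w - window) // stride + 1
    scores = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            y, x = i * stride, j * stride
            scores[i, j] = score_fn(image[y:y + window, x:x + window])
    return scores

def non_max_suppression(scores, threshold, radius):
    """Greedy NMS on a score map: keep the best peak, blank out its neighbours."""
    scores = scores.copy()
    peaks = []
    while True:
        i, j = np.unravel_index(np.argmax(scores), scores.shape)
        if scores[i, j] < threshold:
            break
        peaks.append((int(i), int(j)))
        scores[max(0, i - radius):i + radius + 1,
               max(0, j - radius):j + radius + 1] = -np.inf
    return peaks

# Hypothetical classifier: mean brightness of the patch (assumption, not a real NN).
def mean_brightness(patch):
    return float(patch.mean())

image = np.zeros((20, 40))
image[6:14, 4:12] = 1.0    # first bright 8x8 "character"
image[6:14, 24:32] = 1.0   # second bright 8x8 "character"

scores = score_map(image, window=8, stride=2, score_fn=mean_brightness)
peaks = non_max_suppression(scores, threshold=0.9, radius=2)
print(peaks)  # [(3, 2), (3, 12)]
```

Each peak is an index into the score map; multiplying by the stride gives the top-left pixel of the detected window, here (6, 4) and (6, 24). Without the suppression step, neighbouring windows that partly cover the same blob would also show up as candidates once the threshold is lowered.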
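This approach spares you an explicit segmentation step, but there is no free lunch. To make detection scale invariant, you have to build a scale space (an image pyramid) of the input image, which means on the order of tens of thousands of classifications for just a few scales of a 1 MP image. Even though much of the convolutional-layer computation can be reused between nearby windows, the fully connected layers have to be recomputed every time, which makes the process painfully slow.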
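To get a feel for the "tens of thousands" figure, the number of classifier evaluations can be counted directly. The values below (a 1000×1000 image, a 32×32 network input, a stride of 4 pixels, and three pyramid scales) are assumptions chosen for illustration; the answer does not state them.

```python
def windows_per_side(side, window, stride):
    """Number of window positions along one axis (0 if the image is too small)."""
    if side < window:
        return 0
    return (side - window) // stride + 1

def count_windows(side, window, stride, scales):
    """Total classifier evaluations over a square image and its rescaled copies."""
    total = 0
    for s in scales:
        scaled_side = int(round(side * s))
        total += windows_per_side(scaled_side, window, stride) ** 2
    return total

# Assumed setup: ~1 MP square image, 32x32 input, stride 4, three scales.
total = count_windows(side=1000, window=32, stride=4, scales=(1.0, 0.5, 0.25))
print(total)  # 75998
```

The full-resolution level alone accounts for 243 × 243 = 59049 of those evaluations, so the stride matters far more than the number of coarse pyramid levels.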
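That is why people started researching object proposal techniques (the usual term is "region proposal"). Perhaps one day there will be enough computing power that we no longer need to think about these problems.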
Barath Lakshmanan, works at TVS Motor Company
CNNs extract features from the input and classify them. However, the input has to be size-normalized. In case of a single composite objects, each individual object within them have variable size and it is difficult to segment them. One way to recognize such objects is using a sliding window in the input layer as mentioned by Alessandro Ferrari.
It is to be noted that when convolution is performed, on the inputs which are overlapping regions in an image, same set of features gets extracted repeatedly. In order to avoid this redundant action, convolution is performed on the entire input image till the last conv layer. Finally the classifier is used as sliding window on the obtained feature map to produce the heat map.
Performance of such network should improve drastically as the redundancy is removed. This design is called as Space Displacement Neural Network (SDNN).
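Paraphrased:
CNNs extract features from the input and classify them, but the input has to be size-normalized. In a single composite object, the individual objects inside it vary in size and are hard to segment. One way to recognize such objects is the sliding window over the input layer described by Alessandro Ferrari.
Note that when convolution is run on overlapping regions of an image, the same set of features is extracted again and again. To avoid this redundancy, convolution is performed once over the entire input image, up to the last convolutional layer. The classifier is then slid over the resulting feature map to produce the heat map.
With the redundancy removed, the performance of such a network should improve drastically. This design is called a Space Displacement Neural Network (SDNN).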
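The equivalence behind this trick can be checked numerically. The sketch below is a toy setup, not LeCun's architecture: one 3×3 convolution with ReLU followed by a linear classifier, with random weights standing in for trained ones (all names and sizes are assumptions). It compares cropping every window and rerunning the convolution against convolving the whole image once and sliding the classifier over the feature map.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2-D cross-correlation with stride 1, as used in CNN layers."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(0)
image = rng.random((12, 30))                    # wide input, e.g. a line of characters
conv_kernel = rng.standard_normal((3, 3))       # hypothetical conv filter
window = 12                                     # classifier input is a 12x12 crop
feat = window - 3 + 1                           # feature map of one crop is 10x10
clf_weights = rng.standard_normal((feat, feat)) # hypothetical linear classifier

# Naive sliding window: crop each window, rerun the conv layer, then classify.
naive = np.array([
    np.sum(relu(conv2d_valid(image[:, x:x + window], conv_kernel)) * clf_weights)
    for x in range(image.shape[1] - window + 1)
])

# SDNN-style: convolve the whole image once, then slide the classifier
# over the shared feature map.
features = relu(conv2d_valid(image, conv_kernel))      # shape (10, 28)
shared = conv2d_valid(features, clf_weights).ravel()   # one score per position

print(np.allclose(naive, shared))  # True
```

Both paths give the same 19 scores, but the second computes each convolutional feature only once. In this toy case every layer has stride 1; with strided or pooling layers the classifier would land on a coarser grid of positions, with a step equal to the product of the strides.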