[原创]Faster R-CNN论文翻译
Faster R-CNN论文翻译
Faster R-CNN是互怼完了的好基友一起合作出来的巅峰之作,本文翻译的比例比较小,主要因为本paper是前述paper的一个简单改进,方法清晰,想法自然。什么想法?就是把那个一直明明应该换掉却一直被几位大神挤牙膏般地拖着不换的选择性搜索算法,即区域推荐算法。在Fast R-CNN的基础上将区域推荐换成了神经网络,而且这个神经网络和Fast R-CNN的卷积网络一起复用,大大缩短了计算时间。同时mAP又上了一个台阶,我早就说过了,他们一定是在挤牙膏。
Faster R-CNN: Towards Real-Time Object
Detection with Region Proposal Networks
摘要
1. 介绍
2 相关工作
3 FASTER R-CNN
3.1 区域推荐网络
3.1.1 锚点
平移不变性锚点
多尺度锚点作为回归参照物
3.1.2 损失函数
3.1.3 训练RPNs
3.2 RPN and Fast R-CNN之间共享特征
3.3 实现细节
4 EXPERIMENTS
5 CONCLUSION
参考文献
[2] R. Girshick, “Fast R-CNN,” in IEEE International Conference onComputer Vision (ICCV), 2015.
[3] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in InternationalConference on Learning Representations (ICLR), 2015.
[4] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A. W. Smeulders, “Selective search for object recognition,” InternationalJournal of Computer Vision (IJCV), 2013.
[5] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich featurehierarchies for accurate object detection and semantic segmentation,” in IEEE Conference on Computer Vision and PatternRecognition (CVPR), 2014.
[6] C. L. Zitnick and P. Dollar, “Edge boxes: Locating object ´proposals from edges,” in European Conference on ComputerVision (ECCV), 2014.
[7] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutionalnetworks for semantic segmentation,” in IEEE Conference onComputer Vision and Pattern Recognition (CVPR), 2015.
[8] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained partbased models,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2010.
[9] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus,and Y. LeCun, “Overfeat: Integrated recognition, localizationand detection using convolutional networks,” in InternationalConference on Learning Representations (ICLR), 2014.
[10] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” inNeural Information Processing Systems (NIPS), 2015.
[11] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, andA. Zisserman, “The PASCAL Visual Object Classes Challenge2007 (VOC2007) Results,” 2007.
[12] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and C. L. Zitnick, “Microsoft COCO: Com- ´mon Objects in Context,” in European Conference on ComputerVision (ECCV), 2014.
[13] S. Song and J. Xiao, “Deep sliding shapes for amodal 3d objectdetection in rgb-d images,” arXiv:1511.02300, 2015.
[14] J. Zhu, X. Chen, and A. L. Yuille, “DeePM: A deep part-basedmodel for object detection and semantic part localization,”arXiv:1511.07131, 2015.
[15] J. Dai, K. He, and J. Sun, “Instance-aware semantic segmentation via multi-task network cascades,” arXiv:1512.04412, 2015.
[16] J. Johnson, A. Karpathy, and L. Fei-Fei, “Densecap: Fullyconvolutional localization networks for dense captioning,”arXiv:1511.07571, 2015.
[17] D. Kislyuk, Y. Liu, D. Liu, E. Tzeng, and Y. Jing, “Human curation and convnets: Powering item-to-item recommendationson pinterest,” arXiv:1511.04003, 2015.
[18] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learningfor image recognition,” arXiv:1512.03385, 2015.
[19] J. Hosang, R. Benenson, and B. Schiele, “How good are detection proposals, really?” in British Machine Vision Conference(BMVC), 2014.
[20] J. Hosang, R. Benenson, P. Dollar, and B. Schiele, “What makes ´for effective detection proposals?” IEEE Transactions on PatternAnalysis and Machine Intelligence (TPAMI), 2015.
[21] N. Chavali, H. Agrawal, A. Mahendru, and D. Batra,“Object-Proposal Evaluation Protocol is ’Gameable’,” arXiv:1505.05836, 2015.
[22] J. Carreira and C. Sminchisescu, “CPMC: Automatic object segmentation using constrained parametric min-cuts,”IEEE Transactions on Pattern Analysis and Machine Intelligence(TPAMI), 2012.
[23] P. Arbelaez, J. Pont-Tuset, J. T. Barron, F. Marques, and J. Malik, ´“Multiscale combinatorial grouping,” in IEEE Conference onComputer Vision and Pattern Recognition (CVPR), 2014.
[24] B. Alexe, T. Deselaers, and V. Ferrari, “Measuring the objectness of image windows,” IEEE Transactions on Pattern Analysisand Machine Intelligence (TPAMI), 2012.
[25] C. Szegedy, A. Toshev, and D. Erhan, “Deep neural networksfor object detection,” in Neural Information Processing Systems(NIPS), 2013.
[26] D. Erhan, C. Szegedy, A. Toshev, and D. Anguelov, “Scalableobject detection using deep neural networks,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[27] C. Szegedy, S. Reed, D. Erhan, and D. Anguelov, “Scalable,high-quality object detection,” arXiv:1412.1441 (v1), 2015.
[28] P. O. Pinheiro, R. Collobert, and P. Dollar, “Learning tosegment object candidates,” in Neural Information ProcessingSystems (NIPS), 2015.
[29] J. Dai, K. He, and J. Sun, “Convolutional feature maskingfor joint object and stuff segmentation,” in IEEE Conference onComputer Vision and Pattern Recognition (CVPR), 2015.
[30] S. Ren, K. He, R. Girshick, X. Zhang, and J. Sun, “Object detection networks on convolutional feature maps,”arXiv:1504.06066, 2015.
[31] J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, andY. Bengio, “Attention-based models for speech recognition,”in Neural Information Processing Systems (NIPS), 2015.
[32] M. D. Zeiler and R. Fergus, “Visualizing and understandingconvolutional neural networks,” in European Conference onComputer Vision (ECCV), 2014.
[33] V. Nair and G. E. Hinton, “Rectified linear units improverestricted boltzmann machines,” in International Conference onMachine Learning (ICML), 2010.
[34] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov,D. Erhan, and A. Rabinovich, “Going deeper with convolutions,” in IEEE Conference on Computer Vision and PatternRecognition (CVPR), 2015.
[35] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard,W. Hubbard, and L. D. Jackel, “Backpropagation applied tohandwritten zip code recognition,” Neural computation, 1989.
[36] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma,Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg,and L. Fei-Fei, “ImageNet Large Scale Visual RecognitionChallenge,” in International Journal of Computer Vision (IJCV),2015.
[37] A. Krizhevsky, I. Sutskever, and G. Hinton, “Imagenet classification with deep convolutional neural networks,” in NeuralInformation Processing Systems (NIPS), 2012.
[38] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutionalarchitecture for fast feature embedding,” arXiv:1408.5093, 2014.
[39] K. Lenc and A. Vedaldi, “R-CNN minus R,” in British MachineVision Conference (BMVC), 2015.
[原创]Faster R-CNN论文翻译的更多相关文章
- k[原创]Faster R-CNN论文翻译
物体检测论文翻译系列: 建议从前往后看,这些论文之间具有明显的延续性和递进性. R-CNN SPP-net Fast R-CNN Faster R-CNN Faster R-CNN论文翻译 原文地 ...
- 深度学习论文翻译解析(四):Faster R-CNN: Down the rabbit hole of modern object detection
论文标题:Faster R-CNN: Down the rabbit hole of modern object detection 论文作者:Zhi Tian , Weilin Huang, Ton ...
- 深度学习论文翻译解析(十三):Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
论文标题:Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks 标题翻译:基于区域提议(Regi ...
- 深度学习论文翻译解析(三):Detecting Text in Natural Image with Connectionist Text Proposal Network
论文标题:Detecting Text in Natural Image with Connectionist Text Proposal Network 论文作者:Zhi Tian , Weilin ...
- 深度学习论文翻译解析(十六):Squeeze-and-Excitation Networks
论文标题:Squeeze-and-Excitation Networks 论文作者:Jie Hu Li Shen Gang Sun 论文地址:https://openaccess.thecvf.co ...
- R-CNN论文翻译
R-CNN论文翻译 Rich feature hierarchies for accurate object detection and semantic segmentation 用于精确物体定位和 ...
- SSD: Single Shot MultiBoxDetector英文论文翻译
SSD英文论文翻译 SSD: Single Shot MultiBoxDetector 2017.12.08 摘要:我们提出了一种使用单个深层神经网络检测图像中对象的方法.我们的方法,名为SSD ...
- 深度学习论文翻译解析(二):An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
论文标题:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application ...
- 论文翻译——R-CNN(目标检测开山之作)
R-CNN论文翻译 <Rich feature hierarchies for accurate object detection and semantic segmentation> 用 ...
随机推荐
- Linux JDK配置
第一步:下载jdk-7-linux-i586.tar.gz wget -c http://download.oracle.com/otn-pub/java/jdk/7/jdk-7-linux-i586 ...
- Hive简记
在大数据工作中难免遇到数据仓库(OLAP)架构,以及通过Hive SQL简化分布式计算的场景.所以想通过这篇博客对Hive使用有一个大致总结,希望道友多多指教! 摘要: 1.Hive安装 2.Hive ...
- POJ1032 Parliament(数论)
New convocation of The Fool Land's Parliament consists of N delegates. According to the present regu ...
- linux下c语言的多线程编程
我们在写linux的服务的时候,经常会用到linux的多线程技术以提高程序性能 多线程的一些小知识: 一个应用程序可以启动若干个线程. 线程(Lightweight Process,LWP),是程序执 ...
- git仓库管理笔录
Git是目前世界上最先进的分布式版本控制系统(没有之一). 小明做了个个人博客,放到了Git 仓库里面.第二天换了台电脑,只需要 git clone 克隆一下git 远程仓库的代码到本地即可.然后他 ...
- 读Zepto源码之Stack模块
Stack 模块为 Zepto 添加了 addSelf 和 end 方法. 读 Zepto 源码系列文章已经放到了github上,欢迎star: reading-zepto 源码版本 本文阅读的源码为 ...
- 部署LAMP+NFS实现双Web服务器负载均衡
一.需求分析 1.前端需支持更大的访问量,单台Web服务器已无法满足需求了,则需扩容Web服务器: 2.虽然动态内容可交由后端的PHP服务器执行,但静态页面还需要Web服务器自己解析,那是否意味着多台 ...
- TypeScript中的怪语法
TypeScript中的怪语法 如何处理undefined 和 null undefined的含义是:一个变量没有初始化. null的含义是:一个变量的值是空. undefined 和 null 的最 ...
- 关于KVO导读
入门篇 KVO是什么? Key-value observing is a mechanism that allows objects to be notified of changes to spec ...
- iOS音频格式PCM转G711u(或G711a-law)
请尊重作者劳动成果,如需转载本博客文章请注明出处!谢谢合作! inputData是PCM的实时数据,可以通过转码,获取到最后导出的G711u数据(sendData) NSUInteger datal ...