[Paper Reading] Perceptual Generative Adversarial Networks for Small Object Detection – CVPR 2017
This is a freshly published CVPR 2017 paper on small object detection. It uses a Perceptual GAN (PGAN) to improve performance on the small object detection task.
I haven't been working on object detection recently; someone recommended this paper, and the abstract seemed accessible, so I kept reading. In the end I still didn't fully understand everything, only the PGAN model itself. If my understanding is wrong anywhere, please point it out.
Back to the point: why is PGAN effective for small objects? The intuition is this: small objects are hard to detect while large objects are easy, so PGAN has the generator learn a mapping that transforms small-object features into large-object features, after which detection becomes easier. The key part of PGAN is its generator.
In a traditional GAN, the generator learns a mapping from random noise to images, i.e., it can turn a noise vector into a picture. PGAN's idea is to have the generator turn small-object representations into large-object-like ones, which benefits detection. Here is how the paper itself describes the generator:
- we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to “super-resolved” ones, achieving similar characteristics as large objects
- Perceptual Generative Adversarial Network (Perceptual GAN) model that improves small object detection through narrowing representation difference of small objects from the large ones.
- generator learns to transfer perceived poor representations of the small objects to super-resolved ones
- The Perceptual GAN aims to enhance the representations of small objects to be similar to those of large object
- the generator is a deep residual based feature generative model which transforms the original poor features of small objects to highly discriminative ones by introducing fine-grained details from lower-level layers, achieving “super-resolution” on the intermediate representations
- In a traditional GAN, the generator G learns to map data z from the noise distribution pz(z) to the distribution pdata(x) over data x; in PGAN, x and z are instead the representations for large objects and small objects respectively. "The generator network aims to generate super-resolved representations for small objects to improve detection accuracy."
- the generator as a deep residual learning network that augments the representations of small objects to super-resolved ones by introducing more fine-grained details absent from the small objects through residual learning
The paper repeats the same idea in different places: the generator learns a mapping that turns "fake" (small-object) representations into "real" (large-object) ones.
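The contrast between the two mappings can be sketched as toy function signatures (my own illustration with made-up shapes and fixed random weights, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Traditional GAN: the generator maps noise z ~ p_z(z) to an image x ~ p_data(x).
def gan_generator(z):
    # Toy stand-in: a fixed linear map from a 100-d noise vector to a 32x32 "image".
    W = rng.standard_normal((32 * 32, 100)) * 0.01
    return np.tanh(W @ z).reshape(32, 32)

# PGAN: the generator maps small-object features to large-object-like features,
# i.e., the input and output live in the same feature space (not noise -> image).
def pgan_generator(small_obj_features):
    # Toy stand-in: identity plus a residual correction (here a fixed linear map).
    W = rng.standard_normal((256, 256)) * 0.01
    residual = W @ small_obj_features
    return small_obj_features + residual  # "super-resolved" representation

z = rng.standard_normal(100)
small_feat = rng.standard_normal(256)
print(gan_generator(z).shape)            # (32, 32)
print(pgan_generator(small_feat).shape)  # (256,)
```

The point of the sketch is only the type signature: one map goes noise → image, the other stays within feature space.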
Let's look at what the generator looks like.
It has two parts; I didn't fully understand this bit, which is probably tied to the detection pipeline. The final output is the Super-Resolved Features, which closely resemble the large-object features. In the figure, the bottom-left is generated by G and the top-left is the real one:
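As a rough sketch of the residual idea quoted above (my own toy version with dense layers standing in for the paper's convolutional residual branch; all names and shapes are made up): a residual branch takes fine-grained lower-level features and predicts a correction that is added to the original poor small-object features.

```python
import numpy as np

rng = np.random.default_rng(42)

def residual_branch(low_level_features, W1, W2):
    # Two toy layers with ReLU, predicting a residual from the
    # fine-grained lower-level features (stand-in for conv layers).
    h = np.maximum(W1 @ low_level_features, 0.0)
    return W2 @ h

def super_resolve(small_obj_features, low_level_features, W1, W2):
    # Super-resolved features = original poor features + learned residual
    # that injects details absent from the small-object representation.
    return small_obj_features + residual_branch(low_level_features, W1, W2)

d_low, d_hidden, d_feat = 128, 64, 256
W1 = rng.standard_normal((d_hidden, d_low)) * 0.1
W2 = rng.standard_normal((d_feat, d_hidden)) * 0.1

small = rng.standard_normal(d_feat)
low = rng.standard_normal(d_low)
sr = super_resolve(small, low, W1, W2)
print(sr.shape)  # (256,)
```

Note the residual formulation: if the branch predicts zero, the original features pass through unchanged, so the generator only has to learn the missing details.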
With the generator covered, we move on to the discriminator, which also differs from that of a traditional GAN.
Here a new loss is added, called the perceptual loss, which is presumably where PGAN gets its name (my guess, but it seems obvious). This loss is also a part I didn't understand, so I'll quote the original text (if anyone understands this part, please explain it in the comments for everyone's benefit):
- justify the detection accuracy benefiting from the generated super-resolved features with a perceptual loss
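My guess at what this means (a toy sketch, not the paper's actual equations): besides the usual adversarial term, the generator is also penalized by a detection loss computed on its generated features, so the super-resolved features must actually help the detector, not just fool the discriminator.

```python
import numpy as np

def adversarial_loss(d_score_on_generated):
    # Generator side of the GAN loss: push the discriminator's score on
    # generated super-resolved features toward "real" (1.0).
    return -np.log(d_score_on_generated + 1e-12)

def perceptual_loss(class_probs, true_class):
    # Detection-side loss: cross-entropy of the detector's class prediction
    # computed on the generated features (toy version, classification only).
    return -np.log(class_probs[true_class] + 1e-12)

def generator_loss(d_score_on_generated, class_probs, true_class, lam=1.0):
    # Total generator objective: adversarial term + weighted perceptual term.
    return adversarial_loss(d_score_on_generated) + lam * perceptual_loss(class_probs, true_class)

probs = np.array([0.1, 0.7, 0.2])
loss = generator_loss(0.8, probs, true_class=1, lam=1.0)
print(round(loss, 4))  # 0.5798
```

If this reading is right, the "perceptual" part is what distinguishes PGAN from a plain GAN applied to features: the generated representation is judged by the downstream detector as well as by the discriminator.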
After reading the paper, I feel the authors never directly say which works inspired PGAN, apart from the original GAN (Goodfellow et al., 2014).