Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Link of the Paper: https://arxiv.org/abs/1711.09151

Motivation:

LSTM units are complex and inherently sequential across time.
Convolutional networks have shown advantages on machine translation and conditional image generation.

Innovation:

The authors develop a convolutional ( CNN-based ) image captioning method that shows comparable performance to an LSTM based method on standard metrics.

The authors analyze the characteristics of CNN and LSTM nets and provide useful insights such as -- CNNs produce more entropy ( useful for diverse predictions ), better classification accuracy, and do not suffer from vanishing gradients.

Improvement:

Improved performance with a CNN model that uses Attention Mechanism to leverage spatial image features.

General Points:

Image Captioning is applicable to virtual assistants, editing tools, image indexing and support of the disabled.
Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
An illustration of a classical RNN architecture for image captioning is provided below.

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章

Paper Read: Convolutional Image Captioning
Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_ ...
Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★
Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learnin ...
Paper Reading - Convolutional Sequence to Sequence Learning ( CoRR 2017 ) ★
Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convol ...
Paper Reading: Stereo DSO
开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse ...
爬取CVPR 2018过程中遇到的坑
爬取 CVPR 2018 过程中遇到的坑使用语言及模块语言: Python 3.6.6 模块: re requests lxml bs4 过程一开始都挺顺利的,先获取到所有文章的链接再逐个爬取获 ...
在矩池云上复现 CVPR 2018 LearningToCompare_FSL 环境
这是 CVPR 2018 的一篇少样本学习论文:Learning to Compare: Relation Network for Few-Shot Learning 源码地址:https://git ...
Paper Reading - Long-term Recurrent Convolutional Networks for Visual Recognition and Description ( CVPR 2015 )
Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Arch ...
Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning
Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN fra ...
Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★
Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal ...

随机推荐

H5基本标签
react系列（一）JSX语法、组件概念、生命周期介绍
JSX React中,推出了一种新的语法取名为JSX,它给了JS中写HTML标签的能力,不需要加引号.JSX的语法看起来是一种模板,然而它在编译以后,会转成JS语法,只是书写过程中的语法糖. JSX的 ...
Python 学习笔记（九）Python元组和字典（一）
Python 元组元组的定义元组(tuple)是一种Python对象类型,元组也是一种序列 Python中的元组与列表类似,不同之处元组的元素不能修改元组使用小括号,列表使用方括号元组的创建 ...
【oracle笔记2】约束
约束 *约束是添加在列上的,用来约束列的. 1. 主键约束(唯一标识) ***非空*** ***唯一*** ***被引用***(外键时引用主键) *当表的某一列被指定为主键后,该列就不能为空,不能有重 ...
iOS之iOS中的（null）、<null>、 nil 的问题
摘要: 你有没有过这样的经历,就是界面上显示出类似<null>.(null)这样一些东西,有时候还会莫名其妙的闪退.反反复复真是曰了犬,今天来总结一下这个问题的解决方法前段时间开发过 ...
Vue.js与 ASP.NET Core 服务端渲染功能整合
http://mgyongyosi.com/2016/Vuejs-server-side-rendering-with-aspnet-core/ 原作者:Mihály Gyöngyösi 译者:oop ...
ubuntu 18.04可以连接内网，无法连接外网
手动增加网关后,又重新sudo apt-get upgrade, 提示/etc/resolvconf/resolv.conf.d更新时,选Y后,不用手动修改网关也可以连接外网了. 一切默认更新后,1 ...
我的Tmux学习笔记
0. 修改指令前缀 // ~/.tmux.conf ubind C-b set -g prefix C-a 1. 新建会话 tmux tmux new -s session-name // 可以设置会 ...
JavaScript6里出现了哪些新语法、新特征？
ES5是2009年就出来的,目前来说在我写这篇文章的时候基本上ES6在浏览器上面还没有普及,不过Google浏览器是支持ES6语法的,谁让Google是美国生产的呢... ES6现在使用的地方其实还是 ...
理解Linux系统调用
目录 1.什么是系统调用 2.linux的系统调用 3.linux系统调用实现 1.什么是系统调用系统调用,指的是操作系统提供给用户程序调用的一组特殊接口,用户程序可以根据这组接口获得操作系统内核的 ...

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )的更多相关文章

随机推荐

热门专题