Paper Reading - Convolutional Image Captioning ( CVPR 2018 )
Link of the Paper: https://arxiv.org/abs/1711.09151
Motivation:
- LSTM units are complex and inherently sequential across time.
- Convolutional networks have shown advantages on machine translation and conditional image generation.
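To make the "inherently sequential" point concrete, here is a minimal sketch in plain Python. It contrasts a toy scalar recurrence with a toy causal (left-padded) 1-D convolution. The function names `recurrent_pass` and `causal_conv1d` are hypothetical, and the causal padding is an assumption about how a convolutional decoder can respect word order; these notes do not describe the paper's exact layer design.

```python
def recurrent_pass(inputs, w_in, w_rec):
    """Toy scalar recurrence: h_t needs h_{t-1}, so steps must run in order."""
    h = 0.0
    outputs = []
    for x in inputs:
        h = w_in * x + w_rec * h  # depends on the previous hidden state
        outputs.append(h)
    return outputs


def causal_conv1d(inputs, kernel):
    """Toy causal convolution: output t sees only inputs t-k+1 .. t.

    Every output is computed from the inputs alone, so the positions are
    independent of one another and could be evaluated in parallel.
    """
    k = len(kernel)
    # Left-pad with zeros so that no future input leaks into position t.
    padded = [0.0] * (k - 1) + list(inputs)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(inputs))
    ]


if __name__ == "__main__":
    print(recurrent_pass([1.0, 1.0, 1.0], w_in=1.0, w_rec=0.5))
    print(causal_conv1d([1.0, 2.0, 3.0], kernel=[0.5, 0.5]))
```

In the recurrence, the loop body cannot start step t before step t-1 has finished. In the convolution, each list element is a function of a fixed input window only, which is what allows training over all time steps at once.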
Innovation:
- The authors develop a convolutional ( CNN-based ) image captioning method whose performance on standard metrics is comparable to that of an LSTM-based method.
- The authors analyze the characteristics of CNN and LSTM nets and report several insights: the CNN's output distributions have higher entropy ( useful for diverse predictions ), the CNN achieves better classification accuracy, and it does not suffer from vanishing gradients.
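The entropy remark can be illustrated with a small sketch. It assumes the entropy in question is the Shannon entropy of the softmax distribution over the next word, which is the usual reading but is not spelled out in these notes. The helper names `softmax` and `entropy` and the example logits are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    peaked = softmax([8.0, 1.0, 0.5, 0.2])  # one word dominates
    flat = softmax([1.2, 1.0, 0.9, 0.8])    # several plausible words
    print(entropy(peaked), entropy(flat))
```

A flatter distribution has higher entropy, so sampling or beam search has more plausible words to choose from at each step, which is why higher entropy is linked to more diverse captions.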
Improvement:
- Adding an attention mechanism to the CNN model, so that it can leverage spatial image features, improves performance.
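As a rough sketch of what attention over spatial image features means: a decoder state scores each image region, the scores are normalized with a softmax, and the regions are averaged with those weights. Dot-product scoring is an assumption made here for brevity; the notes do not state the paper's exact scoring function. The name `attend` and the toy vectors are hypothetical.

```python
import math


def attend(query, regions):
    """Weight spatial feature vectors by their similarity to a query vector.

    query:   list of floats (e.g. a decoder state for the current word)
    regions: list of feature vectors, one per spatial location
    Returns (weights, context), where context is the weighted average.
    """
    # Dot-product score between the query and each region (an assumption).
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    # Softmax over regions, shifted by the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of the region features.
    dim = len(regions[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    weights, context = attend([1.0, 0.0], [[10.0, 0.0], [0.0, 10.0]])
    print(weights, context)
```

The context vector changes with the query, so each predicted word can draw on a different part of the image instead of a single global feature.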
General Points:
- Image Captioning is applicable to virtual assistants, editing tools, image indexing, and assistive tools for people with disabilities.
- Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
- An illustration of a classical RNN architecture for image captioning is provided below.