(Repost) Awesome Image Captioning

Awesome Image Captioning 2018-12-03 19:19:56 From: https://github.com/zhjohnchan/awesome-image-captioning Papers 2010 I2t: Image parsing to text description - Yao B Z et al, P IEEE 2011. 2011 Im2Text: Describing Images Using 1 Million Captioned Photo…

Paper Read: Convolutional Image Captioning

Convolutional Image Captioning 2018-11-04 20:42:07 Paper: http://openaccess.thecvf.com/content_cvpr_2018/papers/Aneja_Convolutional_Image_Captioning_CVPR_2018_paper.pdf Code: https://github.com/aditya12agd5/convcap Related Papers: 1. Convolutional Se…

[Paper Reading] Image Captioning using Deep Neural Architectures (arXiv: 1801.05568v1)

Main Contributions: A brief introduction to two different approaches (retrieval-based and generative) to the image captioning task. The authors implemented the classic Show and Tell model and analyzed it based on their experiments. Exc…

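A Survey of Video Captioning

Introduction to Video Analysis Fields: Video Captioning (video-to-text description) http://blog.csdn.net/wzmsltw/article/details/71192385 Information from video frames: this includes simply extracting image (spatial) features with a CNN (VGGNet, ResNet, etc.) and extracting video motion (spatial + temporal) features with an action recognition model (such as C3D). Prior features: for example, the category of the video; such features provide strong prior information. Text-based features: here, the text-based…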

Paper Reading - Deep Captioning with Multimodal Recurrent Neural Networks ( m-RNN ) ( ICLR 2015 ) ★

Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal Recurrent Neural Network ( AlexNet/VGGNet + a multimodal layer + RNNs ). Their work has two major differences from these methods. Firstly, they inco…

[ Continuously Update ] The Paper List of Image / Video Captioning

Papers Published in 2018 Convolutional Image Captioning - Jyoti Aneja et al., CVPR 2018 - [ Paper Reading ] Learning to Evaluate Image Captioning - Yin Cui et al., CVPR 2018 - [ Paper Reading ] CNN+CNN: Convolutional Decoders for Image Captioning - Q…

Paper Reading - CNN+CNN: Convolutional Decoders for Image Captioning

Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN framework for image captioning. There are four modules in the framework: vision module ( VGG-16 ), which is adopted to "watch" images; language modu…

Paper Reading - Learning to Evaluate Image Captioning ( CVPR 2018 ) ★

Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learning based discriminative evaluation metric that is directly trained to distinguish between human and machine-generated captions. They train an automatic…

Paper Reading - Convolutional Image Captioning ( CVPR 2018 )

Link of the Paper: https://arxiv.org/abs/1711.09151 Motivation: LSTM units are complex and inherently sequential across time. Convolutional networks have shown advantages on machine translation and conditional image generation. Innovation: The author…

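Lecture 9: Image Generation (Image Captioning)

Lecture 9: Image Generation (Image Captioning). Generative Adversarial Network (GAN). Learning the data distribution: probability density function estimation + data sample generation. Generative models capture a co-occurrence relationship; discriminative models capture a causal relationship. Where GANs sit among generative models. Characteristics of GANs. GAN: an unsupervised network framework with a generator and a discriminator. First train the discriminator, then fix the discriminator and optimize the generator. The generator network produces sample data; the discriminator network's samples consist of real sampled data + generator-produced sample data. EM optimization optimizes in the same direction, GAN…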

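Lecture 7: Image Captioning

Lecture 7: Image Captioning. Chapter outline: recurrent neural networks; backpropagation through time (BPTT); vanilla RNN: basic model, severe vanishing gradients when using sigmoid; LSTM, the long short-term memory model (proposed in 1997): basic model, model comparison, LSTM mathematical model, understanding the role of the gates, LSTM structure diagram; LSTM variants: peephole, coupled forget and input gates; GRU (Gated Recurrent Unit): improvements, comparison of LSTM and GRU. Image captioning: generating descriptive language for an image; requires multimodal understanding and reasoning: composite…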
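
For reference, the "LSTM mathematical model" item in the outline above is usually written as below. This is the standard textbook formulation, not necessarily the notation used in the lecture itself; the symbols (input $x_t$, hidden state $h_t$, cell state $c_t$, weight matrices $W$, $U$, and biases $b$) are assumptions introduced here for illustration.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of $c_t$ is what eases the vanishing-gradient problem noted for the sigmoid-based vanilla RNN. The coupled variant listed in the outline ties the two gates together by setting $i_t = 1 - f_t$.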

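[CV Paper Reading] Image Captioning Summary

When I first encountered the captioning problem, my first impression was that Andrej Karpathy is very smart. I got started mainly with two of his papers, "Deep Fragment Embeddings for Bidirectional Image Sentence Mapping" and "Deep Visual-Semantic Alignments for Generating Image Descriptions". Basically, once you understand the first paper, the second is easy, since the research approach is really the same. That said, the second model is indeed somewhat more powerful…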

Image Captioning: A Collection of Classic Papers

Image Caption: Automatically describing the content of an image domain:CV+NLP Category:(by myself, you can read the survey for detail.) CNN+RNN, with attention mechanisms Reinforcement Learning GAN Compositional Architecture: Review Network, Guiding…

Unpaired/Partially/Unsupervised Image Captioning

This post covers the following three papers: Unpaired Image Captioning by Language Pivoting (ECCV 2018) Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data (ECCV 2018) Unsupervised Image Captioning (CVPR 2019) 1. Unpaired Image Captioning by Lan…

Image Captioning Code Reproductions

Image caption generation: https://github.com/eladhoffer/captionGen Simple encoder-decoder image captioning: https://github.com/udacity/CVND---Image-Captioning-Project (Paper)StyleNet: Generating Attractive Visual Captions with Styles: https://github…

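Video Captioning: A Survey

1. Unsupervised learning of video representations using LSTMs. Method: predict the future frame sequence from an encoding of the previous frames. This is similar to the method of the paper Sequence to sequence learning with neural networks, in which one LSTM encodes the input text into a fixed representation and another LSTM decodes it into a different language. 2. Describing Videos by Exploiting Temporal Structure. This paper was published at ICCV 2015 and is the first to use temporal…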

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

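Title: SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning Authors: Long Chen et al. (Zhejiang University, National University of Singapore, Shandong University) Venue: CVPR 2017 1 Background: Attention mechanisms have achieved great success in natural language processing and computer vision, but most existing attention-based models consider only spatial features; that is, they attend to the locally more "important" information in the feature maps while ignoring the relative importance of information across channels. This paper introduces a new…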

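A Summary of Important Recent Papers on Video Captioning

Video captioning: as the name suggests, video captioning means a computer generates a description for a video. As shown in the figure, the image shows two frames selected from a video, and the description for it is "A man is doing stunts on his bike". This is very helpful for tasks such as online video retrieval. The progress of image captioning in recent years has also prompted people to consider generating descriptions for videos. Unlike images, which carry only static spatial information, videos also carry temporal information as well as audio. A video therefore contains more information than an image and requires more features to be extracted, which makes generating an accurate description a major challenge. I. long-term Recurrent…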

Paper: Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering - Reading Summary

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering - Reading Summary. Notes should not simply copy the content of the paper; they need your own thinking and understanding. I. Basic information 1. Title: Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering 2. Authors: Peter Anderson, Xiaodong…

Paper Reading - Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

Link of the Paper: https://arxiv.org/abs/1609.06647 A Correlative Paper: Show and Tell: A Neural Image Caption Generator (Link of the Paper: https://arxiv.org/abs/1411.4555) Main Points ( Improvements Over the CVPR2015 Model ): Image Model Improveme…

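Reading an ICCV 2017 Paper: Areas of Attention for Image Captioning

Preamble: the authors said they would release the code on GitHub but still have not done so. I emailed them to ask about the code and got no reply either, which is rather awkward. All that aside, the work itself is good, no question about it, so today let's go through this paper in detail. Introduction: readers unfamiliar with captioning can read these two Zhihu column posts: "The AI Kid That Describes Pictures: A Fun Look at Image Captioning (Part 1)" and "The AI Kid That Describes Pictures: A Fun Look at Image Captioning (Part 2)". I. Abstract: The authors propose a new attention model. It differs from previous ones in that it considers not only the relationship between the state and the predicted word, but also image regions…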

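(CV Study Notes) Image Captioning - 2

Implementing load_img_as_np_array: def load_img_as_np_array(path, target_size): """[Load] an image from the given file, [resize] it to the given target_size, and return a [Keras-compatible] floating-point numpy array. # Arguments path: path to the image file. target_size: tuple (image height, image width). # Returns A numpy array. """ Using the PIL library: from PIL…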
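
The snippet above is cut off before the function body, so the following is only a minimal sketch of what a function with that docstring could look like, not the original post's implementation. It assumes Pillow and NumPy are installed, that images should be converted to RGB, and that `float32` is an acceptable dtype; none of these choices is confirmed by the source.

```python
import numpy as np
from PIL import Image


def load_img_as_np_array(path, target_size):
    """Load an image, resize it, and return a float numpy array.

    path: path to the image file.
    target_size: tuple (image height, image width), as in the docstring above.
    Returns an array of shape (height, width, 3) with dtype float32.
    """
    height, width = target_size
    with Image.open(path) as img:
        # Assumption: force 3 channels so grayscale/RGBA files behave uniformly.
        img = img.convert("RGB")
        # PIL's resize expects (width, height), the reverse of target_size.
        img = img.resize((width, height))
        return np.asarray(img, dtype=np.float32)
```

Note the order swap: the docstring specifies `target_size` as (height, width), while `Image.resize` takes (width, height), which is an easy place to introduce a transposed-shape bug.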

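(CV Study Notes) Image Captioning - 1

Background: a CNN and an LSTM process the image and the text, respectively; the two neural networks are then combined. Application areas: image search, security, pornography detection. Topics covered: digital image processing (image loading, image resizing, image data dimension transforms); natural language processing (text cleaning, text embedding); CNN, convolutional neural network (image feature extraction, transfer learning); LSTM, recurrent neural network (feature extraction from text sequences); DNN, deep neural network (predicting the next word from the image features and the text-sequence features). Dataset used: Framing Image…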

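[Machine Learning] A Comprehensive Resource Collection

Yesterday I summarized deep learning resources; today I'll do the same for machine learning (friendly reminder: some sites require a VPN ^_^). A few recommended books: 1. Pattern Recognition and Machine Learning (by Bishop) 2. The Elements of Statistical Learning (by Hastie, Tibshirani, and Friedman). Both are in English but very comprehensive. The first requires some mathematical background, so you can read the second one first. If reading English is a struggle, I recommend taking a look at the following…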

Some Problems Encountered When Using cordova + ionic

Some problems encountered when using cordova + ionic: No Content-Security-Policy meta tag found. Please add one when using the cordova-plugin-whitelist plugin. Solution: add the following to index.html: <meta http-equiv="Content-Security-Policy" content="default-src *; style-src 'self'…

Reposting an Expert's Method for Embedding Video Playback in a Web Page (Repost)

From: http://www.cnblogs.com/bandry/archive/2006/10/11/526229.html Embedding Media Player in a web page is fairly simple: just use the HTML <Object></Object> element, as shown below. <OBJECT ID="WMPlay" WIDTH=320 HEIGHT=240 CLASSID="CLSID:22D6f312-B0F6-11D0-94AB-0080C74C7E95"C…

Top Deep Learning Projects in github

Top Deep Learning Projects A list of popular github projects related to deep learning (ranked by stars). Last Update: 2016.08.09 Project Name Stars Description TensorFlow 29622 Computation using data flow graphs for scalable machine lear…

Inserting Images in LaTeX

Inserting Images. Images are essential elements in most scientific documents. LaTeX provides several options to handle images and make them look exactly the way you need. This article explains how to include images in the most common format…

UIImageWriteToSavedPhotosAlbum

UIImageWriteToSavedPhotosAlbum: Next UIKit Function Reference Overview The UIKit framework defines a number of functions, many of them used in graphics and drawing operations. Functions by Task Application Launch UIApplicationMain Image Manipulation…

RNN and LSTM, Saliency Prediction, Scene Labeling

http://handong1587.github.io/deep_learning/2015/10/09/rnn-and-lstm.html //RNN and LSTM http://handong1587.github.io/deep_learning/2015/10/09/saliency-prediction.html //saliency prediction http://handong1587.github.io/deep_learning/2015/10/09/scene-l…
