Link of the Paper: https://arxiv.org/abs/1412.2306 Main Points: An Alignment Model: Convolutional Neural Networks over image regions ( An image -> RCNN -> Top 19 detected locations in addition to the whole image -> the representations based on th…
Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal Recurrent Neural Networks ( AlexNet/VGGNet + a multimodal layer + RNNs ). Their work has two major differences from these methods. Firstly, they inco…
Link of the Paper: https://arxiv.org/abs/1411.4555 Main Points: A generative model ( NIC, GoogLeNet + LSTM ) based on a deep recurrent architecture: the model is trained to maximize the likelihoodP(S|I) of the target description sentence given the tr…
https://cs.stanford.edu/people/karpathy/deepimagesent/ Abstract We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to learn about the…
论文:Deep Neural Networks for YouTube Recommendations 发表时间:2016 发表作者:(Google)Paul Covington, Jay Adams, Emre Sargin 发表刊物/会议:RecSys 论文链接:论文链接 这篇论文是google的YouTube团队在推荐系统上DNN方面的尝试,发表在16年9 月的RecSys会议.本文就focus在YouTube视频推荐的DNN算法,文中不但详细介绍了Youtube推荐算法和架构细节,还给了…
发表时间:2013 发表作者:(Google)Szegedy C, Toshev A, Erhan D 发表刊物/会议:Advances in Neural Information Processing Systems(NIPS) 本文实现了一种利用DNN来做目标检测的方法.当时,CNN等深度学习在识别上面做的还挺好,但是在目标检测上面没有特别突出的结果.本文中作者把目标检测看做一个回归问题,回归目标窗口(BoundingBox)的位置,寻找一张图片当中目标类别和目标出现的位置. 作者在Imag…
Visual Semantic Navigation Using Scene Priors 2018-10-21 19:39:26 Paper:  https://arxiv.org/pdf/1810.06543.pdf Demo:https://www.youtube.com/watch?v=otKjuO805dE&feature=youtu.be 本文将首先定义什么是 visual semantic navigation, 然后描述怎么利用深度强化学习的框架来解决该问题,以及该任务的 bas…
开篇第一篇就写一个paper reading吧,用markdown+vim写东西切换中英文挺麻烦的,有些就偷懒都用英文写了. Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras Abstract Optimization objectives: intrinsic/extrinsic parameters of all keyframes all selected pixels' depth Inte…
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association2018-09-29 19:36:43 Paper:http://openaccess.thecvf.com/content_ECCV_2018/papers/Dapeng_Chen_Improving_Deep_Visual_ECCV_2018_paper.pdf 1. I…
这是一篇被ICLR 2019 接收的论文.论文讨论了如何利用场景先验知识 (scene priors)来定位一个新场景(novel scene)中未曾见过的物体(unseen objects).举例来说,在「厨房」这一场景中,有一张图片显示「苹果」在冰箱的储物架上,同为水果的物体,如「橙子」,会出现在场景的哪个位置呢?论文提出了用基于强化学习的方法来定位「橙子」. 论文:VISUAL SEMANTIC NAVIGATION USING SCENE PRIORS 论文作者:Wei Yang , X…