【CV】ICCV2015_Unsupervised Learning of Spatiotemporally Coherent Metrics
Unsupervised Learning of Spatiotemporally Coherent Metrics
Note here: this is a learning note on unsupervised learning from videos, covering a paper from Yann LeCun's group.
Link: http://arxiv.org/pdf/1412.6056.pdf
Motivation:
Temporal coherence is a form of weak supervision, which they exploit to learn generic signal representations that are stable with respect to the variability in natural video, including local deformations.
This rests on the assumption that data samples that are temporal neighbors are also likely to be neighbors in the latent space.
(The invariant features in temporal sequences are also called slow features.)
Proposed Model:
The loss function based on temporal coherence is shown below:
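As a sketch in notation I am assuming here (\(G_W\) for the learned feature map with parameters \(W\), \(d\) for a distance in feature space, \(m\) for the margin), the loss for a pair of frames \(x_t, x_{t'}\) has roughly this form; check the paper for the exact statement:

\[
L(x_t, x_{t'}, W) =
\begin{cases}
d\big(G_W(x_t),\, G_W(x_{t'})\big), & \text{if } |t - t'| = 1 \\
\max\big(0,\; m - d\big(G_W(x_t),\, G_W(x_{t'})\big)\big), & \text{if } |t - t'| > 1
\end{cases}
\]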

The first term says that neighboring frames should be similar, which maintains slowness. To keep the network from learning a constant mapping, they add a second term that forces frames that are not temporal neighbors to be separated by a distance of at least m units in feature space.
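To make the two cases concrete, here is a minimal Python sketch of this contrastive loss for a single pair of feature vectors. The function names, the choice of Euclidean distance, and the margin value are my own assumptions for illustration, not the paper's code.

```python
import math

def euclidean(u, v):
    """L2 distance between two equal-length feature vectors (assumed choice of distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def temporal_coherence_loss(z_t, z_tp, is_neighbor, margin):
    """Contrastive slowness loss for one pair of feature vectors."""
    dist = euclidean(z_t, z_tp)
    if is_neighbor:
        # Slowness: pull features of temporally adjacent frames together.
        return dist
    # Margin: penalize non-neighbors only while they are closer than `margin`.
    return max(0.0, margin - dist)

z_a = [0.0, 0.0]
z_b = [3.0, 4.0]  # distance 5 from z_a
z_c = [6.0, 8.0]  # distance 10 from z_a

loss_neighbor = temporal_coherence_loss(z_a, z_b, is_neighbor=True, margin=8.0)
loss_close_non_neighbor = temporal_coherence_loss(z_a, z_b, is_neighbor=False, margin=8.0)
loss_far_non_neighbor = temporal_coherence_loss(z_a, z_c, is_neighbor=False, margin=8.0)
print(loss_neighbor, loss_close_non_neighbor, loss_far_non_neighbor)
```

A constant mapping would put every pair at distance 0, so each non-neighbor pair would cost the full margin. That is how the second case rules out the trivial solution.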
However, the second term only provides a discriminative criterion on pairwise distances in the feature space. The paper argues that this discriminative constraint is too weak. Thus, they introduce a reconstruction term that not only prevents the constant solution but also explicitly preserves information about the input. So the new loss function is:
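As far as I can reconstruct it, and in notation I am assuming here (\(h_\tau\) is the hidden code for frame \(x_\tau\), \(W_d\) is a linear decoder, \(\|h\|^{P_i}\) is the L2 norm of the hidden units in the i-th pooling group, and \(\alpha, \beta\) are weights), the loss has roughly this form; check the paper for the exact statement:

\[
L(x_t, x_{t'}, W) = \sum_{\tau \in \{t,\, t'\}} \Big( \|W_d h_\tau - x_\tau\|^2 + \alpha\,|h_\tau| \Big) + \beta \sum_{i=1}^{K} \Big|\, \|h_t\|^{P_i} - \|h_{t'}\|^{P_i} \Big|
\]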

(The reconstruction term preserves information about the input, the slowness term trains slow features, and \(\alpha|h_{\tau}|\) is the sparsity penalty term.)
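A small Python sketch may help show how the three ingredients (reconstruction, sparsity, slowness) add up for one pair of frames. Everything here is an assumption for illustration: the function names, the linear decoder `W_d`, L2 pooling over index groups, and passing the hidden codes in directly instead of computing them with an encoder.

```python
import math

def l2_pool(h, groups):
    """L2-pool hidden units: one output per group of unit indices."""
    return [math.sqrt(sum(h[i] ** 2 for i in g)) for g in groups]

def decode(W_d, h):
    """Linear decoder x_hat = W_d h, with W_d given as a list of rows."""
    return [sum(w * hj for w, hj in zip(row, h)) for row in W_d]

def coherent_autoencoder_loss(x_t, x_tp, h_t, h_tp, W_d, groups, alpha, beta):
    """Reconstruction + sparsity for both frames, plus slowness on pooled codes."""
    loss = 0.0
    for x, h in ((x_t, h_t), (x_tp, h_tp)):
        x_hat = decode(W_d, h)
        loss += sum((a - b) ** 2 for a, b in zip(x_hat, x))  # reconstruction
        loss += alpha * sum(abs(v) for v in h)                # sparsity (L1)
    p_t, p_tp = l2_pool(h_t, groups), l2_pool(h_tp, groups)
    loss += beta * sum(abs(a - b) for a, b in zip(p_t, p_tp))  # slowness
    return loss

# Toy setup: identity decoder, four hidden units in two pooling groups.
W_d = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
groups = [[0, 1], [2, 3]]

h_a = [3.0, 4.0, 0.0, 0.0]
h_b = [4.0, 3.0, 0.0, 0.0]  # energy stays inside the first pooling group
h_c = [0.0, 0.0, 3.0, 4.0]  # energy moves to the second pooling group

loss_within = coherent_autoencoder_loss(h_a, h_b, h_a, h_b, W_d, groups, alpha=0.5, beta=1.0)
loss_across = coherent_autoencoder_loss(h_a, h_c, h_a, h_c, W_d, groups, alpha=0.5, beta=1.0)
print(loss_within, loss_across)
```

In this toy case the slowness penalty is zero when the activation only shifts between units of the same pooling group, and it grows when the activation jumps to another group. The reconstruction and sparsity terms are the same in both cases.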
The overall pipeline is shown below:

Tricks:
They use several intuitions and tricks in the paper, but given my limited knowledge of this field, I can only dig into one of them.
Pooling plays an important role in the architecture. Training through a local pooling operator enforces a local topology on the hidden activations, inducing units that are pooled together to learn complementary features.
Also, in a convolutional architecture, pooling both in space and across features can produce more invariant features.
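One way to picture "complementary features" is a pair of units that respond like cosine and sine to the same pattern at different shifts. The sketch below is my own toy illustration, not taken from the paper: the two units change with the shift, while their L2-pooled output stays constant.

```python
import math

def l2_pool(units):
    """L2 pooling over one group of hidden units."""
    return math.sqrt(sum(u * u for u in units))

# Hypothetical responses of two pooled units as the input pattern shifts.
phases = [0.0, 0.5, 1.0, 1.5]
responses = [(math.cos(p), math.sin(p)) for p in phases]
pooled = [l2_pool(r) for r in responses]

for p, r, q in zip(phases, responses, pooled):
    print(f"phase={p:.1f} units=({r[0]:+.3f}, {r[1]:+.3f}) pooled={q:.3f}")
```

Each unit on its own is not stable under the shift, but the pooled value is, which is the kind of invariance the slowness term rewards.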