[CVPR 2017] Semantic Autoencoder for Zero-Shot Learning: Paper Notes

http://openaccess.thecvf.com/content_cvpr_2017/papers/Kodirov_Semantic_Autoencoder_for_CVPR_2017_paper.pdf

Semantic Autoencoder for Zero-Shot Learning. Elyor Kodirov, Tao Xiang, Shaogang Gong. Queen Mary University of London, UK. {e.kodirov, t.xiang, s.gong}@qmul.ac.uk

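Highlights

Improves zero-shot learning performance through dual (encode-then-reconstruct) learning, similar in spirit to CycleGAN.
The model is very simple and has a direct solution, so it is very fast.
It transfers effectively to a related task (supervised clustering), which demonstrates its generalisation ability.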

Method

Linear autoencoder

Model Formulation
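As a hedged reconstruction of the paper's formulation (the notation here is assumed: X is the d x N feature matrix, S is the k x N semantic matrix, W is the k x d encoder, and the decoder is tied to W^T), the constrained objective, its relaxed form, and the resulting optimality condition can be written as:

```latex
% Linear autoencoder with tied weights and a semantic constraint on the code
\min_{W}\ \lVert X - W^{\top} W X \rVert_F^2 \quad \text{s.t.}\quad WX = S

% Relaxed form: substitute WX = S in the first term, penalise the constraint
\min_{W}\ \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert WX - S \rVert_F^2

% Setting the derivative with respect to W to zero:
%   -2 S (X - W^{\top} S)^{\top} + 2 \lambda (WX - S) X^{\top} = 0
S S^{\top} W + \lambda\, W X X^{\top} = (1 + \lambda)\, S X^{\top}
```

With A = S S^T, B = λ X X^T and C = (1 + λ) S X^T, the last line has the form A W + W B = C.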

The optimality condition of the model is a well-known Sylvester equation, which can be solved efficiently by the Bartels-Stewart algorithm (the sylvester function in MATLAB).
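For readers working in Python rather than MATLAB, a minimal sketch of the same solve is given below. It assumes the Sylvester form A W + W B = C with A = S S^T, B = λ X X^T and C = (1 + λ) S X^T; the function name sae_fit, the column-per-sample layout, and the default λ are illustrative choices, not taken from the authors' code.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def sae_fit(X, S, lam=1.0):
    """Solve A W + W B = C for the encoder W of shape (k, d).

    X   : (d, N) features, one column per training sample (assumed layout)
    S   : (k, N) semantic vectors (e.g. attributes), one column per sample
    lam : trade-off weight lambda (hypothetical default; tune on validation data)
    """
    A = S @ S.T                    # (k, k)
    B = lam * (X @ X.T)            # (d, d)
    C = (1.0 + lam) * (S @ X.T)    # (k, d)
    # SciPy solves A W + W B = C (its implementation is Bartels-Stewart based)
    return solve_sylvester(A, B, C)


# Toy usage on random data: only the shapes are meaningful here
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 40))
S = rng.standard_normal((3, 40))
W = sae_fit(X, S, lam=0.5)
print(W.shape)  # (3, 5)
```

The matrices A and B are only k x k and d x d, so the cost of the solve does not grow with the number of training samples once the products are formed.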

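Zero-shot learning: given the algorithm above, there are two ways to classify at test time:

Map a sample xi from an unseen class into the semantic (attribute) space with W to obtain si, then compare distances in the semantic space and take the nearest class (a class with no training samples) as its label.
Map the semantic vectors S of all classes without training data into the feature space X with WT, then compare the unseen-class sample xi with the projected class centres in the feature space and take the nearest class as its label.
The two approaches give essentially the same accuracy.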
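The two test-time strategies above can be sketched as follows. This is an illustration under assumptions: Euclidean distance is used as the comparison (the notes only say "distance"), samples and class prototypes are stored as columns, and the function names are hypothetical.

```python
import numpy as np


def nearest_column(Q, P):
    """For each column of Q, return the index of the closest column of P."""
    # Squared Euclidean distances, shape (num_queries, num_prototypes)
    d2 = ((Q.T[:, None, :] - P.T[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def predict_encoder(W, X_test, S_protos):
    """Strategy 1: encode features into the semantic space and compare there."""
    return nearest_column(W @ X_test, S_protos)


def predict_decoder(W, X_test, S_protos):
    """Strategy 2: decode class prototypes with W^T and compare in feature space."""
    return nearest_column(X_test, W.T @ S_protos)


# Toy check with an identity projection and two unseen-class prototypes
W = np.eye(2)
S_protos = np.array([[1.0, 0.0],
                     [0.0, 1.0]])      # columns: prototype 0, prototype 1
X_test = np.array([[0.9, 0.2],
                   [0.1, 0.8]])        # columns: two test samples
enc = predict_encoder(W, X_test, S_protos)
dec = predict_decoder(W, X_test, S_protos)
print(enc, dec)  # [0 1] [0 1]
```

The returned indices refer to columns of S_protos, so they must be mapped back to the unseen class names by the caller.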

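Supervised clustering: in this problem the semantic space is the class-label space (one-hot class labels). All test data are projected into the training class-label space and then clustered with k-means.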
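A rough sketch of this clustering pipeline is shown below. It assumes a projection W has already been learned with the one-hot label matrix as the semantic space; the helper names are hypothetical, and SciPy's kmeans2 is used here only as a convenient k-means implementation, not as the one used in the paper.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def one_hot(labels, num_classes):
    """Return a (num_classes, N) one-hot matrix, one column per sample."""
    S = np.zeros((num_classes, labels.size))
    S[labels, np.arange(labels.size)] = 1.0
    return S


def cluster_in_label_space(W, X_test, num_clusters):
    """Project test features with W, then run k-means on the projections."""
    Z = (W @ X_test).T                       # (M, k): kmeans2 expects samples as rows
    _, assignments = kmeans2(Z, num_clusters, minit="++")
    return assignments


# Toy usage: a random stand-in for a learned W (3 training classes, 4-dim features)
rng = np.random.default_rng(0)
S_train = one_hot(np.array([0, 2, 1, 1]), 3)
W = rng.standard_normal((3, 4))
X_test = rng.standard_normal((4, 12))
assignments = cluster_in_label_space(W, X_test, num_clusters=2)
print(S_train.shape, assignments.shape)  # (3, 4) (12,)
```

Because k-means initialisation is random, the cluster indices themselves are arbitrary and can change between runs; only the grouping is meaningful.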

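Relation to existing models: existing zero-shot learning models typically learn a projection from the feature space to the semantic space by minimising a regularised regression objective.

Alternatively, [54] projects the attributes into the feature space, so the learning objective becomes the regression in the reverse direction.

The algorithm in this paper combines the two. Moreover, because W*=WT, W cannot grow too large under the dual learning objective (otherwise multiplying x by two matrices with large norms could not recover the original value), so the regularisation term can be dropped.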
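To make the comparison concrete, the three objectives can be written side by side. This is a sketch in assumed notation (X features, S semantic vectors, W the visual-to-semantic projection, W* the semantic-to-visual projection); the exact form used in the cited works may differ.

```latex
% Visual-to-semantic regression used by many existing models
\min_{W}\ \lVert WX - S \rVert_F^2 + \lambda \lVert W \rVert_F^2

% Reverse direction, as in [54]: semantic-to-visual with its own projection W^{*}
\min_{W^{*}}\ \lVert X - W^{*} S \rVert_F^2 + \lambda \lVert W^{*} \rVert_F^2

% SAE: both data terms, tied by W^{*} = W^{\top}, with no explicit norm penalty
\min_{W}\ \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert WX - S \rVert_F^2
```

In the last line λ weights the encoder term against the decoder term, rather than weighting a norm penalty on W.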

Experiments

Zero-shot learning

Datasets: Semantic word vector representations are used for the large-scale datasets (ImNet-1 and ImNet-2). We train a skip-gram text model on a corpus of 4.6M Wikipedia documents to obtain the word2vec [38, 37] word vectors.

Features: extracted with GoogLeNet for all datasets except ImNet-1, which uses AlexNet.

Results:

Our SAE model achieves the best results on all 6 datasets.
On the small-scale datasets, the gap between our model's results and those of the strongest competitor ranges from 3.5% to 6.5%.
On the large-scale datasets, the gaps are even bigger: On the largest ImNet-2, our model improves over the state-of-the-art SS-Voc [22] by 8.8%.
Both the encoder and decoder projection functions in our SAE model (SAE (W) and SAE (WT) respectively) can be used for effective ZSL.

The encoder projection function seems to be slightly better overall.

Generalised zero-shot learning: this setting measures how well a zero-shot learning method can trade off between recognising data from seen classes and data from unseen classes.

The test set is formed by holding out 20% of the data samples from the seen classes and mixing them with the samples from the unseen classes.
On AwA, our model is slightly worse than SynCstruct [13].
However, on the more challenging CUB dataset, our method significantly outperforms the competitors.
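The test-set construction described above can be sketched as follows. This is a minimal illustration, assuming a uniform random hold-out over all seen-class samples (the notes do not say whether the 20% is drawn per class) and a column-per-sample layout; the function name and seed are hypothetical.

```python
import numpy as np


def make_gzsl_split(X_seen, y_seen, X_unseen, y_unseen, holdout=0.2, seed=0):
    """Hold out a fraction of seen-class samples and mix them with unseen ones."""
    rng = np.random.default_rng(seed)
    n = X_seen.shape[1]
    perm = rng.permutation(n)
    n_hold = int(round(holdout * n))
    hold, keep = perm[:n_hold], perm[n_hold:]
    # Training uses only the remaining seen-class samples
    X_train, y_train = X_seen[:, keep], y_seen[keep]
    # Testing mixes held-out seen samples with all unseen-class samples
    X_test = np.concatenate([X_seen[:, hold], X_unseen], axis=1)
    y_test = np.concatenate([y_seen[hold], y_unseen])
    return X_train, y_train, X_test, y_test


# Toy usage: 10 seen-class samples, 3 unseen-class samples, 4-dim features
rng = np.random.default_rng(1)
X_seen, y_seen = rng.standard_normal((4, 10)), np.arange(10) % 2
X_unseen, y_unseen = rng.standard_normal((4, 3)), np.full(3, 2)
X_train, y_train, X_test, y_test = make_gzsl_split(X_seen, y_seen, X_unseen, y_unseen)
print(X_train.shape, X_test.shape)  # (4, 8) (4, 5)
```

At test time the label search space then covers both seen and unseen classes, which is what makes this setting harder than conventional zero-shot learning.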

Clustering

Datasets: A synthetic dataset and Oxford Flowers-17 (848 images)

Results:

On computational cost, our model (93s) is more expensive than MLCA (39s) but much cheaper than all others (hours to days).
Achieves the best clustering accuracy
