GAN-based zero-shot learning: paper roundup

1 Feature Generating Networks for Zero-Shot Learning

Suffering from the extreme training data imbalance between seen and unseen classes, most of existing state-of-the-art approaches fail to achieve satisfactory results for the challenging generalized zero-shot learning task. To circumvent the need for labeled examples of unseen classes, we propose a novel generative adversarial network (GAN) that synthesizes CNN features conditioned on class-level semantic information, offering a shortcut directly from a semantic descriptor of a class to a class-conditional feature distribution. Our proposed approach, pairing a Wasserstein GAN with a classification loss, is able to generate sufficiently discriminative CNN features to train softmax classifiers or any multimodal embedding method. Our experimental results demonstrate a significant boost in accuracy over the state of the art on five challenging datasets (CUB, FLO, SUN, AWA and ImageNet) in both the zero-shot learning and generalized zero-shot learning settings.
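To make the "Wasserstein GAN paired with a classification loss" idea concrete, here is a minimal sketch of what the generator side of such an objective could look like. The abstract does not give the formula, so the weighting `beta`, the function name and the toy numbers are assumptions for illustration, and the critic's own loss (including any gradient penalty) is left out.

```python
import math


def generator_loss(critic_scores, class_probs, labels, beta=0.01):
    """Generator objective for one batch of synthesized features (sketch).

    critic_scores: critic outputs for the synthesized features, one float each.
    class_probs:   per-sample class probabilities from a classifier applied
                   to the synthesized features.
    labels:        index of the class each feature was conditioned on.
    beta:          weight of the classification term (assumed value).
    """
    n = len(critic_scores)
    # Wasserstein term: the generator wants the critic to score its samples high.
    wgan_term = -sum(critic_scores) / n
    # Classification term: negative log-likelihood of the conditioning class.
    cls_term = -sum(math.log(p[y]) for p, y in zip(class_probs, labels)) / n
    return wgan_term + beta * cls_term


# Toy batch: two synthesized features, two classes (hypothetical numbers).
scores = [0.5, 1.5]
probs = [[0.9, 0.1], [0.2, 0.8]]
loss = generator_loss(scores, probs, labels=[0, 1], beta=1.0)
```

The classification term is what pushes the synthesized features to be discriminative rather than merely realistic; with `beta=0` the sketch reduces to a plain Wasserstein generator loss.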

2 Adversarial Zero-Shot Learning with Semantic Augmentation

In situations in which labels are expensive or difficult to obtain, deep neural networks for object recognition often struggle to achieve fair performance. Zero-shot learning is dedicated to this problem. It aims to recognize objects of unseen classes by transferring knowledge from seen classes via a shared intermediate representation. Using the manifold structure of seen training samples is widely regarded as important to learn a robust mapping between samples and the intermediate representation, which is crucial for transferring the knowledge. However, their irregular structures, such as the lack of variation in samples for certain classes and highly overlapping clusters of different classes, may result in an inappropriate mapping. Additionally, in a high-dimensional mapping space, the hubness problem may arise, in which one of the unseen classes has a high possibility to be assigned to samples of different classes. To mitigate such problems, we use a generative adversarial network to synthesize samples with specified semantics to cover a higher diversity of given classes and interpolated semantics of pairs of classes. We propose a simple yet effective method for applying the augmented semantics to the hinge loss functions to learn a robust mapping. The proposed method was extensively evaluated on small- and large-scale datasets, showing a significant improvement over state-of-the-art methods.
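The "interpolated semantics of pairs of classes" can be pictured with a small sketch. The abstract does not say how the interpolation is done, so the linear (convex) combination below, the function name and the three-dimensional attribute vectors are all assumptions made for illustration.

```python
def interpolate_semantics(a_i, a_j, lam):
    """Convex combination of two class semantic vectors, with 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(a_i) != len(a_j):
        raise ValueError("semantic vectors must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a_i, a_j)]


# Hypothetical attribute vectors: [striped, spotted, four-legged].
zebra = [1.0, 0.0, 1.0]
horse = [0.0, 0.0, 1.0]
halfway = interpolate_semantics(zebra, horse, 0.5)
```

A generator conditioned on such an interpolated vector would be asked for samples that lie between the two classes, which is one way to fill gaps between sparse or overlapping clusters.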

3 Semantically Aligned Bias Reducing Zero Shot Learning

Zero shot learning (ZSL) aims to recognize unseen classes by exploiting semantic relationships between seen and unseen classes. Two major problems faced by ZSL algorithms are the hubness problem and the bias towards the seen classes. Existing ZSL methods focus on only one of these problems in the conventional and generalized ZSL setting. In this work, we propose a novel approach, Semantically Aligned Bias Reducing (SABR) ZSL, which focuses on solving both the problems. It overcomes the hubness problem by learning a latent space that preserves the semantic relationship between the labels while encoding the discriminating information about the classes. Further, we also propose ways to reduce bias of the seen classes through a simple cross-validation process in the inductive setting and a novel weak transfer constraint in the transductive setting. Extensive experiments on three benchmark datasets suggest that the proposed model significantly outperforms existing state-of-the-art algorithms by ∼1.5-9% in the conventional ZSL setting and by ∼2-14% in the generalized ZSL for both the inductive and transductive settings.

4 Multi-modal Cycle-consistent Generalized Zero-Shot Learning

In generalized zero shot learning (GZSL), the set of classes is split into seen and unseen classes, where training relies on the semantic features of the seen and unseen classes and the visual representations of only the seen classes, while testing uses the visual representations of the seen and unseen classes. Current methods address GZSL by learning a transformation from the visual to the semantic space, exploring the assumption that the distribution of classes in the semantic and visual spaces is relatively similar. Such methods tend to transform unseen testing visual representations into one of the seen classes' semantic features instead of the semantic features of the correct unseen class, resulting in low accuracy GZSL classification. Recently, generative adversarial networks (GAN) have been explored to synthesize visual representations of the unseen classes from their semantic features; the synthesized representations of the seen and unseen classes are then used to train the GZSL classifier. This approach has been shown to boost GZSL classification accuracy, but there is one important missing constraint: there is no guarantee that synthetic visual representations can generate back their semantic feature in a multi-modal cycle-consistent manner. This missing constraint can result in synthetic visual representations that do not represent well their semantic features, which means that the use of this constraint can improve GAN-based approaches. In this paper, we propose the use of such constraint based on a new regularization for the GAN training that forces the generated visual features to reconstruct their original semantic features. Once our model is trained with this multi-modal cycle-consistent semantic compatibility, we can then synthesize more representative visual representations for the seen and, more importantly, for the unseen classes. Our proposed approach shows the best GZSL classification results in the field in several publicly available datasets.
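The cycle-consistency constraint described above can be sketched as a reconstruction penalty: map a semantic vector to a visual feature, map it back, and measure how far the result is from the starting point. The squared L2 distance, the linear toy maps and the omission of the generator's noise input are simplifying assumptions here, not details taken from the paper.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cycle_consistency_loss(semantic, generator, regressor):
    """Squared L2 distance between a semantic vector and its reconstruction."""
    visual = generator(semantic)          # semantic -> synthetic visual feature
    reconstructed = regressor(visual)     # visual feature -> semantic space
    return sum((r - s) ** 2 for r, s in zip(reconstructed, semantic))


# Hypothetical linear maps standing in for trained networks.
G = [[2.0, 0.0], [0.0, 4.0]]
R_good = [[0.5, 0.0], [0.0, 0.25]]   # exact inverse of G
R_bad = [[1.0, 0.0], [0.0, 1.0]]     # identity, does not undo G

a = [1.0, 2.0]
good = cycle_consistency_loss(a, lambda s: matvec(G, s), lambda x: matvec(R_good, x))
bad = cycle_consistency_loss(a, lambda s: matvec(G, s), lambda x: matvec(R_bad, x))
```

In this toy setup the penalty vanishes only when the synthetic feature still carries enough information to recover its semantic vector, which is the property the regularization is meant to enforce.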

5 Gradient Matching Generative Networks for Zero-Shot Learning

Zero-shot learning (ZSL) is one of the most promising problems where substantial progress can potentially be achieved through unsupervised learning, due to distributional differences between supervised and zero-shot classes. For this reason, several works investigate the incorporation of discriminative domain adaptation techniques into ZSL, which, however, lead to modest improvements in ZSL accuracy. In contrast, we propose a generative model that can naturally learn from unsupervised examples, and synthesize training examples for unseen classes purely based on their class embeddings, and therefore, reduce the zero-shot learning problem into a supervised classification task. The proposed approach consists of two important components: (i) a conditional Generative Adversarial Network that learns to produce samples that mimic the characteristics of unsupervised data examples, and (ii) the Gradient Matching (GM) loss that measures the quality of the gradient signal obtained from the synthesized examples. Using our GM loss formulation, we enforce the generator to produce examples from which accurate classifiers can be trained. Experimental results on several ZSL benchmark datasets show that our approach leads to significant improvements over the state of the art in generalized zero-shot classification.
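One way to read "quality of the gradient signal" is to compare the classifier gradient computed on synthesized examples with the gradient computed on real ones. The sketch below assumes a cosine-distance form for that comparison and uses a toy linear model with squared error; both choices, and all names, are illustrative assumptions rather than the paper's exact formulation.

```python
import math


def mse_gradient(weights, batch):
    """Gradient of the mean squared error of a linear model over (x, y) pairs."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for k, xi in enumerate(x):
            grad[k] += 2.0 * err * xi / len(batch)
    return grad


def gradient_matching_loss(g_real, g_syn):
    """Cosine distance: near 0 for aligned gradients, near 2 for opposite ones."""
    dot = sum(a * b for a, b in zip(g_real, g_syn))
    norm = math.sqrt(sum(a * a for a in g_real)) * math.sqrt(sum(b * b for b in g_syn))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero gradient")
    return 1.0 - dot / norm


# Hypothetical batches of (features, target) pairs.
w = [0.0, 0.0]
real = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
aligned = [([2.0, 0.0], 2.0), ([0.0, 2.0], 2.0)]
flipped = [([1.0, 0.0], -1.0), ([0.0, 1.0], -1.0)]

loss_aligned = gradient_matching_loss(mse_gradient(w, real), mse_gradient(w, aligned))
loss_flipped = gradient_matching_loss(mse_gradient(w, real), mse_gradient(w, flipped))
```

A synthesized batch that moves the classifier in the same direction as real data gets a loss near zero regardless of gradient magnitude, while a batch that would push the classifier the wrong way is penalized.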

6 EZSL-GAN: EEG-based Zero-Shot Learning approach using a Generative Adversarial Network

Recent studies show that deep neural networks can be effective for learning EEG-based classification networks. In particular, Recurrent Neural Networks (RNN) show competitive performance in learning the sequential information of EEG signals. However, none of the previous approaches considers recognizing unknown EEG signals which have never been seen in the training dataset. In this paper, we first propose a new scheme for Zero-Shot EEG signal classification. Our EZSL-GAN has three parts. The first part is an EEG encoder network that generates 128-dimensional EEG features using a Gated Recurrent Unit (GRU). The second part is a Generative Adversarial Network (GAN) that can tackle the problem of recognizing unknown EEG labels with a knowledge base. The third part is a simple classification network that learns unseen EEG signals from the fake EEG features generated by the learned Generator. We evaluate our method on an EEG dataset evoked by visual object stimuli from 40 classes. The experimental results show that our EEG encoder achieves an accuracy of 95.89%. Furthermore, our Zero-Shot EEG classification method reached an accuracy of 39.65% for the ten untrained EEG classes. Our experiments demonstrate that unseen EEG labels can be recognized by the knowledge base.

7 SR-GAN: Semantic Rectifying Generative Adversarial Network for Zero-Shot Learning

The existing Zero-Shot learning (ZSL) methods may suffer from vague class attributes that are highly overlapped for different classes. Unlike these methods that ignore the discrimination among classes, in this paper, we propose to classify unseen images by rectifying the semantic space guided by the visual space. First, we pre-train a Semantic Rectifying Network (SRN) to rectify the semantic space with a semantic loss and a rectifying loss. Then, a Semantic Rectifying Generative Adversarial Network (SR-GAN) is built to generate plausible visual features of unseen classes from both the semantic feature and the rectified semantic feature. To guarantee the effectiveness of rectified semantic features and synthetic visual features, pre-reconstruction and post-reconstruction networks are proposed, which keep the consistency between visual feature and semantic feature. Experimental results demonstrate that our approach significantly outperforms the state of the art on four benchmark datasets.

8 Visual Data Synthesis via GAN for Zero-Shot Video Classification

Zero-Shot Learning (ZSL) in video classification is a promising research direction, which aims to tackle the challenge from explosive growth of video categories. Most existing methods exploit seen-to-unseen correlation via learning a projection between visual and semantic spaces. However, such projection-based paradigms cannot fully utilize the discriminative information implied in data distribution, and commonly suffer from the information degradation issue caused by “heterogeneity gap”. In this paper, we propose a visual data synthesis framework via GAN to address these problems. Specifically, both semantic knowledge and visual distribution are leveraged to synthesize video feature of unseen categories, and ZSL can be turned into typical supervised problem with the synthetic features. First, we propose multi-level semantic inference to boost video feature synthesis, which captures the discriminative information implied in joint visual-semantic distribution via feature-level and label-level semantic inference. Second, we propose Matching-aware Mutual Information Correlation to overcome information degradation issue, which captures seen-to-unseen correlation in matched and mismatched visual-semantic pairs by mutual information, providing the zero-shot synthesis procedure with robust guidance signals. Experimental results on four video datasets demonstrate that our approach can improve the zero-shot video classification performance significantly.

9 VHEGAN: Variational Hetero-Encoder Randomized GAN for Zero-Shot Learning

To extract and relate visual and linguistic concepts from images and textual descriptions for text-based zero-shot learning (ZSL), we develop a variational heteroencoder (VHE) that decodes text via a deep probabilistic topic model, the variational posterior of whose local latent variables is encoded from an image via a Weibull distribution based inference network. To further improve VHE and add an image generator, we propose VHE randomized generative adversarial net (VHEGAN) that exploits the synergy between VHE and GAN through their shared latent space. After training with a hybrid stochastic-gradient MCMC/variational inference/stochastic gradient descent inference algorithm, VHEGAN can be used in a variety of settings, such as text generation/retrieval conditioning on an image, image generation/retrieval conditioning on a document/image, and generation of text-image pairs. The efficacy of VHEGAN is demonstrated quantitatively with experiments on both conventional and generalized ZSL tasks, and qualitatively on (conditional) image and/or text generation/retrieval.
