【Speeding up the softmax classifier】

https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss

This is a faster way to train a softmax classifier over a huge number of classes.
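A minimal sketch of how this loss is typically wired in (assuming TensorFlow 2; the sizes and variable names below are illustrative, not taken from the docs). The speedup comes from scoring only `num_sampled` negative classes per batch instead of all `num_classes`; evaluation still uses the full softmax.

```python
import tensorflow as tf

# Illustrative sizes only.
num_classes = 1_000_000   # size of the output vocabulary
embed_dim = 128           # dimensionality of the context vector fed to the classifier
num_sampled = 64          # number of sampled (negative) classes per batch

# Output projection shared by training and inference.
softmax_weights = tf.Variable(tf.random.normal([num_classes, embed_dim], stddev=0.05))
softmax_biases = tf.Variable(tf.zeros([num_classes]))

def training_loss(context_vectors, target_ids):
    """Sampled softmax loss; target_ids must have shape [batch_size, 1]."""
    per_example = tf.nn.sampled_softmax_loss(
        weights=softmax_weights,
        biases=softmax_biases,
        labels=target_ids,
        inputs=context_vectors,
        num_sampled=num_sampled,
        num_classes=num_classes)
    return tf.reduce_mean(per_example)

def eval_logits(context_vectors):
    """Evaluation/inference still scores all classes with the full softmax."""
    return tf.matmul(context_vectors, softmax_weights, transpose_b=True) + softmax_biases
```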

【When the set of output classes is too large, sample a subset】

https://www.tensorflow.org/api_guides/python/nn#Candidate_Sampling

Do you want to train a multiclass or multilabel model with thousands or millions of output classes (for example, a language model with a large vocabulary)? Training with a full Softmax is slow in this case, since all of the classes are evaluated for every training example. Candidate Sampling training algorithms can speed up your step times by only considering a small randomly-chosen subset of contrastive classes (called candidates) for each batch of training examples.
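As a concrete sketch of what a "small randomly-chosen subset of contrastive classes" looks like, one of TensorFlow's built-in candidate samplers can be called directly (the batch size and class counts below are illustrative assumptions):

```python
import tensorflow as tf

batch_size, num_classes, num_sampled = 32, 1_000_000, 64  # illustrative sizes

# True target class ids for the batch, shape [batch_size, num_true].
true_classes = tf.random.uniform(
    [batch_size, 1], maxval=num_classes, dtype=tf.int64)

# Draw candidates under a log-uniform (Zipfian) prior, which assumes class ids
# are ordered roughly by decreasing frequency, as in a sorted vocabulary.
sampled_ids, true_expected, sampled_expected = tf.random.log_uniform_candidate_sampler(
    true_classes=true_classes,
    num_true=1,
    num_sampled=num_sampled,
    unique=True,
    range_max=num_classes)

# sampled_ids: the 64 candidate classes shared by the whole batch; the
# expected-count tensors are the correction terms the sampled losses use
# to keep the estimate approximately unbiased.
```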

https://www.tensorflow.org/extras/candidate_sampling.pdf

【compute F(x, y) for every class y ∈ L for every training example: this is the bottleneck, the problem to be solved】

What is Candidate Sampling? Say we have a multiclass or multi-label problem where each training example (x_i, T_i) consists of a context x_i and a small (multi)set of target classes T_i out of a large universe L of possible classes. For example, the problem might be to predict the next word (or the set of future words) in a sentence given the previous words.

We wish to learn a compatibility function F(x, y) which says something about the compatibility of a class y with a context x. For example, the probability of the class given the context.

“Exhaustive” training methods such as softmax and logistic regression require us to compute F(x, y) for every class y ∈ L for every training example. When |L| is very large, this can be prohibitively expensive.
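In symbols, with the definitions above, an exhaustive softmax turns F into a distribution over all of L:

P(y | x) = exp(F(x, y)) / Σ_{y' ∈ L} exp(F(x, y'))

The normalizing sum (and its gradient) touches every class in L, so each training step costs time proportional to |L|; candidate sampling, roughly speaking, replaces that sum with one over the true classes plus a small sampled candidate set.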

【handling a very large target vocabulary by selecting only a small subset of the whole target vocabulary】

https://arxiv.org/pdf/1412.2007.pdf

Neural machine translation, a recently proposed approach to machine translation based purely on neural networks, has shown promising results compared to the existing approaches such as phrase-based statistical machine translation. Despite its recent success, neural machine translation has its limitation in handling a larger vocabulary, as training complexity as well as decoding complexity increase proportionally to the number of target words. In this paper, we propose a method that allows us to use a very large target vocabulary without increasing training complexity, based on importance sampling. We show that decoding can be efficiently done even with the model having a very large target vocabulary by selecting only a small subset of the whole target vocabulary. The models trained by the proposed approach are empirically found to outperform the baseline models with a small vocabulary as well as the LSTM-based neural machine translation models. Furthermore, when we use the ensemble of a few models with very large target vocabularies, we achieve the state-of-the-art translation performance (measured by BLEU) on the English->German translation and almost as high performance as state-of-the-art English->French translation system.
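A rough sketch of the decoding-time idea described in the abstract: restrict the softmax to a per-sentence candidate vocabulary instead of all target words. Everything here (the lexicon, the shortlist, the array shapes) is an illustrative assumption, not the paper's actual implementation:

```python
import numpy as np

vocab_size, hidden_dim = 500_000, 1024           # illustrative sizes
W = 0.01 * np.random.randn(vocab_size, hidden_dim).astype(np.float32)
b = np.zeros(vocab_size, dtype=np.float32)

def candidate_vocabulary(source_ids, lexicon, shortlist_ids):
    """Per-sentence candidate set: a shortlist of frequent target words plus
    the lexicon translations of each source word (lexicon is an assumed
    dict mapping source id -> list of target ids)."""
    candidates = set(shortlist_ids)
    for s in source_ids:
        candidates.update(lexicon.get(s, ()))
    return np.fromiter(candidates, dtype=np.int64)

def restricted_softmax(decoder_state, candidate_ids):
    """Softmax normalized only over the candidates, not the full vocabulary."""
    logits = W[candidate_ids] @ decoder_state + b[candidate_ids]
    logits -= logits.max()                        # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()
```

At each decoding step the model then scores only the candidate ids, which is what keeps decoding cost independent of the full vocabulary size.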
