Coursera, Deep Learning 5, Sequence Models, week3, Sequence models & Attention mechanism
Sequence to Sequence models
basic sequence-to-sequence model:
basic image-to-sequence or called image captioning model:

but there are some differences between how you write a model like this to generate a sequence, compared to how you were synthesizing novel text using a language model. One of the key differences is,you don't want a randomly chosen translation,you maybe want the most likely translation,or you don't want a randomly chosen caption, maybe not,but you might want the best caption and most likely caption.So let's see in the next video how you go about generating that.
Picking the most likely sentence

找出最大可能性的P(y|x),最常用的算法是beam search.

在介绍 beam search 之前,先了解一下 greedy search 已经为什么不用 greedy search?
greedy search 的意思是,在已知一个值word的情况下,求下一个值word的最可能的情况,以此类推。。。 下图是一个很好的例子说明 greedy search 不适用的情况, 就不如求核能的 y^ 的组合的概率 p(y^1, y^2, ...|x) 然后找出最大概率,当然这样也有问题,就是比如说 10 个word 的输出,在一个 10,000 大的corpus 里就有 10,000 10 种组合情况,需要诉诸于更好的算法,且继续往下看

Coursera, Deep Learning 5, Sequence Models, week3, Sequence models & Attention mechanism的更多相关文章
- Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Regularization)
声明:所有内容来自coursera,作为个人学习笔记记录在这里. Regularization Welcome to the second assignment of this week. Deep ...
- Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week2, Optimization algorithms
Gradient descent Batch Gradient Decent, Mini-batch gradient descent, Stochastic gradient descent 还有很 ...
- Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Gradient Checking)
声明:所有内容来自coursera,作为个人学习笔记记录在这里. Gradient Checking Welcome to the final assignment for this week! In ...
- Coursera, Deep Learning 4, Convolutional Neural Networks - week4,
Face recognition One Shot Learning 只看一次图片,就能以后识别, 传统deep learning 很难做到这个. 而且如果要加一个人到数据库里面,就要重新train ...
- Coursera, Deep Learning 1, Neural Networks and Deep Learning - week1, Introduction to deep learning
整个deep learing 系列课程主要包括哪些内容 Intro to Deep learning
- Coursera, Deep Learning 4, Convolutional Neural Networks - week1
CNN 主要解决 computer vision 问题,同时解决input X 维度太大的问题. Edge detection 下面演示了convolution 的概念 下图的 vertical ed ...
- Coursera Deep Learning笔记 逻辑回归典型的训练过程
Deep Learning 用逻辑回归训练图片的典型步骤. 笔记摘自:https://xienaoban.github.io/posts/59595.html 1. 处理数据 1.1 向量化(Vect ...
- Deep Learning基础--理解LSTM/RNN中的Attention机制
导读 目前采用编码器-解码器 (Encode-Decode) 结构的模型非常热门,是因为它在许多领域较其他的传统模型方法都取得了更好的结果.这种结构的模型通常将输入序列编码成一个固定长度的向量表示,对 ...
- Coursera, Deep Learning 5, Sequence Models, week1 Recurrent Neural Networks
有哪些sequence model Notation: RNN - Recurrent Neural Network 传统NN 在解决sequence input 时有什么问题? RNN就没有上面的问 ...
随机推荐
- poj3630 Phone List
spy on一下,发现是trie裸题,结果一交就T... 然后把cin改成scanf就A了... trie的空间一定要开足,要不然RE #include <cstdio> #include ...
- 洛谷 P2158 仪仗队
欧拉函数入门题... 当然如果有兴趣也可以用反演做...类似这题 题意就是求,方阵从左下角出发能看到多少个点. 从0开始给坐标 发现一个点能被看到,那么横纵坐标互质. 然后求欧拉函数的前缀和,* 2 ...
- 洛谷P3674 小清新人渣的本愿
题意:多次询问,区间内是否存在两个数,使得它们的和为x,差为x,积为x. n,m,V <= 100000 解: 毒瘤bitset...... 假如我们有询问区间的一个桶,那么我们就可以做到O(n ...
- 【BZOJ3289】Mato的文件管理 莫队+树状数组
题目大意:给定一个长度为 N 的序列,M 个询问,每次询问区间逆序对的个数. 题解:用树状数组加速答案转移. 代码如下 #include <bits/stdc++.h> #define f ...
- plink: 等位型计数(allele count)
对genotype的等位型进行计数,需要用到以下参数: --freq Allele frequencies--counts Modifies --freq to report actual allel ...
- Mybatis非mapper代理配置
转: Mybatis非mapper代理配置 2017年04月26日 20:13:48 待长的小蘑菇 阅读数:870 版权声明:本文为博主原创文章,未经博主允许不得转载. https://blog. ...
- android studio adb.exe已停止工作(全面成功版 进程的查询和开启)
先输入adb看是否存在. 如果不存在则:在系统path里添加C:\Users\nubia\AppData\Local\Android\sdk\platform-tools 因为这个目录里有adb 或者 ...
- 【清北学堂2018-刷题冲刺】Contest 1
Task 1:最小公倍数 输入n,求n与246913578的最小公倍数. 结果对1234567890取模. [样例输入] 3 [样例输出] 246913578 [数据规模和约定] 对于30%的数据 ...
- 【C#】C#获取文件夹下的所有文件
#基础知识 1.获得当前运行程序的路径 string rootPath = Directory.GetCurrentDirectory(); 2.获得该文件夹下的文件,返回类型为FileInfo st ...
- web.xml 文件头
Servlet 2.3 <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN ...
