TensorFlow -- BeamSearch
GitHub: https://github.com/zle1992/Seq2Seq-Chatbot
1. At the inference stage, the variable scopes must be opened with reuse=True so that the inference graph shares the variables created for training.
2. If you are using the BeamSearchDecoder with a cell wrapped in AttentionWrapper, then you must ensure that:
- The encoder output has been tiled to beam_width via tf.contrib.seq2seq.tile_batch (NOT tf.tile).
- The batch_size argument passed to the zero_state method of this wrapper is equal to true_batch_size * beam_width.
- The initial state created with zero_state above contains a cell_state value containing the properly tiled final state from the encoder.
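The tiling requirement in the list above comes down to row order. tile_batch repeats each batch entry in place (t[0], t[0], ..., t[1], t[1], ...), whereas tf.tile along the batch axis repeats the whole batch. The sketch below imitates both orderings with plain Python lists instead of tensors. The helper names tile_batch_like, tile_like and group_by_beam are made up for illustration and are not TensorFlow APIs; group_by_beam stands in for a reshape from [batch * beam] to [batch, beam].

```python
def tile_batch_like(batch, multiplier):
    """Mimic the row order of tile_batch: each entry repeated in place."""
    return [row for row in batch for _ in range(multiplier)]


def tile_like(batch, multiplier):
    """Mimic the row order of tf.tile on axis 0: the whole batch repeated."""
    return list(batch) * multiplier


def group_by_beam(rows, beam_width):
    """Split a flat [batch * beam] list into [batch][beam], as a reshape would."""
    return [rows[i:i + beam_width] for i in range(0, len(rows), beam_width)]


batch = ["enc_a", "enc_b", "enc_c"]  # stand-ins for per-example encoder outputs
beam_width = 2

good = group_by_beam(tile_batch_like(batch, beam_width), beam_width)
bad = group_by_beam(tile_like(batch, beam_width), beam_width)

print(good)  # every group holds copies of a single example
print(bad)   # groups mix different examples
```

With the tile_batch ordering, all beams of one example attend over that example's own encoder output. With the tf.tile ordering, the groups mix encoder outputs from different examples. The tiled list has len(batch) * beam_width rows, which is the same true_batch_size * beam_width that zero_state expects.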
# Requires TensorFlow 1.x: tf.contrib and tf.placeholder were removed in 2.x.
import tensorflow as tf
from tensorflow.python.layers.core import Dense

BEAM_WIDTH = 5
BATCH_SIZE = 128

# INPUTS
X = tf.placeholder(tf.int32, [BATCH_SIZE, None])
Y = tf.placeholder(tf.int32, [BATCH_SIZE, None])
X_seq_len = tf.placeholder(tf.int32, [BATCH_SIZE])
Y_seq_len = tf.placeholder(tf.int32, [BATCH_SIZE])

# ENCODER
encoder_out, encoder_state = tf.nn.dynamic_rnn(
    cell = tf.nn.rnn_cell.BasicLSTMCell(128),
    inputs = tf.contrib.layers.embed_sequence(X, 10000, 128),
    sequence_length = X_seq_len,
    dtype = tf.float32)

# DECODER COMPONENTS
Y_vocab_size = 10000
decoder_embedding = tf.Variable(tf.random_uniform([Y_vocab_size, 128], -1.0, 1.0))
projection_layer = Dense(Y_vocab_size)

# ATTENTION (TRAINING)
with tf.variable_scope('shared_attention_mechanism'):
    attention_mechanism = tf.contrib.seq2seq.LuongAttention(
        num_units = 128,
        memory = encoder_out,
        memory_sequence_length = X_seq_len)

decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    cell = tf.nn.rnn_cell.BasicLSTMCell(128),
    attention_mechanism = attention_mechanism,
    attention_layer_size = 128)

# DECODER (TRAINING)
training_helper = tf.contrib.seq2seq.TrainingHelper(
    inputs = tf.nn.embedding_lookup(decoder_embedding, Y),
    sequence_length = Y_seq_len,
    time_major = False)
training_decoder = tf.contrib.seq2seq.BasicDecoder(
    cell = decoder_cell,
    helper = training_helper,
    initial_state = decoder_cell.zero_state(BATCH_SIZE, tf.float32).clone(cell_state=encoder_state),
    output_layer = projection_layer)
with tf.variable_scope('decode_with_shared_attention'):
    training_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder = training_decoder,
        impute_finished = True,
        maximum_iterations = tf.reduce_max(Y_seq_len))
training_logits = training_decoder_output.rnn_output

# BEAM SEARCH TILE
# Tile with tile_batch (not tf.tile) so each example's copies stay adjacent.
encoder_out = tf.contrib.seq2seq.tile_batch(encoder_out, multiplier=BEAM_WIDTH)
X_seq_len = tf.contrib.seq2seq.tile_batch(X_seq_len, multiplier=BEAM_WIDTH)
encoder_state = tf.contrib.seq2seq.tile_batch(encoder_state, multiplier=BEAM_WIDTH)

# ATTENTION (PREDICTING)
# reuse=True shares the attention variables created for training.
with tf.variable_scope('shared_attention_mechanism', reuse=True):
    attention_mechanism = tf.contrib.seq2seq.LuongAttention(
        num_units = 128,
        memory = encoder_out,
        memory_sequence_length = X_seq_len)

decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    cell = tf.nn.rnn_cell.BasicLSTMCell(128),
    attention_mechanism = attention_mechanism,
    attention_layer_size = 128)

# DECODER (PREDICTING)
predicting_decoder = tf.contrib.seq2seq.BeamSearchDecoder(
    cell = decoder_cell,
    embedding = decoder_embedding,
    # start_tokens uses the true batch size; the decoder expands it per beam.
    start_tokens = tf.tile(tf.constant([1], dtype=tf.int32), [BATCH_SIZE]),
    end_token = 2,
    # zero_state takes the tiled batch size: true_batch_size * beam_width.
    initial_state = decoder_cell.zero_state(BATCH_SIZE * BEAM_WIDTH, tf.float32).clone(cell_state=encoder_state),
    beam_width = BEAM_WIDTH,
    output_layer = projection_layer,
    length_penalty_weight = 0.0)
with tf.variable_scope('decode_with_shared_attention', reuse=True):
    predicting_decoder_output, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder = predicting_decoder,
        impute_finished = False,
        maximum_iterations = 2 * tf.reduce_max(Y_seq_len))
# predicted_ids has shape [batch, time, beam], with beams ordered best first.
# Despite the variable name, these are token ids of the best beam, not logits.
predicting_logits = predicting_decoder_output.predicted_ids[:, :, 0]

print('successful')
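The listing above delegates the search itself to BeamSearchDecoder. To show what that search does, here is a minimal pure-Python sketch of beam search over a toy next-token model. It is not TensorFlow's implementation: beam_search, toy_model and the probabilities are invented for the example. Only the start token 1 and end token 2 come from the listing. Scores are plain sums of log-probabilities with no length normalization, which is assumed to correspond to length_penalty_weight = 0.0.

```python
import math


def beam_search(step_log_probs, beam_width, start_token, end_token, max_steps):
    """Return up to `beam_width` (tokens, score) pairs, best first.

    `step_log_probs(prefix)` is a hypothetical callback that returns
    {token: log_prob} for the next token given the tokens so far.
    """
    beams = [([start_token], 0.0)]
    for _ in range(max_steps):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == end_token:
                # Finished hypotheses are carried forward unchanged.
                candidates.append((tokens, score))
                continue
            for token, log_prob in step_log_probs(tokens).items():
                candidates.append((tokens + [token], score + log_prob))
        # Keep only the `beam_width` highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens[-1] == end_token for tokens, _ in beams):
            break
    return beams


def toy_model(prefix):
    """Toy distribution in which the greedy first choice is a trap."""
    table = {
        (1,): {3: 0.6, 4: 0.4},
        (1, 3): {2: 0.55, 4: 0.45},
        (1, 4): {2: 0.9, 3: 0.1},
    }
    probs = table.get(tuple(prefix), {2: 1.0})  # otherwise always emit the end token
    return {token: math.log(p) for token, p in probs.items()}


greedy = beam_search(toy_model, beam_width=1, start_token=1, end_token=2, max_steps=5)
beam = beam_search(toy_model, beam_width=2, start_token=1, end_token=2, max_steps=5)

print(greedy[0][0])  # [1, 3, 2], probability 0.6 * 0.55 = 0.33
print(beam[0][0])    # [1, 4, 2], probability 0.4 * 0.9 = 0.36
```

With a beam width of 1 the search is greedy and commits to token 3. With a width of 2 it keeps the lower-scoring prefix [1, 4] alive long enough to find the sequence with the higher total probability. Taking index 0 of the result mirrors taking beam 0 of predicted_ids.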
References:
https://gist.github.com/higepon/eb81ba0f6663a57ff1908442ce753084
https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BeamSearchDecoder
https://github.com/tensorflow/nmt#beam-search