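TensorFlow odds and ends

A place to record details of functions I have run into along the way.

1. tf.flags.DEFINE_xxx()

I see this all the time when reading other people's code, yet after two or three days away I had forgotten it again. My brain must be rusting, so I am writing it down to avoid looking it up over and over.
This note covers the following pieces, which usually appear together:
① tf.flags.DEFINE_xxx()
② FLAGS = tf.flags.FLAGS
③ FLAGS._parse_flags()

In short:

They let us add optional command-line arguments to a program.
With them, the parameters a run needs can be chosen on the command line,
so there is no need to keep editing the values in the source code.

For example:

A small part of the code in train.py looks like this: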

FLAGS = tf.flags.FLAGS

tf.flags.DEFINE_string('name', 'default', 'name of the model')

tf.flags.DEFINE_integer('num_seqs', 100, 'number of seqs in one batch')

tf.flags.DEFINE_integer('num_steps', 100, 'length of one seq')

tf.flags.DEFINE_integer('lstm_size', 128, 'size of hidden state of lstm')

tf.flags.DEFINE_integer('num_layers', 2, 'number of lstm layers')

tf.flags.DEFINE_boolean('use_embedding', False, 'whether to use embedding')

tf.flags.DEFINE_integer('embedding_size', 128, 'size of embedding')

tf.flags.DEFINE_float('learning_rate', 0.001, 'learning_rate')

tf.flags.DEFINE_float('train_keep_prob', 0.5, 'dropout rate during training')

tf.flags.DEFINE_string('input_file', '', 'utf8 encoded text file')

tf.flags.DEFINE_integer('max_steps', 100000, 'max steps to train')

tf.flags.DEFINE_integer('save_every_n', 1000, 'save the model every n steps')

tf.flags.DEFINE_integer('log_every_n', 10, 'log to the screen every n steps')

tf.flags.DEFINE_integer('max_vocab', 3500, 'max char number')

# Global parameter settings, exposed on the command line

To run train.py, enter the following on the command line:

python train.py \
  --input_file data/shakespeare.txt \
  --name shakespeare \
  --num_steps 50 \
  --num_seqs 32 \
  --learning_rate 0.01 \
  --max_steps 20000

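By passing different file names and parameter values, you can tune hyperparameters or switch training sets quickly, without going into the source code.

Note: thanks to the author of the code above.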
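
As a rough analogy, the same define-then-parse pattern can be sketched with Python's standard argparse module. This is not how tf.flags is implemented; it only illustrates the idea with three of the flags above, and the argument list passed to parse_args is made up for the illustration.

```python
import argparse

# Define a few of the flags from train.py with argparse instead of tf.flags.
parser = argparse.ArgumentParser()
parser.add_argument('--name', type=str, default='default', help='name of the model')
parser.add_argument('--num_steps', type=int, default=100, help='length of one seq')
parser.add_argument('--learning_rate', type=float, default=0.001, help='learning_rate')

# Parse an explicit argument list (hypothetical values) rather than sys.argv,
# so the snippet can be run as-is.
args = parser.parse_args(['--name', 'shakespeare', '--num_steps', '50'])

print(args.name)           # value given in the argument list
print(args.num_steps)      # value given in the argument list, converted to int
print(args.learning_rate)  # not given, so the default is used
```

Flags that are not passed keep their defaults, which is the behaviour the train.py command relies on for num_layers, lstm_size and the rest.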

Hands-on:

Suppose we have the following code:

import tensorflow as tf

# Take a subset of the flags above for the experiment

tf.flags.DEFINE_integer('num_seqs', 100, 'number of seqs in one batch')

tf.flags.DEFINE_integer('num_steps', 100, 'length of one seq')

tf.flags.DEFINE_integer('lstm_size', 128, 'size of hidden state of lstm')

# Use print() to find out what the following lines do

FLAGS = tf.flags.FLAGS  # FLAGS holds the command-line argument values

FLAGS._parse_flags()  # parse the arguments into a dict stored in FLAGS.__flags

print(FLAGS.__flags)

print(FLAGS.num_seqs)

print("\nParameters:")

for attr, value in sorted(FLAGS.__flags.items()):

    print("{}={}".format(attr.upper(), value))

print("")

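Try running the code above to see what each line does. Depending on the TensorFlow version it may raise an error.
In TensorFlow 1.5 and later, where tf.flags is backed by absl, FLAGS._parse_flags() is no longer available; a commonly used workaround is to read the values with FLAGS.flag_values_dict() instead.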

2. tf.contrib.learn.preprocessing.VocabularyProcessor

tf.contrib.learn.preprocessing.VocabularyProcessor(max_document_length, min_frequency=0, vocabulary=None, tokenizer_fn=None)

Parameters:

max_document_length: maximum document length. Texts longer than this are truncated; shorter ones are padded with 0.
min_frequency: minimum word frequency. Words that occur fewer times than this are not added to the vocabulary.
vocabulary: a CategoricalVocabulary object.
tokenizer_fn: the tokenizer function.

Code:

from tensorflow.contrib import learn

import numpy as np

max_document_length = 4

x_text =[

    'i love you',

    'me too'

]

vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)

vocab_processor.fit(x_text)

print(next(vocab_processor.transform(['i me too'])).tolist())

x = np.array(list(vocab_processor.fit_transform(x_text)))

print(x)

Result:

[1, 4, 5, 0]

[[1 2 3 0]

 [4 5 0 0]]
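
The mapping behind this result can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: it assumes ids are assigned in order of first appearance, that 0 is reserved for padding and unknown words, and that tokens are split on whitespace. build_vocab and to_ids are hypothetical helper names.

```python
def build_vocab(texts):
    """Assign ids 1, 2, 3, ... to words in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1  # id 0 is kept for padding/unknown
    return vocab


def to_ids(text, vocab, max_document_length):
    """Map words to ids, truncate to max_document_length, pad with 0."""
    ids = [vocab.get(word, 0) for word in text.split()]
    ids = ids[:max_document_length]
    return ids + [0] * (max_document_length - len(ids))


x_text = ['i love you', 'me too']
vocab = build_vocab(x_text)

print(to_ids('i me too', vocab, 4))            # [1, 4, 5, 0]
print([to_ids(t, vocab, 4) for t in x_text])   # [[1, 2, 3, 0], [4, 5, 0, 0]]
```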

3. Usage of embedding_lookup()

Example 1

#!/usr/bin/env python

# coding=utf-8

import tensorflow as tf

import numpy as np

input_ids = tf.placeholder(dtype=tf.int32, shape=[None])

embedding = tf.Variable(np.identity(5, dtype=np.int32))

input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

sess = tf.InteractiveSession()

sess.run(tf.global_variables_initializer())

print(embedding.eval())

print(sess.run(input_embedding, feed_dict={input_ids:[1, 2, 3, 0, 3, 2, 1]}))

Output:

embedding = [[1 0 0 0 0]

             [0 1 0 0 0]

             [0 0 1 0 0]

             [0 0 0 1 0]

             [0 0 0 0 1]]

input_embedding = [[0 1 0 0 0]

                   [0 0 1 0 0]

                   [0 0 0 1 0]

                   [1 0 0 0 0]

                   [0 0 0 1 0]

                   [0 0 1 0 0]

                   [0 1 0 0 0]]

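Put simply, the ids in input_ids are used to look up the corresponding rows of embedding. For example, with input_ids=[1, 3, 4], the rows of embedding at indices 1, 3 and 4 are gathered into a matrix and returned.

If input_ids is fed a 2-D list instead (the placeholder must then be declared with a 2-D shape):

input_ids = tf.placeholder(dtype=tf.int32, shape=[None, None])

input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

print(sess.run(input_embedding, feed_dict={input_ids:[[1, 2], [2, 1], [3, 3]]}))

the output takes the following form: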

[[[0 1 0 0 0]

  [0 0 1 0 0]]

 [[0 0 1 0 0]

  [0 1 0 0 0]]

 [[0 0 0 1 0]

  [0 0 0 1 0]]]

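Comparing the two results shows that this is equivalent to indexing an np.array with an array of indices. One detail to note: the dtype of the returned tensor matches the dtype of the tensor being looked up, and does not depend on the dtype of ids.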
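
The NumPy indexing that the lookup is being compared to can be tried directly. The sketch below reuses the identity matrix and the ids from Example 1; it is plain NumPy, not TensorFlow, and is only meant to show the resulting shapes and dtype.

```python
import numpy as np

embedding = np.identity(5, dtype=np.int32)

# 1-D ids: one row of `embedding` per id, result shape (7, 5)
ids_1d = np.array([1, 2, 3, 0, 3, 2, 1])
out_1d = embedding[ids_1d]

# 2-D ids: the shape of the id array is kept and a row is attached to each id,
# result shape (3, 2, 5)
ids_2d = np.array([[1, 2], [2, 1], [3, 3]])
out_2d = embedding[ids_2d]

print(out_1d.shape, out_2d.shape)
print(out_2d.dtype)  # taken from `embedding`, not from the ids
```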

Example 2

import tensorflow as tf

import numpy as np

input_ids = tf.placeholder(dtype=tf.int32, shape=[None])

embedding = np.asarray([[1, 2, 3], [4,5,6], [7,8,9], [11,22,33], [111,222,333]])

input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

sess = tf.InteractiveSession()

sess.run(tf.global_variables_initializer())

print(sess.run(input_embedding, feed_dict={input_ids: [1, 2, 3, 0, 3, 2, 1]}))

print(input_embedding.shape)

Output:

[[ 4  5  6]

 [ 7  8  9]

 [11 22 33]

 [ 1  2  3]

 [11 22 33]

 [ 7  8  9]

 [ 4  5  6]]

(?, 3)

Example 3

input_ids = tf.placeholder(dtype=tf.int32, shape=[None,4])

# embedding = tf.Variable(np.identity(5, dtype=np.int32))

embedding = np.asarray([[1, 2, 3], [4,5,6], [7,8,9], [11,22,33], [111,222,333]])

input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

sess = tf.InteractiveSession()

sess.run(tf.global_variables_initializer())

# print(embedding.eval())

print(sess.run(input_embedding, feed_dict={input_ids: [[1,2,3,4],[2,3,4,1]]}))

print(input_embedding.shape)

Output:

[[[  4   5   6]

  [  7   8   9]

  [ 11  22  33]

  [111 222 333]]

 [[  7   8   9]

  [ 11  22  33]

  [111 222 333]

  [  4   5   6]]]

(?, 4, 3)

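Summary:

This is one way to initialize the input representation. A few points to note:

1. The output input_embedding does not carry the first dimension of embedding (its row count). The lookup uses the values in input_ids to select which rows of embedding to take, so the total number of rows does not appear in the output shape.

2. Because the values in input_ids are used as row indices into embedding, every value in input_ids must be smaller than the number of rows of embedding (valid indices run from 0 to rows - 1); otherwise the lookup fails. In other words, when sizing embedding, its row count must be larger than the largest value that will ever appear in input_ids.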

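4. np.r_ and np.c_

np.r_ stacks two matrices vertically, placing one on top of the other; their column counts must be equal.

np.c_ joins two matrices side by side, left to right; their row counts must be equal.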
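
A small check of both behaviours, using two made-up 2x3 matrices:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[7, 8, 9],
              [10, 11, 12]])

stacked = np.r_[a, b]   # b placed below a: shape (4, 3)
side = np.c_[a, b]      # b placed to the right of a: shape (2, 6)

print(stacked)
print(side)
```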

5. tf.greater(a,b)

tf.greater(a,b)
Purpose: compares a and b element-wise and returns the truth value of a > b.
For example, a=4, b=3 gives True; a=2, b=3 gives False.

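6. tf.where()

Usage of tf.where() in TensorFlow

where(condition, x=None, y=None, name=None)

condition, x and y have the same shape; condition holds boolean values (True/False).

1) Usage of where(condition)
condition holds boolean values (True/False).

The return value is the coordinates of the elements of condition that are True.

An example: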

import tensorflow as tf

a = [[1,2,3],[4,5,6]]

b = [[1,0,3],[1,5,1]]

condition1 = [[True,False,False],

             [False,True,True]]

condition2 = [[True,False,False],

             [False,True,False]]

with tf.Session() as sess:

    print(sess.run(tf.where(condition1)))

    print(sess.run(tf.where(condition2)))

Output:

[[0 0]

 [1 1]

 [1 2]]

[[0 0]

 [1 1]]

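2) Usage of where(condition, x=None, y=None, name=None)

condition, x and y have the same shape; condition holds boolean values (True/False).

The result is built element by element: where condition is True the element is taken from x, and where it is False the element is taken from the corresponding position in y.

x supplies values only for the True positions and y only for the False positions.

Because this is an element-wise selection, the result has the same shape as condition, x and y.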

import tensorflow as tf

x = [[1,2,3],[4,5,6]]

y = [[7,8,9],[10,11,12]]

condition3 = [[True,False,False],

             [False,True,True]]

condition4 = [[True,False,False],

             [True,True,False]]

with tf.Session() as sess:

    print(sess.run(tf.where(condition3,x,y)))

    print(sess.run(tf.where(condition4,x,y)))

Output:

[[ 1  8  9]

 [10  5  6]]

[[ 1  8  9]

 [ 4  5 12]]
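
Both forms can also be tried without a session in NumPy, where np.argwhere(condition) and np.where(condition, x, y) behave analogously for this case and np.greater plays the role of tf.greater from section 5. This is a NumPy sketch, not TensorFlow code; a and b are the arrays from the first example and x and y are those from the second.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[1, 0, 3], [1, 5, 1]])

# Element-wise comparison, the NumPy counterpart of tf.greater(a, b)
condition = np.greater(a, b)

# One argument: coordinates of the True entries, one [row, col] pair per line
indices = np.argwhere(condition)

# Three arguments: take from x where condition is True, from y elsewhere
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[7, 8, 9], [10, 11, 12]])
selected = np.where(condition, x, y)

print(condition)
print(indices)
print(selected)
```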

7. tf.train.exponential_decay

During neural-network training, the learning rate controls how fast the parameters are updated. The tf.train module offers five learning-rate decay methods:

tf.train.exponential_decay
tf.train.inverse_time_decay
tf.train.natural_exp_decay
tf.train.piecewise_constant
tf.train.polynomial_decay
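This note covers only exponential_decay. The tf.train.exponential_decay function implements an exponentially decaying learning rate.

Steps:
1. Start with a fairly large learning rate (goal: reach a reasonably good solution quickly);
2. Then reduce the learning rate gradually as training proceeds (goal: make the model more stable late in training);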

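tf.train.exponential_decay(

    learning_rate,  # initial learning rate

    global_step,  # current iteration count

    decay_steps,  # decay period: after this many steps the rate has decayed to learning_rate * decay_rate

    decay_rate,  # decay factor, usually between 0 and 1

    staircase=False,  # default False; when True, global_step / decay_steps is truncated to an integer, giving a stepwise decay

    name=None

)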

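The learning rate changes according to the following formula:

decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)

Intuition: suppose the initial learning_rate is 0.1, decay_rate is 0.1 and decay_steps is 10000.
As the iteration count goes from 1 to 10000, decayed_learning_rate falls gradually from 0.1 to 0.1*0.1=0.01;
by iteration 20000 it has fallen gradually from 0.01 to 0.1*0.1^2=0.001, and so on.
In other words, every 10000 iterations the learning rate drops to one tenth of its value 10000 iterations earlier. The decay is continuous; this is the behaviour with staircase=False.

With staircase=True, global_step / decay_steps is always truncated to an integer, so the decay is abrupt: the rate changes once every decay_steps iterations and the curve is a staircase.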
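
The formula is easy to evaluate by hand. The plain-Python function below is a sketch that mirrors the parameter names of tf.train.exponential_decay but is not the TensorFlow implementation; it only evaluates the formula for one step, using the values from the intuition above.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Evaluate the decay formula for a single step (plain Python)."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = global_step // decay_steps  # integer division: stepwise decay
    return learning_rate * decay_rate ** exponent


for step in (0, 5000, 10000, 20000):
    smooth = exponential_decay(0.1, step, 10000, 0.1)
    stepped = exponential_decay(0.1, step, 10000, 0.1, staircase=True)
    print(step, smooth, stepped)
```

At step 5000 the two variants differ: the continuous version has already decayed part of the way, while the staircase version still returns the initial rate.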

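8. tf.train.ExponentialMovingAverage()

The moving-average method uses a set of recent actual data values to forecast quantities such as a company's product demand or production capacity for one or more future periods. It is suited to short-term forecasting. When demand is neither growing nor falling quickly and has no seasonal component, a moving average effectively removes random fluctuations from the forecast. Moving-average methods differ in the weights assigned to the elements used in the forecast.

The moving average is a simple smoothing technique for forecasting. Its basic idea is to move through a time series item by item, computing the average over a fixed number of consecutive terms, so as to reveal the long-term trend. When a series fluctuates strongly because of cyclical and random variation and the underlying trend is hard to see, a moving average removes these influences and exposes the direction of development (the trend line), which can then be used to analyze and forecast the long-term trend of the series.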
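
In TensorFlow, tf.train.ExponentialMovingAverage keeps a "shadow" copy of each variable and moves it toward the current value with the rule shadow = decay * shadow + (1 - decay) * variable. The plain-Python sketch below only illustrates that update rule; the function name, the decay of 0.9 and the sample values are made up for the illustration.

```python
def ema_update(shadow, value, decay):
    """One exponential-moving-average step."""
    return decay * shadow + (1 - decay) * value


values = [10.0, 12.0, 8.0, 11.0]  # hypothetical noisy observations
shadow = values[0]                # start the average at the first value
history = [shadow]
for value in values[1:]:
    shadow = ema_update(shadow, value, decay=0.9)
    history.append(shadow)

print(history)
```

With a decay close to 1 the shadow value moves slowly, so the swings in the raw values are smoothed out.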

https://www.cnblogs.com/cloud-ken/p/7521609.html

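9. tf.get_collection("")

Retrieves all variables in the collection named by the argument and returns them as a list.

10. tf.add_n([])

Sums the tensors in the list element-wise.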
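
What "element-wise" means here can be shown with NumPy arrays standing in for tensors. This is a NumPy analogy with made-up values, not TensorFlow code.

```python
import numpy as np

tensors = [np.array([1, 2, 3]),
           np.array([10, 20, 30]),
           np.array([100, 200, 300])]

# Add the arrays position by position, as tf.add_n does for a list of tensors
total = np.sum(tensors, axis=0)

print(total)
```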

11. tf.cast(x,dtype)

Converts x to the type dtype.

12. tf.argmax(x,axis)

Returns the index of the largest value along axis; for example, tf.argmax([1,0,0],0) returns 0.
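
tf.cast and tf.argmax are often combined to compute classification accuracy. The NumPy sketch below mirrors that idiom with np.argmax and astype; it is an analogy rather than TensorFlow code, and the scores and labels are made-up values.

```python
import numpy as np

# Hypothetical predicted scores and one-hot labels for three samples
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 1, 0]])

predicted = np.argmax(scores, axis=1)    # index of the largest score per row
expected = np.argmax(labels, axis=1)     # index of the 1 in each one-hot row
correct = predicted == expected          # boolean array
accuracy = correct.astype(np.float32).mean()  # cast bool -> float, then average

print(predicted, expected, accuracy)
```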

13. with tf.Graph().as_default() as g:

Nodes defined inside the block belong to the computation graph g.