1. Downloading the Source Code

The code originally comes from GitHub: https://github.com/vijayvee/Recursive-neural-networks-TensorFlow. The repository describes itself as follows: "This repository contains the implementation of a single hidden layer Recursive Neural Network. Implemented in python using TensorFlow. Used the trained models for the task of Positive/Negative sentiment analysis. This code is the solution for the third programming assignment from 'CS224d: Deep learning for Natural Language Processing', Stanford University."

Since it targets Python 2, I modified it to run under Python 3 and added visualization of the parse trees. My modified, runnable code (together with the movie-review data it processes) can be downloaded here:

https://pan.baidu.com/s/1bJTulQPs_h25sdLlCcTqDA

Runtime environment: Windows 10, a TensorFlow 1.8 environment created in Anaconda, Python 3.6.

2. Problem Description

With log_device_placement=True enabled in the program, the device placement log shows that the computation ops are placed on the GPU; only some of the save/restore ops fall back to the CPU.
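For reference, here is a minimal sketch of how this kind of placement logging is turned on in TensorFlow 1.x (standard tf.ConfigProto usage, not code copied from the repository):

    import tensorflow as tf

    # Ask TensorFlow to log which device (CPU/GPU) each op is placed on.
    config = tf.ConfigProto(log_device_placement=True)
    with tf.Session(config=config) as sess:
        a = tf.constant([[1.0, 2.0]])
        b = tf.constant([[3.0], [4.0]])
        print(sess.run(tf.matmul(a, b)))  # the MatMul placement is printed to the console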

However, when the program actually runs, the GPU load stays at 0.

My machine's GPU setup is complete: when I run other neural network programs, I can see the GPU load change.

3. Attempted Solution

I submitted the problem to Angtk (https://www.angtk.com/), a paid Q&A platform:

https://www.angtk.com/question/354

No solution was received.

4. Observations

First, the structure of the RvNN changes with the length of each sentence (training sample) in the corpus.

Second, because of that property, the code has to define a reset_after parameter: after every reset_after sentences, the model is saved, a brand-new Graph is defined, the weight matrices from the saved checkpoint are restored into the new Graph, and training continues.

In my modified code I added an operation that saves the computation graph so it can be inspected in TensorBoard. It turns out that every reset_after sentences produce reset_after loss layers (one loss layer per sentence), so the graph keeps growing.

That is why a new Graph has to be defined before training can continue.
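A minimal sketch of that save / rebuild / restore cycle (the helper names and checkpoint path below are hypothetical placeholders, not identifiers from the original code):

    import tensorflow as tf

    RESET_AFTER = 50
    CKPT = './weights/model.ckpt'   # hypothetical checkpoint path

    for start in range(0, num_sentences, RESET_AFTER):
        # a fresh graph for every block of RESET_AFTER sentences
        with tf.Graph().as_default(), tf.Session() as sess:
            model = build_model_for(sentences[start:start + RESET_AFTER])  # hypothetical builder
            saver = tf.train.Saver()
            if start == 0:
                sess.run(tf.global_variables_initializer())
            else:
                saver.restore(sess, CKPT)   # carry the weights over from the previous graph
            train_block(model, sess)        # hypothetical per-block training loop
            saver.save(sess, CKPT)          # hand the weights to the next graph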

(The core of a recursive neural network (RvNN) / recursive autoencoder is one forward layer and one reconstruction layer. These two layers are applied repeatedly to pairs of child nodes to produce the parent node, so the parameters of these two layers are what get trained over and over.)
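A minimal sketch of that composition step for a recursive autoencoder, assuming d-dimensional node vectors (the variable names below are illustrative, not the repository's):

    import tensorflow as tf

    d = 50                                          # node embedding size (illustrative)
    W_e = tf.get_variable("W_encode", [2 * d, d])   # forward (encoding) layer
    b_e = tf.get_variable("b_encode", [d])
    W_d = tf.get_variable("W_decode", [d, 2 * d])   # reconstruction (decoding) layer
    b_d = tf.get_variable("b_decode", [2 * d])

    def compose(c1, c2):
        """Combine two child vectors of shape [1, d] into a parent vector."""
        children = tf.concat([c1, c2], axis=1)                 # [1, 2d]
        parent = tf.nn.tanh(tf.matmul(children, W_e) + b_e)    # [1, d]
        recon = tf.nn.tanh(tf.matmul(parent, W_d) + b_d)       # [1, 2d]
        recon_loss = tf.reduce_sum(tf.square(recon - children))
        return parent, recon_loss

The same W_e/W_d pair is reused at every merge, which is exactly why the parameters keep being trained while the graph shape changes from sentence to sentence.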

5. Solution Approaches

5.1 Collecting other RvNN implementations

Most existing implementations are Richard Socher's Matlab programs and their Python ports, which compute both the loss and the gradients by hand. What I need is a TensorFlow implementation.

Reference: https://stats.stackexchange.com/questions/243221/recursive-neural-network-implementation-in-tensorflow lists several implementation approaches.

5.1.1 TensorFlow Fold

https://github.com/tensorflow/fold

TensorFlow Fold is a library for creating TensorFlow models that consume structured data, where the structure of the computation graph depends on the structure of the input data. For example, this model implements TreeLSTMs for sentiment analysis on parse trees of arbitrary shape/size/depth.

Fold implements dynamic batching. Batches of arbitrarily shaped computation graphs are transformed to produce a static computation graph. This graph has the same structure regardless of what input it receives, and can be executed efficiently by TensorFlow.

This animation shows a recursive neural network run with dynamic batching. Operations of the same type appearing at the same depth in the computation graph (indicated by color in the animation) are batched together regardless of whether or not they appear in the same parse tree. The Embed operation converts words to vector representations. The fully connected (FC) operation combines word vectors to form vector representations of phrases. The output of the network is a vector representation of an entire sentence. Although only a single parse tree of a sentence is shown, the same network can run, and batch together operations, over multiple parse trees of arbitrary shapes and sizes. The TensorFlow concat, while_loop, and gather ops are created once, prior to variable initialization, by Loom, the low-level API for TensorFlow Fold.

(Note the three ops mentioned here: concat, while_loop, and gather.)
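Those three ops are indeed enough to walk a tree of arbitrary shape inside one static graph. A minimal toy sketch of the idea (my own illustration, not Fold's API): leaves arrive through a placeholder, internal nodes are listed bottom-up as pairs of child indices, and tf.while_loop grows the node matrix with tf.gather and tf.concat:

    import tensorflow as tf

    dim = 4
    leaves = tf.placeholder(tf.float32, [None, dim])   # leaf vectors, rows 0..L-1
    children = tf.placeholder(tf.int32, [None, 2])     # internal nodes, bottom-up, as (left, right) row indices
    W = tf.get_variable("W_compose", [2 * dim, dim])

    def cond(i, nodes):
        return i < tf.shape(children)[0]

    def body(i, nodes):
        pair = tf.gather(nodes, children[i])                         # [2, dim]: the two child vectors
        parent = tf.nn.relu(tf.matmul(tf.reshape(pair, [1, 2 * dim]), W))
        return i + 1, tf.concat([nodes, parent], axis=0)             # append the parent row

    _, all_nodes = tf.while_loop(
        cond, body, [tf.constant(0), leaves],
        shape_invariants=[tf.TensorShape([]), tf.TensorShape([None, dim])])
    root_vec = all_nodes[-1]    # representation of the whole tree

This graph is built once; only the contents of leaves and children change per sentence, which is exactly the property my RvNN code is missing.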

5.1.2 Tensorflow implementation of Recursive Neural Networks using LSTM units

Download: https://github.com/sapruash/RecursiveNN

Tensorflow implementation of Recursive Neural Networks using LSTM units as described in "Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks" by Kai Sheng Tai, Richard Socher, and Christopher D. Manning.

(This is the paper co-authored by Stanford's Richard Socher, the originator of the RvNN. He elaborated this network structure in his PhD thesis, which made him one of the well-known figures in deep learning.)

5.1.3 Recursive (not Recurrent!) Neural Networks in TensorFlow

KDnuggets

Article: https://www.kdnuggets.com/2016/06/recursive-neural-networks-tensorflow.html

Code (may require a proxy to access): https://gist.github.com/anj1/504768e05fda49a6e3338e798ae1cddd

After a quick Python 2 to Python 3 port, I ran it and found that the GPU load went up; it is no longer 0.

So at this point I suspected that the code from the beginning of this post failed to use the GPU because its network definition stores tensors in a dict, and tensors kept in a dict would not be handled by GPU computation. I then started refactoring the code.


Two drawbacks of RvNNs (quoted from the KDnuggets article):

The advantage of TreeNets is that they can be very powerful in learning hierarchical, tree-like structure. The disadvantages are, firstly, that the tree structure of every input sample must be known at training time. We will represent the tree structure like this (lisp-like notation):

(S (NP that movie) (VP was) (ADJP cool))

In each sub-expression, the type of the sub-expression must be given – in this case, we are parsing a sentence, and the type of the sub-expression is simply the part-of-speech (POS) tag. You can see that expressions with three elements (one head and two tail elements) correspond to binary operations, whereas those with four elements (one head and three tail elements) correspond to trinary operations, etc.

The second disadvantage of TreeNets is that training is hard because the tree structure changes for each training sample and it’s not easy to map training to mini-batches and so on.

6. Debugging and Solving the Problem

6.1 Debugging "Recursive (not Recurrent!) Neural Networks in TensorFlow"

Original source code:

    import types
    import tensorflow as tf
    import numpy as np

    # Expressions are represented as lists of lists,
    # in lisp style -- the symbol name is the head (first element)
    # of the list, and the arguments follow.

    # add an expression to an expression list, recursively if necessary.
    def add_expr_to_list(exprlist, expr):
        # if expr is a atomic type
        if isinstance(expr, list):
            # Now for rest of expression
            for e in expr[1:]:
                # Add to list if necessary
                if not (e in exprlist):
                    add_expr_to_list(exprlist, e)
        # Add index in list.
        exprlist.append(expr)

    def expand_subexprs(exprlist):
        new_exprlist = []
        orig_indices = []
        for e in exprlist:
            add_expr_to_list(new_exprlist, e)
            orig_indices.append(len(new_exprlist)-1)
        return new_exprlist, orig_indices

    def compile_expr(exprlist, expr):
        # start new list starting with head
        new_expr = [expr[0]]
        for e in expr[1:]:
            new_expr.append(exprlist.index(e))
        return new_expr

    def compile_expr_list(exprlist):
        new_exprlist = []
        for e in exprlist:
            if isinstance(e, list):
                new_expr = compile_expr(exprlist, e)
            else:
                new_expr = e
            new_exprlist.append(new_expr)
        return new_exprlist

    def expand_and_compile(exprlist):
        l, orig_indices = expand_subexprs(exprlist)
        return compile_expr_list(l), orig_indices

    def new_weight(N1, N2):
        return tf.Variable(tf.random_normal([N1, N2]))

    def new_bias(N_hidden):
        return tf.Variable(tf.random_normal([N_hidden]))

    def build_weights(exprlist, N_hidden, inp_vec_len, out_vec_len):
        W = dict()  # dict of weights corresponding to each operation
        b = dict()  # dict of biases corresponding to each operation
        W['input'] = new_weight(inp_vec_len, N_hidden)
        W['output'] = new_weight(N_hidden, out_vec_len)
        for expr in exprlist:
            if isinstance(expr, list):
                idx = expr[0]
                if not (idx in W):
                    W[idx] = [new_weight(N_hidden, N_hidden) for i in expr[1:]]
                    b[idx] = new_bias(N_hidden)
        return (W, b)

    def build_rnn_graph(exprlist, W, b, inp_vec_len):
        # with W built up, create list of variables
        # intermediate variables
        in_vars = [e for e in exprlist if not isinstance(e, list)]
        N_input = len(in_vars)
        inp_tensor = tf.placeholder(tf.float32, (N_input, inp_vec_len), name='input1')
        V = []  # list of variables corresponding to each expr in exprlist
        for expr in exprlist:
            if isinstance(expr, list):
                # intermediate variables
                idx = expr[0]
                # add bias
                new_var = b[idx]
                # add input variables * weights
                for i in range(1, len(expr)):
                    new_var = tf.add(new_var, tf.matmul(V[expr[i]], W[idx][i-1]))
                new_var = tf.nn.relu(new_var)
            else:
                # base (input) variables
                # TODO : variable or placeholder?
                i = in_vars.index(expr)
                i_v = tf.slice(inp_tensor, [i, 0], [1, -1])
                new_var = tf.nn.relu(tf.matmul(i_v, W['input']))
            V.append(new_var)
        return (inp_tensor, V)

    # take a compiled expression list and build its RNN graph
    def complete_rnn_graph(W, V, orig_indices, out_vec_len):
        # we store our matrices in a dict;
        # the dict format is as follows:
        # 'op':[mat_arg1,mat_arg2,...]
        # e.g. unary operations:  '-':[mat_arg1]
        #      binary operations: '+':[mat_arg1,mat_arg2]
        # create a list of our base variables
        N_output = len(orig_indices)
        out_tensor = tf.placeholder(tf.float32, (N_output, out_vec_len), name='output1')

        # output variables
        ce = tf.reduce_sum(tf.zeros((1, 1)))
        for idx in orig_indices:
            o = tf.nn.softmax(tf.matmul(V[idx], W['output']))
            t = tf.slice(out_tensor, [idx, 0], [1, -1])
            ce = tf.add(ce, -tf.reduce_sum(t * tf.log(o)), name='loss')
        # TODO: output variables
        # return weights and variables and final loss
        return (out_tensor, ce)

    # from subexpr_lists import *
    a = [1, ['+',1,1], ['*',1,1], ['*',['+',1,1],['+',1,1]], ['+',['+',1,1],['+',1,1]], ['+',['+',1,1],1], ['+',1,['+',1,1]]]
    # generate training graph
    l, o = expand_and_compile(a)
    W, b = build_weights(l, 10, 1, 2)
    i_t, V = build_rnn_graph(l, W, b, 1)
    o_t, ce = complete_rnn_graph(W, V, o, 2)
    # generate testing graph
    a = [['+',['+',['+',1,1],['+',['+',1,1],['+',1,1]]],1]]
    l_tst, o_tst = expand_and_compile(a)
    i_t_tst, V_tst = build_rnn_graph(l_tst, W, b, 1)

    out_batch = np.transpose(np.array([[1,0,1,0,0,1,1],[0,1,0,1,1,0,0]]))
    print(ce)
    train_step = tf.train.GradientDescentOptimizer(0.001).minimize(ce)
    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)
    for i in range(5000):
        sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})
    print(l)
    print(l_tst)
    print(sess.run(tf.nn.softmax(tf.matmul(V[1], W['output'])), feed_dict={i_t: np.array([[1]])}))
    print(sess.run(tf.nn.softmax(tf.matmul(V[-1], W['output'])), feed_dict={i_t: np.array([[1]])}))
    print(sess.run(tf.nn.softmax(tf.matmul(V_tst[-2], W['output'])), feed_dict={i_t_tst: np.array([[1]])}))
    print(sess.run(tf.nn.softmax(tf.matmul(V_tst[-1], W['output'])), feed_dict={i_t_tst: np.array([[1]])}))

Running this code, the GPU load is not 0.

Imitating the RvNN pattern (that is, because the network structure changes with each sentence in the corpus, a new graph is created each time and the saved model is reloaded), I modified the code as follows:

    import types
    import tensorflow as tf
    import numpy as np
    import os

    # Expressions are represented as lists of lists,
    # in lisp style -- the symbol name is the head (first element)
    # of the list, and the arguments follow.

    # add an expression to an expression list, recursively if necessary.
    def add_expr_to_list(exprlist, expr):
        # if expr is a atomic type
        if isinstance(expr, list):
            # Now for rest of expression
            for e in expr[1:]:
                # Add to list if necessary
                if not (e in exprlist):
                    add_expr_to_list(exprlist, e)
        # Add index in list.
        exprlist.append(expr)

    def expand_subexprs(exprlist):
        new_exprlist = []
        orig_indices = []
        for e in exprlist:
            add_expr_to_list(new_exprlist, e)
            orig_indices.append(len(new_exprlist)-1)
        return new_exprlist, orig_indices

    def compile_expr(exprlist, expr):
        # start new list starting with head
        new_expr = [expr[0]]
        for e in expr[1:]:
            new_expr.append(exprlist.index(e))
        return new_expr

    def compile_expr_list(exprlist):
        new_exprlist = []
        for e in exprlist:
            if isinstance(e, list):
                new_expr = compile_expr(exprlist, e)
            else:
                new_expr = e
            new_exprlist.append(new_expr)
        return new_exprlist

    def expand_and_compile(exprlist):
        l, orig_indices = expand_subexprs(exprlist)
        return compile_expr_list(l), orig_indices

    def new_weight(N1, N2):
        return tf.Variable(tf.random_normal([N1, N2]))

    def new_bias(N_hidden):
        return tf.Variable(tf.random_normal([N_hidden]))

    def build_weights(exprlist, N_hidden, inp_vec_len, out_vec_len):
        W = dict()  # dict of weights corresponding to each operation
        b = dict()  # dict of biases corresponding to each operation
        W['input'] = new_weight(inp_vec_len, N_hidden)
        W['output'] = new_weight(N_hidden, out_vec_len)
        for expr in exprlist:
            if isinstance(expr, list):
                idx = expr[0]
                if not (idx in W):
                    W[idx] = [new_weight(N_hidden, N_hidden) for i in expr[1:]]
                    b[idx] = new_bias(N_hidden)
        return (W, b)

    def build_rnn_graph(exprlist, W, b, inp_vec_len):
        # with W built up, create list of variables
        # intermediate variables
        in_vars = [e for e in exprlist if not isinstance(e, list)]
        N_input = len(in_vars)
        inp_tensor = tf.placeholder(tf.float32, (N_input, inp_vec_len), name='input1')
        V = []  # list of variables corresponding to each expr in exprlist
        for expr in exprlist:
            if isinstance(expr, list):
                # intermediate variables
                idx = expr[0]
                # add bias
                new_var = b[idx]
                # add input variables * weights
                for i in range(1, len(expr)):
                    new_var = tf.add(new_var, tf.matmul(V[expr[i]], W[idx][i-1]))
                new_var = tf.nn.relu(new_var)
            else:
                # base (input) variables
                # TODO : variable or placeholder?
                i = in_vars.index(expr)
                i_v = tf.slice(inp_tensor, [i, 0], [1, -1])
                new_var = tf.nn.relu(tf.matmul(i_v, W['input']))
            V.append(new_var)
        return (inp_tensor, V)

    # take a compiled expression list and build its RNN graph
    def complete_rnn_graph(W, V, orig_indices, out_vec_len):
        # we store our matrices in a dict;
        # the dict format is as follows:
        # 'op':[mat_arg1,mat_arg2,...]
        # e.g. unary operations:  '-':[mat_arg1]
        #      binary operations: '+':[mat_arg1,mat_arg2]
        # create a list of our base variables
        N_output = len(orig_indices)
        out_tensor = tf.placeholder(tf.float32, (N_output, out_vec_len), name='output1')

        # output variables
        ce = tf.reduce_sum(tf.zeros((1, 1)))
        for idx in orig_indices:
            o = tf.nn.softmax(tf.matmul(V[idx], W['output']))
            t = tf.slice(out_tensor, [idx, 0], [1, -1])
            ce = tf.add(ce, -tf.reduce_sum(t * tf.log(o)), name='loss')
        # TODO: output variables
        # return weights and variables and final loss
        return (out_tensor, ce)

    # from subexpr_lists import *
    a = [1, ['+',1,1], ['*',1,1], ['*',['+',1,1],['+',1,1]], ['+',['+',1,1],['+',1,1]], ['+',['+',1,1],1], ['+',1,['+',1,1]]]
    # generate training graph
    l, o = expand_and_compile(a)

    new_model = True
    RESET_AFTER = 50
    a = [1, ['+',1,1], ['*',1,1], ['*',['+',1,1],['+',1,1]], ['+',['+',1,1],['+',1,1]], ['+',['+',1,1],1], ['+',1,['+',1,1]]]
    # generate training graph
    out_batch = np.transpose(np.array([[1,0,1,0,0,1,1],[0,1,0,1,1,0,0]]))
    l, o = expand_and_compile(a)
    for i in range(5000):
        with tf.Graph().as_default(), tf.Session() as sess:
            W, b = build_weights(l, 10, 1, 2)
            i_t, V = build_rnn_graph(l, W, b, 1)
            o_t, ce = complete_rnn_graph(W, V, o, 2)
            train_step = tf.train.GradientDescentOptimizer(0.001).minimize(ce)
            if new_model:
                init = tf.initialize_all_variables()
                sess.run(init)
                new_model = False  # added by xiaojie
            else:
                saver = tf.train.Saver()
                saver.restore(sess, './weights/xiaojie.temp')
            sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})
            # step = 0
            # for step in range(1000):
            #     if step > 900:
            #         break
            #     sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})
            #     step += 1
            saver = tf.train.Saver()
            if not os.path.exists("./weights"):
                os.makedirs("./weights")
            saver.save(sess, './weights/xiaojie.temp')
            # for i in range(5000):
            #     sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})
            # generate testing graph
            a = [['+',['+',['+',1,1],['+',['+',1,1],['+',1,1]]],1]]
            l_tst, o_tst = expand_and_compile(a)
            i_t_tst, V_tst = build_rnn_graph(l_tst, W, b, 1)

            out_batch = np.transpose(np.array([[1,0,1,0,0,1,1],[0,1,0,1,1,0,0]]))

            print(l_tst)
            print(sess.run(tf.nn.softmax(tf.matmul(V[1], W['output'])), feed_dict={i_t: np.array([[1]])}))
            print(sess.run(tf.nn.softmax(tf.matmul(V[-1], W['output'])), feed_dict={i_t: np.array([[1]])}))
            print(sess.run(tf.nn.softmax(tf.matmul(V_tst[-2], W['output'])), feed_dict={i_t_tst: np.array([[1]])}))
            print(sess.run(tf.nn.softmax(tf.matmul(V_tst[-1], W['output'])), feed_dict={i_t_tst: np.array([[1]])}))

With this version, the GPU load is 0!

Now modify the code: change

    sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})

to:

    step = 0
    for step in range(1000):
        if step > 900:
            break
        sess.run(train_step, feed_dict={i_t: np.array([[1]]), o_t: out_batch})
        step += 1

Run it again, and the GPU load is no longer 0!

This shows that the GPU's compute power only comes into play when the same network structure is run many times.

6.2 Debugging "Tensorflow implementation of Recursive Neural Networks using LSTM units"

Source code: https://github.com/sapruash/RecursiveNN

I made two kinds of changes when porting the code. First, Python 2 to Python 3: adding parentheses to print, replacing xrange with range, and wrapping range(...) in list() so that operations such as shuffle and iter work on it.

Second, my machine runs the relatively new TensorFlow 1.8, so I also fixed the argument order of calls such as tf.concat and tf.split.
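For reference, a rough sketch of both kinds of changes (the commented-out lines show the pre-1.0 argument order; the exact call sites in the repository differ):

    import tensorflow as tf

    t1 = tf.zeros([2, 3])
    t2 = tf.ones([2, 3])

    # old (TF <= 0.12):  tf.concat(axis, values)  /  tf.split(axis, num_splits, value)
    # joined = tf.concat(1, [t1, t2])
    # parts = tf.split(1, 3, t1)

    # new (TF 1.x): tensors first, axis as a named argument
    joined = tf.concat([t1, t2], axis=1)                    # shape (2, 6)
    parts = tf.split(t1, num_or_size_splits=3, axis=1)      # three (2, 1) tensors

    # Python 2 -> 3 changes made elsewhere in the port (illustrative):
    #   print x            ->  print(x)
    #   xrange(n)          ->  range(n)
    #   shuffle(range(n))  ->  idx = list(range(n)); shuffle(idx)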

My fixed version can be downloaded from:

https://pan.baidu.com/s/1lpQsIjFIj37r4IBNHIZNlA

Running it directly, the GPU load is not 0.

Its key characteristics are as follows.

First, it does not rebuild the network per sentence of the corpus; instead the network is built once, sized according to the longest sentence, as the train function below shows:

    def train(restore=False):

        config = Config()

        data, vocab = utils.load_sentiment_treebank(DIR, config.fine_grained)

        train_set, dev_set, test_set = data['train'], data['dev'], data['test']
        print('train', len(train_set))
        print('dev', len(dev_set))
        print('test', len(test_set))

        num_emb = len(vocab)
        num_labels = 5 if config.fine_grained else 3
        for _, dataset in data.items():
            labels = [label for _, label in dataset]
            assert set(labels) <= set(range(num_labels)), set(labels)
        print('num emb', num_emb)
        print('num labels', num_labels)

        config.num_emb = num_emb
        config.output_dim = num_labels

        config.maxseqlen = utils.get_max_len_data(data)
        config.maxnodesize = utils.get_max_node_size(data)

        print(config.maxnodesize, config.maxseqlen, " maxsize")
        # return
        random.seed()
        np.random.seed()

        with tf.Graph().as_default():

            # model = tf_seq_lstm.tf_seqLSTM(config)
            model = tf_tree_lstm.tf_NarytreeLSTM(config)

            init = tf.initialize_all_variables()
            saver = tf.train.Saver()
            best_valid_score = 0.0
            best_valid_epoch = 0
            dev_score = 0.0
            test_score = 0.0
            with tf.Session() as sess:

                sess.run(init)
                start_time = time.time()

                if restore: saver.restore(sess, './ckpt/tree_rnn_weights')
                for epoch in range(config.num_epochs):
                    print('epoch', epoch)
                    avg_loss = 0.0
                    avg_loss = train_epoch(model, train_set, sess)
                    print('avg loss', avg_loss)

                    dev_score = evaluate(model, dev_set, sess)
                    print('dev-scoer', dev_score)

                    if dev_score > best_valid_score:
                        best_valid_score = dev_score
                        best_valid_epoch = epoch
                        saver.save(sess, './ckpt/tree_rnn_weights')

                    if epoch - best_valid_epoch > config.early_stopping:
                        break

                    print("time per epochis {0}".format(
                        time.time() - start_time))
                test_score = evaluate(model, test_set, sess)
                print(test_score, 'test_score')

Here, train_epoch is simply:

    def train_epoch(model, data, sess):
        loss = model.train(data, sess)
        return loss

At run time this calls the train method of the tf_tree_lstm class:

    def train(self, data, sess):
        from random import shuffle
        # data_idxs=range(len(data))
        # xiaojie modify
        data_idxs = list(range(len(data)))
        shuffle(data_idxs)
        losses = []
        for i in range(0, len(data), self.batch_size):
            batch_size = min(i + self.batch_size, len(data)) - i
            if batch_size < self.batch_size: break

            batch_idxs = data_idxs[i:i + batch_size]
            batch_data = [data[ix] for ix in batch_idxs]  # [i:i+batch_size]

            input_b, treestr_b, labels_b = extract_batch_tree_data(batch_data, self.config.maxnodesize)

            feed = {self.input: input_b, self.treestr: treestr_b, self.labels: labels_b,
                    self.dropout: self.config.dropout, self.batch_len: len(input_b)}

            loss, _, _ = sess.run([self.loss, self.train_op1, self.train_op2], feed_dict=feed)
            # sess.run(self.train_op,feed_dict=feed)

            losses.append(loss)
            avg_loss = np.mean(losses)
            sstr = 'avg loss %.2f at example %d of %d\r' % (avg_loss, i, len(data))
            sys.stdout.write(sstr)
            sys.stdout.flush()

            # if i>1000: break
        return np.mean(losses)

As you can see, there is no per-sentence graph rebuilding at all; the data are simply passed in via feed_dict!

So studying this code is the way to solve the problem raised at the start of this post, namely that the GPU is never used for computation.
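The enabling trick is that every tree is padded to the same maxnodesize, so one fixed set of placeholders serves every sentence. A rough sketch of a padding helper in the spirit of extract_batch_tree_data (the exact field layout here is my guess, not the repository's format):

    import numpy as np

    def pad_batch(batch_trees, maxnodesize):
        """Pad a batch of (words, children, labels) trees to fixed-size arrays; -1 marks unused slots."""
        n = len(batch_trees)
        input_b = np.full((n, maxnodesize), -1, dtype=np.int32)        # word id per node
        treestr_b = np.full((n, maxnodesize, 2), -1, dtype=np.int32)   # (left, right) child indices
        labels_b = np.full((n, maxnodesize), -1, dtype=np.int32)       # label per node
        for k, (words, children, labels) in enumerate(batch_trees):
            input_b[k, :len(words)] = words
            treestr_b[k, :len(children)] = children
            labels_b[k, :len(labels)] = labels
        return input_b, treestr_b, labels_b

The graph only has to mask out the padded slots, so sentences of any length fit the same placeholders and the session keeps executing one fixed, GPU-placed graph.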

Conclusion

For a recursive (tree-structured) network, the graph-construction step itself can only run on the CPU.
So if the network keeps being rebuilt, the GPU is never exercised.
That is my explanation. The only way out is to restructure the network. I now also understand why the RNN cells that TensorFlow provides have to work on fixed-length (padded) inputs.
If you trace through TensorFlow's RNN cells, I believe the problem they solve is exactly the one I am facing with the RvNN.
That is, design one fixed recursive-tree network structure that handles variable-length inputs, rather than dynamically rebuilding the structure for every input sentence. So the Stanford homework solution posted on GitHub is problematic!
A revised RvNN unit will be published in a later update.
That RvNN unit will be modeled on the common RNN cells TensorFlow provides.
For the direction of the fix, see: https://github.com/sapruash/RecursiveNN
The corrected code is still to come.
