theano scan optimization
selected from Theano Doc
Optimizing Scan
performance
Minimizing Scan Usage
performan as much of the computation as possible outside of Scan
. This may have the effect increasing memory usage but also reduce the overhead introduce by Scan
.
Explicitly passing inputs of the inner function to scan
It's more efficient to explicitly pass parameter as non-sequence inputs.
Examples: Gibbs Sampling
Version One:
import theano
from theano import tensor as T
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)
def OneStep(vsample) :
hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
return trng.binomial(size=vsample.shape, n=1, p=vmean,
dtype=theano.config.floatX)
sample = theano.tensor.vector()
values, updates = theano.scan(OneStep, outputs_info=sample, n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Version Two:
W = theano.shared(W_values) # we assume that ``W_values`` contains the
# initial values of your weight matrix
bvis = theano.shared(bvis_values)
bhid = theano.shared(bhid_values)
trng = T.shared_randomstreams.RandomStreams(1234)
# OneStep, with explicit use of the shared variables (W, bvis, bhid)
def OneStep(vsample, W, bvis, bhid):
hmean = T.nnet.sigmoid(theano.dot(vsample, W) + bhid)
hsample = trng.binomial(size=hmean.shape, n=1, p=hmean)
vmean = T.nnet.sigmoid(theano.dot(hsample, W.T) + bvis)
return trng.binomial(size=vsample.shape, n=1, p=vmean,
dtype=theano.config.floatX)
sample = theano.tensor.vector()
# The new scan, with the shared variables passed as non_sequences
values, updates = theano.scan(fn=OneStep,
outputs_info=sample,
non_sequences=[W, bvis, bhid],
n_steps=10)
gibbs10 = theano.function([sample], values[-1], updates=updates)
Deactivating garbage collecting in Scan
Deactivating garbage collecting in Scan can allow it to reuse memory between executins instead of always having to allocate new memory. Scan
reuses memory between iterations of the same execution but frees the memory after the last iteration.
config.scan.allow_gc=False
Graph Optimizations
There are patterns that Theano can't optimize. the LSTM tutorial provides an example of optimization that theano can't perform. Instead of performing many matrix multiplications between matrix \(x_t\) and each of the shared msatrices \(W_i,W_c,W_f\) and \(W_o\), the matrixes \(W_{*}\) are merged into a single shared \(W\) and the graph performans a single larger matrix multiplication between \(W\) and \(x_t\). The resulting matrix is then sliced to obtain the results of that the small individial matrix multiplications by a single larger one and thus improves performance at the cost of a potentially higher memory usage.
theano scan optimization的更多相关文章
- theano中的scan用法
scan函数是theano中的循环函数,相当于for loop.在读别人的代码时第一次看到,有点迷糊,不知道输入.输出怎么定义,网上也很少有example,大多数都是相互转载同一篇.所以,还是要看官方 ...
- Theano学习-scan循环
\(1.Scan\) 通用的一般形式,可用于循环 减少和映射(对维数循环)是特殊的 \(scan\) 对输入序列进行 \(scan\) 操作,每一步都能得到一个输出 \(scan\) 能看到定义函数的 ...
- theano学习
import numpy import theano.tensor as T from theano import function x = T.dscalar('x') y = T.dscalar( ...
- LSTM 分类器笔记及Theano实现
相关讨论 http://tieba.baidu.com/p/3960350008 基于教程http://deeplearning.net/tutorial/lstm.html LSTM基本原理http ...
- 关于thenao.scan() fn函数参数的说明
theano.scan()原型: theano.scan( fn, sequences=None, outputs_info=None, non_sequences=None, n_steps=Non ...
- Theano学习-梯度计算
1. 计算梯度 创建一个函数 \(y\) ,并且计算关于其参数 \(x\) 的微分. 为了实现这一功能,将使用函数 \(T.grad\) . 例如:计算 \(x^2\) 关于参数 \(x\) 的梯度. ...
- IMPLEMENTING A GRU/LSTM RNN WITH PYTHON AND THEANO - 学习笔记
catalogue . 引言 . LSTM NETWORKS . LSTM 的变体 . GRUs (Gated Recurrent Units) . IMPLEMENTATION GRUs 0. 引言 ...
- theano安装问题
WARNING (theano.configdefaults): g++ not available, if using conda: `conda install m2w64-toolchain` ...
- theano使用
一 theano内置数据类型 只有thenao.shared()类型才有get_value()成员函数(返回numpy.ndarray)? 1. 惯常处理 x = T.matrix('x') # t ...
随机推荐
- 八皇后算法的另一种实现(c#版本)
八皇后: 八皇后问题,是一个古老而著名的问题,是回溯算法的典型案例.该问题是国际西洋棋棋手马克斯·贝瑟尔于1848年提出:在8×8格的国际象棋上摆放八个皇后,使其不能互相攻击,即任意两个皇后都不能处于 ...
- app字体被放大效果发虚
IOS App所有字体被放大,显示效果发虚 小小程序猿 我的博客:http://daycoding.com 分析原因: 由于新版本上线更换了LaunchImage,没有注意美工给的图片尺寸,由于图片尺 ...
- 长按TextField或TextView显示中文的粘贴复制
首先要确保手机当前系统为中文,只需要在 plist 文件中添加 Localized resources can be mixed = YES 就行了
- Android中使用Notification实现宽视图通知栏(Notification示例二)
Notification是在你的应用常规界面之外展示的消息.当app让系统发送一个消息的时候,消息首先以图表的形式显示在通知栏.要查看消息的详情需要进入通知抽屉(notificationdrawer) ...
- UDAD 用户故事驱动的敏捷开发 – 演讲实录
敏捷发展到今天已经在软件行业得到了广泛认可,但大多数敏捷方法都是为了解决某一特定问题而总结出来的特定方法或实践,一直缺乏一个可以将整个开发过程串接起来的成体系的方法.用户故事驱动的敏捷开发(User ...
- css之浮动
标准文档流 将窗体自上而下分成一行行, 并在每行中按从左至右的顺序排放元素,即为文档流.每个非浮动块级元素都独占一行, 浮动元素则按规定浮在行的一端. 若当前行容不下, 则另起新行再浮动. 标准流的微 ...
- Java 异常处理
异常是程序中的一些错误,但并不是所有的错误都是异常,并且错误有时候是可以避免的. 比如说,你的代码少了一个分号,那么运行出来结果是提示是错误java.lang.Error:如果你用System.out ...
- ASP.NET Cookie(二)--控制Cookie的范围
默认情况下,一个站点的全部Cookie都一起存储在客户端上,而且所有Cookie都会随着对该站点发送的任何请求一起发送到服务器.也就是说,一个站点中的每个页面都能获得该站点的所有Cookie.但是,可 ...
- Appium python API 总结
Appium python api 根据testerhome的文章,再补充一些文章里面没有提及的API [TOC] [1]find element driver 的方法 注意:这几个方法只能通过sel ...
- Java的JDBC操作
Java的JDBC操作 [TOC] 1.JDBC入门 1.1.什么是JDBC JDBC从物理结构上来说就是java语言访问数据库的一套接口集合,本质上是java语言根数据库之间的协议.JDBC提供一组 ...