Here is where we'll make our mini-batches for training. Remember that we want our batches to be multiple sequences of some desired number of sequence steps. Considering a simple example, our batches would look like this:

We have our text encoded as integers as one long array in encoded. Let's create a function that will give us an iterator for our batches. I like using generator functions to do this. Then we can pass encoded into this function and get our batch generator.

The first thing we need to do is discard some of the text so we only have completely full batches. Each batch contains N×MN×M characters, where NN is the batch size (the number of sequences) and MM is the number of steps. Then, to get the number of batches we can make from some array arr, you divide the length of arr by the batch size. Once you know the number of batches and the batch size, you can get the total number of characters to keep.

After that, we need to split arr into NN sequences. You can do this using arr.reshape(size) where size is a tuple containing the dimensions sizes of the reshaped array. We know we want NN sequences (n_seqs below), let's make that the size of the first dimension. For the second dimension, you can use -1 as a placeholder in the size, it'll fill up the array with the appropriate data for you. After this, you should have an array that is N×(M∗K)N×(M∗K) where KK is the number of batches.

Now that we have this array, we can iterate through it to get our batches. The idea is each batch is a N×MN×M window on the array. For each subsequent batch, the window moves over by n_steps. We also want to create both the input and target arrays. Remember that the targets are the inputs shifted over one character. You'll usually see the first input character used as the last target character, so something like this:

y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]

where x is the input batch and y is the target batch.

The way I like to do this window is use range to take steps of size n_steps from 00 to arr.shape[1], the total number of steps in each sequence. That way, the integers you get from range always point to the start of a batch, and each window is n_steps wide.

 def get_batches(arr, n_seqs, n_steps):
'''Create a generator that returns batches of size
n_seqs x n_steps from arr. Arguments
---------
arr: Array you want to make batches from
n_seqs: Batch size, the number of sequences per batch
n_steps: Number of sequence steps per batch
'''
# Get the number of characters per batch and number of batches we can make
characters_per_batch = n_seqs * n_steps
n_batches = len(arr) // characters_per_batch # Keep only enough characters to make full batches
arr = arr[:n_batches*characters_per_batch] # Reshape into n_seqs rows
arr = arr.reshape((n_seqs, -1)) for n in range(0, arr.shape[1], n_steps):
# The features
x = arr[:, n:n+n_steps]
# The targets, shifted by one
y = np.zeros_like(x)
y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]
yield x, y batches = get_batches(encoded, 10, 50)
x, y = next(batches) print('x\n', x[:10, :10])
print('\ny\n', y[:10, :10])

Making training mini-batches的更多相关文章

  1. 第四章——训练模型(Training Models)

    前几章在不知道原理的情况下,已经学会使用了多个机器学习模型机器算法.Scikit-Learn很方便,以至于隐藏了太多的实现细节. 知其然知其所以然是必要的,这有利于快速选择合适的模型.正确的训练算法. ...

  2. deeplearning.ai 改善深层神经网络 week2 优化算法 听课笔记

    这一周的主题是优化算法. 1.  Mini-batch: 上一门课讨论的向量化的目的是去掉for循环加速优化计算,X = [x(1) x(2) x(3) ... x(m)],X的每一个列向量x(i)是 ...

  3. Must Know Tips/Tricks in Deep Neural Networks

    Must Know Tips/Tricks in Deep Neural Networks (by Xiu-Shen Wei)   Deep Neural Networks, especially C ...

  4. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week2, Assignment(Optimization Methods)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. 请不要ctrl+c/ctrl+v作业. Optimization Methods Until now, you've always u ...

  5. MLlib之LR算法源码学习

    /** * :: DeveloperApi :: * GeneralizedLinearModel (GLM) represents a model trained using * Generaliz ...

  6. Must Know Tips/Tricks in Deep Neural Networks (by Xiu-Shen Wei)

    http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html Deep Neural Networks, especially Conv ...

  7. GeneralizedLinearAlgorithm in Spark MLLib

    GeneralizedLinearAlgorithm SparkMllib涉及到的算法 Classification Linear Support Vector Machines (SVMs) Log ...

  8. 课程二(Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization),第二周(Optimization algorithms) —— 2.Programming assignments:Optimization

    Optimization Welcome to the optimization's programming assignment of the hyper-parameters tuning spe ...

  9. 几种梯度下降方法对比(Batch gradient descent、Mini-batch gradient descent 和 stochastic gradient descent)

    https://blog.csdn.net/u012328159/article/details/80252012 我们在训练神经网络模型时,最常用的就是梯度下降,这篇博客主要介绍下几种梯度下降的变种 ...

  10. 吴恩达深度学习笔记(五) —— 优化算法:Mini-Batch GD、Momentum、RMSprop、Adam、学习率衰减

    主要内容: 一.Mini-Batch Gradient descent 二.Momentum 四.RMSprop 五.Adam 六.优化算法性能比较 七.学习率衰减 一.Mini-Batch Grad ...

随机推荐

  1. 基于FPGA的DDR3多端口读写存储管理系统设计

    基于FPGA的DDR3多端口读写存储管理系统设计 文章出处:电子技术设计 发布时间: 2015/03/12 | 1747 次阅读 每天新产品 时刻新体验专业薄膜开关打样工厂,12小时加急出货   机载 ...

  2. [k8s]kubectl windows配置(kubernetic) && kubectl config set-context使用Kubernetic

    参考: https://feisky.gitbooks.io/kubernetes/components/kubectl.html https://kubernetes.io/docs/tasks/t ...

  3. html页面中js判断浏览器是否是IE浏览器及IE浏览器版本

    HTML里: HTML代码中,在编写网页代码时,各种浏览器的兼容性是个必须考虑的问题,有些时候无法找到适合所有浏览器的写法,就只能写根据浏览器种类区别的代码,这时就要用到判断代码了.在HTML代码中, ...

  4. 221. Add Two Numbers II【medium】

    You have two numbers represented by a linked list, where each node contains a single digit. The digi ...

  5. cp/scp命令详解

    cp:拷贝命令 用法: cp [参数] source dest cp [参数] source ... directory 说明:将一个档案拷贝至另一个档案,或数个档案拷贝到另一目录 参数: -a 尽可 ...

  6. 消除^M

    在Linux和windows之间移动文件时,总是会出现在windows下编辑的文件在Linux打开时每一行都显示一个^M,虽然不影响使用,但是却影响美观,于是就想自己实现个小程序来进行转换. 要实现转 ...

  7. myeclipse能启动tomcat但是用startup.bat无法启动

    myeclipse能启动tomcat但是用startup.bat无法启动 这个问题困扰了我一天,把一天的周末时间白白花费了.各种百度,各种尝试都没办法解决.在江湖上闯,难道就只有百度一招吗? 不是,我 ...

  8. (译)Getting Started——1.2.4 Tutorial:Storyboard(故事板)

    该教程是基于你在前面的课程中构建的项目上进行的.学完本教程后,你将使用你前面学到的视图.视图控制器.动作.导航的内容,还会为应用创建一些关键的用户界面,并在场景中添加行为 以下就是本节课的内容: 1. ...

  9. What is Web Application Architecture? How It Works, Trends, Best Practices and More

    At Stackify, we understand the amount of effort that goes into creating great applications. That’s w ...

  10. poj2987 Firing 最大权闭合子图 边权有正有负

    /** 题目:poj2987 Firing 最大权闭合子图 边权有正有负 链接:http://poj.org/problem?id=2987 题意:由于金融危机,公司要裁员,如果裁了员工x,那么x的下 ...