Reference: https://www.cs.swarthmore.edu/~meeden/cs81/s10/BackPropDeriv.pdf

  I spent nearly an hour deriving the vector form of back propagation. In case I forget it and need it again later, I am writing all the formulas down here as a backup.

Structure:

  Standard BP Network with $\displaystyle \lambda$ hidden layers, one input layer and one output layer.

  Activation function: sigmoid.

Notations:

$\displaystyle W^{i+1,i}$, denotes the weight matrix connecting the $i$th layer to the $(i+1)$th layer.

$\displaystyle N^i$, denotes the net input of the $i$th layer.

$\displaystyle A^i$, denotes the activation (the sigmoid output) of the $i$th layer.

$\displaystyle \delta ^i$, denotes the error of the $i$th layer.

$\displaystyle \epsilon$, denotes the learning rate.

$*$, stands for element-wise multiplication.

(no symbol), stands for matrix multiplication, written as plain juxtaposition.

  Specifically,

$\displaystyle X$, denotes the input layer, which equals $\displaystyle A^0$.

$\displaystyle A^{\lambda + 1}$, denotes the output layer.

$\displaystyle Y$, denotes the expected output.

Propagations:

  Forward:

$\displaystyle N^i = W^{i,i-1}A^{i-1}$.

$\displaystyle A^i = \frac{1}{1+e^{-N^i}}$.
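  The two forward formulas can be sketched in NumPy as follows. This is only a minimal illustration: the layer sizes, the random weights, and the names `sigmoid`, `forward`, `weights`, and `layer_sizes` are assumptions made for the example, and, as in the formulas above, no bias term is used.

```python
import numpy as np

def sigmoid(n):
    """Element-wise sigmoid, A = 1 / (1 + exp(-N))."""
    return 1.0 / (1.0 + np.exp(-n))

def forward(weights, x):
    """Return [A^0, A^1, ..., A^(lambda+1)] for a column-vector input x.

    weights[i] plays the role of W^{i+1,i} (hypothetical list layout).
    """
    activations = [x]                       # A^0 = X
    for w in weights:
        n = w @ activations[-1]             # N^i = W^{i,i-1} A^{i-1}
        activations.append(sigmoid(n))      # A^i = sigmoid(N^i)
    return activations

# Hypothetical sizes: 3 inputs, one hidden layer of 4 units, 2 outputs.
rng = np.random.default_rng(0)
layer_sizes = [3, 4, 2]
weights = [rng.normal(size=(layer_sizes[i + 1], layer_sizes[i]))
           for i in range(len(layer_sizes) - 1)]
x = rng.normal(size=(3, 1))

activations = forward(weights, x)
print([a.shape for a in activations])       # [(3, 1), (4, 1), (2, 1)]
```

  Column vectors of shape `(n, 1)` are used so that the matrix products line up exactly as written in the formulas.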

  Backward:

$\displaystyle \Delta W^{i+1,i} = \epsilon \delta^{i+1}(A^{i})^{T}$.

$\displaystyle \delta ^i = ((\delta^{i+1})^{T}W^{i+1,i})^{T}*A^{i}*(1-A^{i})$.

$\displaystyle \delta ^{\lambda + 1} = (Y - A^{\lambda + 1})*A^{\lambda + 1}*(1-A^{\lambda + 1})$.
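  The three backward formulas can be checked numerically. The sketch below repeats a small forward pass so that it runs on its own, then compares $\displaystyle \Delta W$ with a finite-difference gradient. It assumes the squared-error loss $\displaystyle E = \frac{1}{2}\sum_k (Y_k - A^{\lambda+1}_k)^2$, which these notes do not state explicitly; the names `backward`, `loss`, and the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(n):
    return 1.0 / (1.0 + np.exp(-n))

def forward(weights, x):
    """Return [A^0, ..., A^(lambda+1)]; weights[i] stands for W^{i+1,i}."""
    activations = [x]
    for w in weights:
        activations.append(sigmoid(w @ activations[-1]))
    return activations

def backward(weights, activations, y, eps):
    """Return one Delta W per weight matrix, using the formulas above."""
    out = activations[-1]
    delta = (y - out) * out * (1.0 - out)            # delta^{lambda+1}
    updates = [None] * len(weights)
    for i in reversed(range(len(weights))):
        a = activations[i]                           # A^i
        updates[i] = eps * delta @ a.T               # Delta W^{i+1,i}
        if i > 0:                                    # no delta needed for A^0
            delta = (delta.T @ weights[i]).T * a * (1.0 - a)   # delta^i
    return updates

def loss(weights, x, y):
    """Assumed loss: half the sum of squared output errors."""
    return 0.5 * np.sum((y - forward(weights, x)[-1]) ** 2)

rng = np.random.default_rng(0)
sizes = [3, 4, 2]                                    # hypothetical layer sizes
weights = [rng.normal(size=(sizes[i + 1], sizes[i]))
           for i in range(len(sizes) - 1)]
x = rng.normal(size=(3, 1))
y = rng.uniform(size=(2, 1))

# With eps = 1, Delta W should equal minus the gradient of the loss.
updates = backward(weights, forward(weights, x), y, eps=1.0)

h = 1e-6
max_err = 0.0
for k, w in enumerate(weights):
    for idx in np.ndindex(w.shape):
        old = w[idx]
        w[idx] = old + h
        up = loss(weights, x, y)
        w[idx] = old - h
        down = loss(weights, x, y)
        w[idx] = old                                 # restore the weight
        numeric = (up - down) / (2 * h)              # central difference
        max_err = max(max_err, abs(updates[k][idx] + numeric))

print("largest mismatch:", max_err)
```

  The weights would then be updated as `weights[i] += updates[i]`; the positive sign follows from $\displaystyle \delta$ being defined with $\displaystyle (Y - A^{\lambda+1})$ rather than $\displaystyle (A^{\lambda+1} - Y)$.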

Derivation:

  I am not comfortable taking the partial derivative of a vector or matrix with respect to another vector or matrix, so I derived these formulas by working out the formula for each element of the matrix and then extending it to the vector form.

$\displaystyle \Delta W^{\lambda+1,\lambda}_{i,j} = \epsilon (Y_i - A^{\lambda+1}_i)A^{\lambda+1}_i(1-A^{\lambda +1}_i)A^{\lambda}_j$.

  Define $\displaystyle \delta ^{\lambda+1}_{i} := (Y_i - A^{\lambda+1}_i)A^{\lambda+1}_i(1-A^{\lambda +1}_i)$.
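
  The output-layer formula can be recovered with the chain rule, assuming the squared-error loss $\displaystyle E = \frac{1}{2}\sum_k (Y_k - A^{\lambda+1}_k)^2$. That loss is an assumption here, since these notes never state it, but it is consistent with the sign and the factors of the update. A sketch of the step:

$\displaystyle \frac{\partial E}{\partial W^{\lambda+1,\lambda}_{i,j}} = \frac{\partial E}{\partial A^{\lambda+1}_i}\frac{\partial A^{\lambda+1}_i}{\partial N^{\lambda+1}_i}\frac{\partial N^{\lambda+1}_i}{\partial W^{\lambda+1,\lambda}_{i,j}} = -(Y_i - A^{\lambda+1}_i)\,A^{\lambda+1}_i(1-A^{\lambda+1}_i)\,A^{\lambda}_j$.

  Gradient descent with $\displaystyle \Delta W^{\lambda+1,\lambda}_{i,j} = -\epsilon \frac{\partial E}{\partial W^{\lambda+1,\lambda}_{i,j}}$ then gives the expression for $\displaystyle \Delta W^{\lambda+1,\lambda}_{i,j}$ stated above. The middle factor uses the sigmoid derivative $\displaystyle \frac{\partial A}{\partial N} = A(1-A)$.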

$\displaystyle \Delta W^{\lambda,\lambda-1}_{i,j}=\epsilon (\delta^{\lambda+1})^{T}W^{\lambda+1,\lambda}_{col(i)}A_i^{\lambda}(1-A_i^{\lambda})A_j^{\lambda-1}$.

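  Define $\displaystyle \delta ^{\lambda}_{i} := (\delta^{\lambda+1})^{T}W^{\lambda+1,\lambda}_{col(i)}A_i^{\lambda}(1-A_i^{\lambda})$, where $\displaystyle W^{\lambda+1,\lambda}_{col(i)}$ is the $i$th column of $\displaystyle W^{\lambda+1,\lambda}$.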

  The remaining layers follow the same pattern and are left for the reader to complete.
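
  As a quick sanity check on the step from elements to vectors, the sketch below fills in $\displaystyle \Delta W^{\lambda,\lambda-1}$ one element at a time and compares the result with the vector form. The array contents are arbitrary random placeholders rather than values from a real forward pass (the identity is purely algebraic), and all variable names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 0.5                                  # learning rate (arbitrary)
a_prev = rng.uniform(size=(3, 1))          # stands in for A^{lambda-1}
a_hid = rng.uniform(size=(4, 1))           # stands in for A^{lambda}
w_out = rng.normal(size=(2, 4))            # stands in for W^{lambda+1,lambda}
delta_out = rng.normal(size=(2, 1))        # stands in for delta^{lambda+1}

# Element-wise form: one entry of Delta W^{lambda,lambda-1} at a time.
dw_elem = np.zeros((4, 3))
for i in range(4):
    for j in range(3):
        # (delta^{lambda+1})^T times column i of W^{lambda+1,lambda}
        back = float(delta_out[:, 0] @ w_out[:, i])
        dw_elem[i, j] = (eps * back * a_hid[i, 0] * (1.0 - a_hid[i, 0])
                         * a_prev[j, 0])

# Vector form: delta^{lambda} first, then the outer product with A^{lambda-1}.
delta_hid = (delta_out.T @ w_out).T * a_hid * (1.0 - a_hid)
dw_vec = eps * delta_hid @ a_prev.T

print(np.allclose(dw_elem, dw_vec))        # True
```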
