Machine Learning--week4 Basic Concepts of Neural Networks

The methods learned so far cannot handle complex nonlinear problems.

Neural Networks

Sigmoid (logistic) activation function: "activation function" is another term for the nonlinearity $g(z) = \frac{1}{1+e^{-z}}$
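A minimal sketch of $g(z)$ in plain Python. The function name `sigmoid` and the two-branch form (used only to avoid overflow for very negative $z$) are choices made for this illustration, not part of the course material.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z) so exp() cannot overflow.
    e = math.exp(z)
    return e / (1.0 + e)

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # very close to 1
print(sigmoid(-10.0))  # very close to 0
```

The output always lies between 0 and 1, which is why it can be read as a probability in logistic regression.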

activation: the value that is computed and output by a specific neuron

weights = parameters = $\theta$

input units: $x_1,x_2, x_3,\dots, x_n$

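bias unit / bias neuron: $x_0$ and $a_0^{(j)}$

The layers between the input units and the hypothesis are made up of activation units.

input wire / output wire: an input wire is an arrow pointing into the target neuron; an output wire is an arrow pointing out of the target neuron.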

$a_i^{(j)}$: "activation" of neuron $i$ or of unit $i$ in layer $j$

$\Theta^{(j)}$: matrix of weights controlling the function mapping from layer $j$ to layer $j+1$

(Note that $\Theta$ is uppercase, because the weights are now arranged as a matrix.)

layer 1 == input layer

layer n == output layer (the last layer)

layer 2 ~ layer n-1 == hidden layers

for example:
\[
\begin{aligned}
\text{layer 2 (a hidden layer)}&\begin{cases}a_1^{(2)} = g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)\\
a_2^{(2)} = g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)\\
a_3^{(2)} = g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)\end{cases}\\
\text{layer 3 (output layer)}&\begin{cases}h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} +\Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})\end{cases}
\end{aligned}
\]

More intuitively, writing $\Theta_i^{(j)}$ for row $i$ of $\Theta^{(j)}$ and letting $a^{(1)}$ and $a^{(2)}$ include their bias units:
\[
\begin{aligned}
\text{layer 2 (a hidden layer)}
&\begin{cases}
a_1^{(2)} = g(\Theta_{1}^{(1)}a^{(1)})\\
a_2^{(2)} = g(\Theta_{2}^{(1)}a^{(1)})\\
a_3^{(2)} = g(\Theta_{3}^{(1)}a^{(1)})
\end{cases}\\
\text{layer 3 (output layer)}
&\begin{cases}
h_\Theta(x) = a_1^{(3)} = g(\Theta_{1}^{(2)}a^{(2)})
\end{cases}
\end{aligned}
\]

Generally, if a network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j+1)$. The $+1$ in $s_j+1$ comes from the bias node of layer $j$ ($x_0$ or $a_0^{(j)}$), whose weights form the column $\Theta_{k,0}^{(j)}$ of $\Theta^{(j)}$. In other words, the output nodes do not include a bias node while the inputs do.

Define $a^{(1)} = x$
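The dimension rule can be checked with a short sketch. The helper name `theta_shapes` is hypothetical, and `layer_sizes` is assumed to list the unit counts $s_1, s_2, \dots$ without bias units.

```python
def theta_shapes(layer_sizes):
    """Return the (rows, cols) shape of each Theta^(j).

    layer_sizes holds s_1, s_2, ..., s_n (bias units not counted).
    Theta^(j) maps layer j to layer j+1 and has shape s_{j+1} x (s_j + 1).
    """
    return [(s_next, s_cur + 1)
            for s_cur, s_next in zip(layer_sizes, layer_sizes[1:])]

# The example network: 3 inputs, 3 hidden units, 1 output.
print(theta_shapes([3, 3, 1]))  # [(3, 4), (1, 4)]
```

For the example network this gives a $3 \times 4$ matrix $\Theta^{(1)}$ and a $1 \times 4$ matrix $\Theta^{(2)}$, matching the four weights per unit in the equations.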

$z^{(j+1)} = \Theta^{(j)}a^{(j)}$

$z_k^{(j+1)} = \Theta_{k,0}^{(j)}a_0^{(j)} + \Theta_{k,1}^{(j)}a_1^{(j)} + \dots + \Theta_{k,n^{(j)}}^{(j)}a_{n^{(j)}}^{(j)}\quad ,(n^{(j)} = s_j \text{ is the number of activation units in layer } j \text{, not counting the bias unit})$

$a^{(j)} = g(z^{(j)}) = g(\Theta^{(j-1)}a^{(j-1)})\quad(j\ge2)$

Suppose the network has $n$ layers. Then the last matrix $\Theta^{(n-1)}$ has only one row, which is multiplied by the single column $a^{(n-1)}$, so that the result is a single number:

$h_\Theta(x) = a^{(n)}=g(z^{(n)})$

Before computing $z^{(j+1)}$, add the bias unit $a_0^{(j)}=1$ to $a^{(j)}$.

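Forward propagation: computing the activations layer by layer, from the input layer forward to the output layer.

A neural network in effect uses the activations of the last hidden layer, $a^{(n-1)}$, rather than the input layer, as the features for training logistic regression. Choosing different parameters in $\Theta^{(1)}$ can produce complex features and therefore a better hypothesis, which works better than using $x_1,x_2,\dots ,x_n$ directly as the training features.

architecture: the way the neurons of a network are connected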
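The recurrence $a^{(j+1)} = g(\Theta^{(j)}a^{(j)})$ with a bias unit prepended at every layer can be sketched as a loop. This is an illustrative pure-Python version under stated assumptions: the name `forward` is hypothetical, each weight matrix is a list of rows, and the all-zero weights are placeholders rather than trained values.

```python
import math

def sigmoid(z):
    """Logistic function; adequate for moderate |z|."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Forward propagation.

    thetas: list of weight matrices, each a list of rows (one row per unit
            in the next layer, first entry of a row is the bias weight).
    x:      input features x_1..x_n, without the bias unit.
    Returns the activations of the output layer.
    """
    a = list(x)                      # a^(1) = x
    for theta in thetas:
        a = [1.0] + a                # add the bias unit a_0 = 1
        z = [sum(w * v for w, v in zip(row, a)) for row in theta]
        a = [sigmoid(v) for v in z]  # a^(j+1) = g(z^(j+1))
    return a

# 3 inputs -> 3 hidden units -> 1 output, with placeholder all-zero weights.
theta1 = [[0.0] * 4 for _ in range(3)]
theta2 = [[0.0] * 4]
print(forward([theta1, theta2], [1.0, 2.0, 3.0]))  # [0.5]
```

With all-zero weights every $z$ is 0, so every activation is $g(0) = 0.5$; real weights would come from training.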

$\Theta$ for each logical expression:

${\rm AND} = (x_1 \land x_2)$:

- $\Theta = \begin{bmatrix}-30 &20& 20 \end{bmatrix}$

${\rm NOR} = (\lnot x_1 \land \lnot x_2)$:

- $\Theta = \begin{bmatrix}10 & -20& -20 \end{bmatrix}$

${\rm OR} = (x_1 \lor x_2)$:

- $\Theta = \begin{bmatrix}-10 &20& 20 \end{bmatrix}$

${\rm NOT} = (\lnot x)$:

- $\Theta = \begin{bmatrix}10 & -20\end{bmatrix}$

${\rm XNOR} = (\lnot x_1 \land \lnot x_2) \lor ( x_1 \land x_2)$:

- needs one hidden layer: $a_1^{(2)} = (\lnot x_1 \land \lnot x_2),\quad a_2^{(2)} = (x_1 \land x_2)$
- output layer: $a^{(3)} = (a_1^{(2)} \lor a_2^{(2)})$
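The single-unit gates can be checked by evaluating $g(\Theta x)$ on every 0/1 input. In this sketch the helper name `unit` is hypothetical, and NOT uses the weights $\begin{bmatrix}10 & -20\end{bmatrix}$, which give an output near 1 for $x=0$ and near 0 for $x=1$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(theta, inputs):
    """One neuron: prepend the bias input 1, then apply g(theta . a)."""
    a = [1.0] + list(inputs)
    return sigmoid(sum(w * v for w, v in zip(theta, a)))

AND = [-30, 20, 20]
OR = [-10, 20, 20]
NOR = [10, -20, -20]
NOT = [10, -20]

print("x1 x2 AND OR NOR")
for x1 in (0, 1):
    for x2 in (0, 1):
        x = (x1, x2)
        # round() maps activations near 0 or 1 to the logical value.
        print(x1, x2, round(unit(AND, x)), round(unit(OR, x)), round(unit(NOR, x)))

print("NOT 0 =", round(unit(NOT, (0,))), " NOT 1 =", round(unit(NOT, (1,))))
```

The large weight magnitudes push $z$ to at least $\pm 10$, so the sigmoid output is within about $5 \times 10^{-5}$ of 0 or 1.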

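Implementing the logical expressions:

Let $x=\begin{bmatrix}1 \\ x_1\\x_2 \end{bmatrix}$; then $a_i = g(\Theta_ix)$ gives the result of applying the logical operator encoded by $\Theta_i$ to $x_1,x_2$.

For example, if $\Theta_i = \begin{bmatrix}-10 &20& 20 \end{bmatrix}$, then $a_i \approx x_1 \lor x_2$.

A more complex logical expression such as ${\rm XNOR}$ can only be computed with the help of a hidden layer.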
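A sketch of the XNOR network, assuming the hidden layer computes NOR and AND and the output unit computes OR of the two hidden activations. The function names `unit` and `xnor` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(theta, inputs):
    """One neuron: prepend the bias input 1, then apply g(theta . a)."""
    a = [1.0] + list(inputs)
    return sigmoid(sum(w * v for w, v in zip(theta, a)))

def xnor(x1, x2):
    # Hidden layer (layer 2)
    a1 = unit([10, -20, -20], (x1, x2))   # (NOT x1) AND (NOT x2)
    a2 = unit([-30, 20, 20], (x1, x2))    # x1 AND x2
    # Output layer (layer 3): a1 OR a2
    return unit([-10, 20, 20], (a1, a2))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))
```

The output is near 1 exactly when the two inputs are equal, which no single unit of this kind can express because XNOR is not linearly separable.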

For multiclass classification:

Use $y = \begin{bmatrix}1\\0\\0\\0 \end{bmatrix}, \begin{bmatrix}0\\1\\0\\0 \end{bmatrix}, \begin{bmatrix}0\\0\\1\\0 \end{bmatrix}, \begin{bmatrix}0\\0\\0\\1 \end{bmatrix},\begin{bmatrix}0\\0\\0\\0 \end{bmatrix}$ to represent the different classes.
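A small sketch of this label encoding. The helper names `one_hot` and `predict`, the zero-based class indices, and the rule of picking the output unit with the largest activation are assumptions made for illustration.

```python
def one_hot(label, num_classes):
    """Encode a zero-based class index as a vector with a single 1."""
    return [1 if k == label else 0 for k in range(num_classes)]

def predict(h):
    """Return the index of the output unit with the largest activation."""
    return max(range(len(h)), key=lambda k: h[k])

print(one_hot(0, 4))  # [1, 0, 0, 0]
print(one_hot(2, 4))  # [0, 0, 1, 0]

# h stands in for a hypothetical output h_Theta(x) of a 4-unit output layer.
h = [0.1, 0.05, 0.8, 0.2]
print(predict(h))     # 2
```

Each output unit then acts as its own logistic classifier for one class, and the target vector tells every unit whether it should fire.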
