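References:

Machine Learning Open Course Notes (5): Neural Network

CS224d Notes 3: Neural Networks

Deep Learning and Natural Language Processing (4): Stanford cs224d Assignment 1 Quiz and Solutions

CS224d Problem set 1 assignment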

softmax:

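import numpy as np

def softmax(x):
    # x is a 2-D array; softmax is applied to each row independently
    assert len(x.shape) > 1
    # subtract the row-wise max for numerical stability; this does not change
    # the result, and using a new array avoids modifying the caller's input
    x = x - np.max(x, axis=1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=1, keepdims=True)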
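The max-subtraction step relies on softmax being invariant to adding a constant to every entry of a row. The following is a minimal sketch of that property, not part of the assignment code; the helper name `softmax_rows` and the sample scores are assumptions made for illustration.

```python
import numpy as np

def softmax_rows(x):
    """Row-wise softmax with the max-subtraction trick (hypothetical helper name)."""
    shifted = x - np.max(x, axis=1, keepdims=True)  # leaves the caller's array untouched
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

# The second row is the first row plus 1000; a naive np.exp(1003.0) would overflow.
scores = np.array([[1.0, 2.0, 3.0],
                   [1001.0, 1002.0, 1003.0]])
probs = softmax_rows(scores)

# Each row sums to 1, and the constant shift leaves the probabilities unchanged.
print(probs.sum(axis=1))
print(np.allclose(probs[0], probs[1]))
```

Without the shift, the second row would produce `inf / inf` and therefore `nan`, which is why the subtraction comes before the exponential.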

sigmoid & sigmoid_grad:

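def sigmoid(x):
    # element-wise logistic function
    result = 1.0 / (1.0 + np.exp(-x))
    return result

def sigmoid_grad(f):
    # f is the sigmoid output, f = sigmoid(x), not the original input x;
    # the derivative is then f * (1 - f)
    return f * (1.0 - f)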
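Note that `sigmoid_grad` takes the sigmoid output rather than the raw input. A small sketch, assuming the two definitions above and some arbitrarily chosen sample points, compares the analytic derivative with a central-difference estimate:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(f):
    # f is the sigmoid output s = sigmoid(x), not x itself
    return f * (1.0 - f)

x = np.array([-2.0, 0.0, 3.0])   # sample points chosen for illustration
h = 1e-5

# Central difference: (s(x + h) - s(x - h)) / (2h)
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic = sigmoid_grad(sigmoid(x))

print(np.allclose(numeric, analytic, atol=1e-8))
```

Passing `x` directly to `sigmoid_grad` instead of `sigmoid(x)` is an easy mistake to make, and this kind of comparison catches it.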

gradcheck_naive:

import random
import numpy as np

def gradcheck_naive(f, x):
    """
    Gradient check for a function f
    - f should be a function that takes a single argument and outputs the
      cost and its gradients
    - x is the point (numpy array) to check the gradient at
    """
    rndstate = random.getstate()
    random.setstate(rndstate)
    fx, grad = f(x)  # Evaluate function value at original point
    h = 1e-4

    # Iterate over all indexes in x
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index

        ### try modifying x[ix] with h defined above to compute numerical gradients
        ### make sure you call random.setstate(rndstate) before calling f(x) each
        ### time, this will make it
        ### possible to test cost functions with built in randomness later
        ### YOUR CODE HERE:
        old_val = x[ix]
        x[ix] = old_val - h
        random.setstate(rndstate)
        fxh1, _ = f(x)
        x[ix] = old_val + h
        random.setstate(rndstate)
        fxh2, _ = f(x)
        numgrad = (fxh2 - fxh1) / (2 * h)  # central difference
        x[ix] = old_val                    # restore the original value
        ### END YOUR CODE

        # Compare gradients
        reldiff = abs(numgrad - grad[ix]) / max(1, abs(numgrad), abs(grad[ix]))
        if reldiff > 1e-5:
            print("Gradient check failed.")
            print("First gradient error found at index %s" % str(ix))
            print("Your gradient: %f \t Numerical gradient: %f" % (grad[ix], numgrad))
            return

        it.iternext()  # Step to next dimension

    print("Gradient check passed!")
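The check perturbs one entry at a time and compares the central difference with the analytic gradient using a relative-difference threshold. The same idea can be sketched on a function whose gradient is known in closed form. The helper `numeric_gradient` and the test function `quad` below are hypothetical names introduced for this example and are not part of the assignment.

```python
import numpy as np

def numeric_gradient(f, x, h=1e-4):
    """Central-difference gradient of a scalar function f at x (hypothetical helper)."""
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old_val = x[ix]
        x[ix] = old_val + h
        f_plus = f(x)
        x[ix] = old_val - h
        f_minus = f(x)
        x[ix] = old_val                      # restore the entry before moving on
        grad[ix] = (f_plus - f_minus) / (2 * h)
        it.iternext()
    return grad

def quad(x):
    return np.sum(x ** 2)                    # analytic gradient is 2 * x

x0 = np.array([[1.0, -2.0], [0.5, 3.0]])
num = numeric_gradient(quad, x0)
ana = 2 * x0

# Same relative-difference criterion as in the gradient check above.
reldiff = np.abs(num - ana) / np.maximum(1, np.maximum(np.abs(num), np.abs(ana)))
print(reldiff.max() < 1e-5)
```

Dividing by `max(1, |numgrad|, |grad|)` keeps the criterion from blowing up when both gradients are close to zero.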

neural.py

import numpy as np
import random

from q1_softmax import softmax
from q2_sigmoid import sigmoid, sigmoid_grad
from q2_gradcheck import gradcheck_naive


def forward_backward_prop(data, labels, params, dimensions):
    """
    Forward and backward propagation for a two-layer sigmoidal network

    Compute the forward propagation and the cross entropy cost,
    and backward propagation for the gradients for all parameters.
    """

    ### Unpack network parameters (do not modify)
    ofs = 0
    Dx, H, Dy = (dimensions[0], dimensions[1], dimensions[2])

    W1 = np.reshape(params[ofs:ofs + Dx * H], (Dx, H))
    ofs += Dx * H
    b1 = np.reshape(params[ofs:ofs + H], (1, H))
    ofs += H
    W2 = np.reshape(params[ofs:ofs + H * Dy], (H, Dy))
    ofs += H * Dy
    b2 = np.reshape(params[ofs:ofs + Dy], (1, Dy))

    N, D = data.shape

    # Shapes (V = Dy, the number of output classes):
    # data   --> N x D
    # W1     --> D x H
    # b1     --> 1 x H
    # W2     --> H x V
    # b2     --> 1 x V
    # labels --> N x V

    ### YOUR CODE HERE: forward propagation
    Z1 = np.dot(data, W1) + b1   # N x H
    A1 = sigmoid(Z1)             # N x H
    Z2 = np.dot(A1, W2) + b2     # N x V
    A2 = softmax(Z2)             # N x V

    # cross entropy cost

    # first method
    #B = np.exp(Z2)                    # N x V
    #b = np.sum(B, axis=1) + 1e-8      # N x 1
    #z = np.log(b)                     # N x 1
    #cost = np.sum(z) - np.sum(Z2 * labels)
    #cost /= N

    # second method
    cost = -np.sum(np.log(A2[labels == 1])) / N
    ### END YOUR CODE
    #cost = b2[0,-1]

    ### YOUR CODE HERE: backward propagation
    # formulas are given in the trailing comments
    delta2 = A2 - labels                               # N x V   delta2 = A2 - y
    gradb2 = np.sum(delta2, axis=0)                    # 1 x V   gradb2 <-- delta2
    gradb2 /= N                                        # 1 x V
    gradW2 = np.dot(A1.T, delta2)                      # H x V   gradW2 = A1.T * delta2
    gradW2 /= N                                        # H x V
    delta1 = sigmoid_grad(A1) * np.dot(delta2, W2.T)   # N x H   delta1 = f'(A1) * delta2 * W2.T
    gradb1 = np.sum(delta1, axis=0)                    # 1 x H   gradb1 <-- delta1
    gradb1 /= N                                        # 1 x H
    gradW1 = np.dot(data.T, delta1)                    # D x H   gradW1 = X.T * delta1
    gradW1 /= N                                        # D x H
    ### END YOUR CODE

    ### Stack gradients (do not modify)
    grad = np.concatenate((gradW1.flatten(), gradb1.flatten(),
                           gradW2.flatten(), gradb2.flatten()))

    return cost, grad


def sanity_check():
    """
    Set up fake data and parameters for the neural network, and test using
    gradcheck.
    """
    print("Running sanity check...")

    N = 20
    dimensions = [10, 5, 10]
    data = np.random.randn(N, dimensions[0])   # each row will be a datum, 20 x 10
    labels = np.zeros((N, dimensions[2]))
    for i in range(N):
        labels[i, random.randint(0, dimensions[2] - 1)] = 1   # one-hot vector

    params = np.random.randn((dimensions[0] + 1) * dimensions[1] + (
        dimensions[1] + 1) * dimensions[2], )

    gradcheck_naive(lambda params: forward_backward_prop(data, labels, params,
                                                         dimensions), params)


if __name__ == "__main__":
    sanity_check()
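For reference, the forward and backward passes in `forward_backward_prop` can be written out as equations. This is a sketch that follows the variable names in the code; the symbols $X$ (for `data`), $Y$ (for `labels`), $\sigma$ (sigmoid), $\odot$ (element-wise product) and $\mathbf{1}$ (a column vector of $N$ ones) are notation introduced here.

```latex
\begin{aligned}
Z_1 &= X W_1 + b_1, & A_1 &= \sigma(Z_1), \\
Z_2 &= A_1 W_2 + b_2, & A_2 &= \operatorname{softmax}(Z_2), \\
J &= -\frac{1}{N} \sum_{i=1}^{N} \sum_{k} Y_{ik} \log (A_2)_{ik}, \\[4pt]
\delta_2 &= A_2 - Y, \\
\frac{\partial J}{\partial W_2} &= \frac{1}{N} A_1^{\top} \delta_2, &
\frac{\partial J}{\partial b_2} &= \frac{1}{N} \mathbf{1}^{\top} \delta_2, \\
\delta_1 &= \left( \delta_2 W_2^{\top} \right) \odot A_1 \odot (1 - A_1), \\
\frac{\partial J}{\partial W_1} &= \frac{1}{N} X^{\top} \delta_1, &
\frac{\partial J}{\partial b_1} &= \frac{1}{N} \mathbf{1}^{\top} \delta_1 .
\end{aligned}
```

The factor $A_1 \odot (1 - A_1)$ is what `sigmoid_grad(A1)` computes, and the $1/N$ factors correspond to the `/= N` lines in the code.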
