Vectorized Implementation of Logistic Regression and Lessons Learned
Author: 相忠良(Zhong-Liang Xiang)
Email: ugoood@163.com
Date: Sep. 23rd, 2017
Following the homework and guidance from Andrew Ng's deep learning course, and referring to his code, I completed this implementation of Logistic Regression.
(Important) Ng recommends organizing the data in the following form, which helps sweep away programming bugs:
- X.shape = (n_x, m), where n_x is the feature dimension and m is the number of samples
- Y.shape = (1, m)
- w and b should be kept separate, where:
- b is a scalar
- w.shape = (n_x, 1)
- A = sigmoid(np.dot(w.T, X) + b), A.shape = (1, m)
- dw.shape = (n_x, 1)
- db is a scalar
- dZ = A - Y, dZ.shape = A.shape = Y.shape = (1, m)
- Important recommendations (see the sketch after this list):
- Use reshape freely to get the dimensions we need; always work with row/column vectors and matrices of explicit dimensions.
- Never use a = np.random.randn(5), whose a.shape = (5,) — this kind of "rank 1 array" behaves counter-intuitively.
- Instead use a = np.random.randn(5, 1) or (1, 5), an explicitly shaped column or row vector (very important)!
- If a rank-1 array as described above does appear, the fix is a = a.reshape(5, 1) or (1, 5) to make the shape explicit.
- Use assertions like assert(a.shape == (5, 1)) often and liberally.
- Carefully check the dimensions of our matrices and vectors.
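A minimal sketch of these shape conventions (the toy dimensions below are my own, not from the course data):

import numpy as np

a = np.random.randn(5)        # rank-1 array: a.shape == (5,), avoid this
a = a.reshape(5, 1)           # make the shape explicit
assert a.shape == (5, 1)

n_x, m = 4, 3                 # assumed toy sizes for illustration
X = np.random.randn(n_x, m)   # one sample per column
Y = np.random.randint(0, 2, (1, m))
w = np.zeros((n_x, 1))
b = 0.0
A = 1.0 / (1 + np.exp(-(np.dot(w.T, X) + b)))
assert A.shape == (1, m)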
My own summary:
1. Finish the derivation first; be clear about the inputs and outputs, and about which variables are known and which are to be solved for.
2. Write pseudocode for the program.
3. Vectorize the pseudocode line by line. Here, starting from the input, carefully maintain the dimensions of every vector and matrix, and use reshape without hesitation whenever needed.
4. Step 3 above ensures the program uses as few for loops as possible.
5. Follow Andrew Ng's recommendations above, especially regarding the shapes of the vectors and matrices X, Y, A, w, b, dw, db, and dZ.
With these rules and my own summary, coding up a machine learning algorithm becomes quite simple.
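As an illustration of step 4, here is a minimal sketch (toy data assumed, my own example) of the same gradient computed with a for loop and in vectorized form:

import numpy as np

n_x, m = 4, 3                 # assumed toy sizes
X = np.random.randn(n_x, m)
dZ = np.random.randn(1, m)

# for-loop version: accumulate column by column
dw_loop = np.zeros((n_x, 1))
for i in range(m):
    dw_loop += X[:, i:i+1] * dZ[0, i]
dw_loop /= m

# vectorized version: one matrix product
dw_vec = np.dot(X, dZ.T) / m
assert np.allclose(dw_loop, dw_vec)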
I integrated Ng's homework code, made a few modifications, and produced the Logistic Regression code below:
#!/usr/bin/python
# -*- coding:utf-8 -*-
"""
Re-implement the Logistic Regression algorithm as a practice exercise.
Prerequisite for using this LR re-implementation:
since LR is a binary classifier, the labels of the training data
must be converted to 0/1 form, i.e. usable as probabilities.
"""
# Author: 相忠良(Zhong-Liang Xiang) <ugoood@163.com>
# Finished on September 23rd, 2017

import h5py
import numpy as np
def load_dataset():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels

    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels

    classes = np.array(test_dataset["list_classes"][:])  # the list of classes

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
def load_data():
    train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes = load_dataset()
    train_X = (train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T) / 255.  # flatten and divide by 255
    test_X = (test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T) / 255.  # flatten and divide by 255
    return train_X, train_set_y_orig, test_X, test_set_y_orig, classes
def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    s = 1.0 / (1 + np.exp(-z))
    return s
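# Side note (my addition, not part of the course code): for large negative z,
# np.exp(-z) can overflow and raise a RuntimeWarning. A numerically stable
# variant branches on the sign of z; a minimal sketch (array inputs assumed):
def stable_sigmoid(z):
    z = np.asarray(z, dtype=float)
    pos = z >= 0
    s = np.empty_like(z)
    s[pos] = 1.0 / (1 + np.exp(-z[pos]))       # safe: exp of a non-positive number
    ez = np.exp(z[~pos])                       # safe: z < 0 here
    s[~pos] = ez / (1 + ez)                    # algebraically equal to 1/(1+exp(-z))
    return s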
def init_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0
    assert (w.shape == (dim, 1))
    assert (isinstance(b, float) or isinstance(b, int))
    return w, b
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b
    """
    # FORWARD PROPAGATION (FROM X TO COST)
    m = X.shape[1]  # number of samples
    A = sigmoid(np.dot(w.T, X) + b)  # activation, shape (1, m)
    cost = (np.dot(np.log(A), Y.T) + np.dot(np.log(1 - A), (1 - Y).T)) / -m  # a scalar

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dZ = A - Y  # shape (1, m)
    dw = np.dot(X, dZ.T) / m  # shape (n_x, 1), where n_x is the feature dimension
    db = np.sum(dZ) / m  # a scalar

    # ASSERT
    assert (dw.shape == w.shape)
    assert (db.dtype == float)
    cost = np.squeeze(cost)  # turn it into a plain number
    assert (cost.shape == ())

    grads = {"dw": dw,
             "db": db}
    return grads, cost
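# A quick shape sanity check for propagate, in the spirit of the assert advice
# above (my addition; toy dimensions assumed):
# w0, b0 = init_with_zeros(4)
# g, c = propagate(w0, b0, np.random.randn(4, 3), np.array([[1, 0, 1]]))
# assert g["dw"].shape == (4, 1) and c.shape == ()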
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.
    """
    costs = []  # collect the costs computed during the iterations
    for i in range(num_iterations):
        # Cost and gradient calculation
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # Update w, b
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}
    return params, grads, costs
class MyLogisticRegression:
    costs = []
    params = {}  # w, b
    grads = {}  # dw, db
    num_iterations = 0
    learning_rate = 0.
    print_cost = False

    def __init__(self, num_iterations=1000, learning_rate=0.01, print_cost=False):
        # initialize the hyperparameters num_iterations and learning_rate
        self.num_iterations = num_iterations
        self.learning_rate = learning_rate
        self.print_cost = print_cost
    def fit(self, X, Y):
        n_x = X.shape[0]  # feature dimension of X
        w, b = init_with_zeros(n_x)  # initialize w, b with zeros; w.shape = (n_x, 1), b = 0, a scalar
        # Forward propagation yields the cost, backward propagation yields the grads, and the params
        # are updated; this is repeated num_iterations times with learning rate learning_rate.
        self.params, self.grads, self.costs = optimize(w, b, X, Y, self.num_iterations, self.learning_rate,
                                                       self.print_cost)
        # The point of fit is to obtain params. We also get grads and costs along the way, so we can
        # inspect them and plot costs to check whether the model has actually learned something.
    def predict(self, X):
        m = X.shape[1]  # the number of samples
        Y_predict = np.zeros((1, m))  # initialize Y_predict
        w = self.params["w"]  # retrieve the trained w
        b = self.params["b"]  # retrieve the trained b
        A = sigmoid(np.dot(w.T, X) + b)  # compute p(Y=1|X) from the trained w, b

        # Convert the predicted probabilities p(Y=1|X) into labels: 1 if greater than 0.5, else 0
        for i in range(A.shape[1]):
            Y_predict[0, i] = 1 if A[0, i] > 0.5 else 0
        assert (Y_predict.shape == (1, m))
        return Y_predict
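    # Note (my addition): in the spirit of the vectorization advice above, the
    # thresholding loop could be replaced by a single vectorized expression:
    #     Y_predict = (A > 0.5).astype(float)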
    def score(self, X, y):
        # Mean accuracy, consistent with the accuracy printed in the test case
        # below (this body is my addition; the original left the method as a stub).
        return np.mean(self.predict(X) == y)
## Test case
train_X, train_y, test_X, test_y, classes = load_data()

cls = MyLogisticRegression(num_iterations=2000, learning_rate=0.005, print_cost=True)
cls.fit(train_X, train_y)

Y_predict_test = cls.predict(test_X)
Y_predict_train = cls.predict(train_X)

print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_predict_train - train_y)) * 100))
print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_predict_test - test_y)) * 100))
"""
Run results:
/usr/bin/python2.7 /home/xiang/桌面/ML_Course_20170314/xiang_code/Xiang_ml_in_practice/MyLogisticRegression.py
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.584508
Cost after iteration 200: 0.466949
Cost after iteration 300: 0.376007
Cost after iteration 400: 0.331463
Cost after iteration 500: 0.303273
Cost after iteration 600: 0.279880
Cost after iteration 700: 0.260042
Cost after iteration 800: 0.242941
Cost after iteration 900: 0.228004
Cost after iteration 1000: 0.214820
Cost after iteration 1100: 0.203078
Cost after iteration 1200: 0.192544
Cost after iteration 1300: 0.183033
Cost after iteration 1400: 0.174399
Cost after iteration 1500: 0.166521
Cost after iteration 1600: 0.159305
Cost after iteration 1700: 0.152667
Cost after iteration 1800: 0.146542
Cost after iteration 1900: 0.140872
train accuracy: 99.043062201 %
test accuracy: 70.0 %
"""