Handwritten digit recognition with a logistic regression (softmax) neural network

If you prefer the Jupyter format, see the notebook on GitHub: 逻辑回归softmax神经网络实现手写数字识别.ipynb

1 - Import modules

import numpy as np

import matplotlib.pyplot as plt

from PIL import  Image

from ld_mnist import load_digits

%matplotlib inline

2 - Load and preprocess the data

mnist = load_digits()

Extracting C:/Users/marsggbo/Documents/Code/ML/TF Tutorial/data/MNIST_data\train-images-idx3-ubyte.gz

Extracting C:/Users/marsggbo/Documents/Code/ML/TF Tutorial/data/MNIST_data\train-labels-idx1-ubyte.gz

Extracting C:/Users/marsggbo/Documents/Code/ML/TF Tutorial/data/MNIST_data\t10k-images-idx3-ubyte.gz

Extracting C:/Users/marsggbo/Documents/Code/ML/TF Tutorial/data/MNIST_data\t10k-labels-idx1-ubyte.gz

print("Train: "+ str(mnist.train.images.shape))

print("Train: "+ str(mnist.train.labels.shape))

print("Test: "+ str(mnist.test.images.shape))

print("Test: "+ str(mnist.test.labels.shape))

Train: (55000, 784)

Train: (55000, 10)

Test: (10000, 784)

Test: (10000, 10)

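The MNIST data is read with a TensorFlow helper function. The output above shows that the training set X_train holds 55000 samples, each a vector of length 784 (28*28).

Because the dataset is large, TensorFlow also provides a way to fetch the data in batches, which makes each run much faster: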

x_batch, y_batch = mnist.train.next_batch(100)

print(x_batch.shape)

print(y_batch.shape)

>>>

(100, 784)

(100, 10)
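For readers without the TensorFlow helper, the idea behind batch fetching can be sketched in plain NumPy. This is only an illustration, not how `next_batch` is implemented: the function name `iterate_minibatches`, the `seed` parameter, and the toy arrays are all assumptions made for the example.

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size, seed=0):
    """Yield shuffled (x_batch, y_batch) pairs that cover the data once."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])          # shuffled sample indices
    for start in range(0, X.shape[0], batch_size):
        idx = order[start:start + batch_size]    # last batch may be smaller
        yield X[idx], Y[idx]

# Toy data with the same column layout as MNIST (784 pixels, 10 classes).
X_demo = np.zeros((250, 784))
Y_demo = np.eye(10)[np.arange(250) % 10]         # one-hot labels
batches = list(iterate_minibatches(X_demo, Y_demo, batch_size=100))
print([xb.shape for xb, _ in batches])
```

Each batch keeps the rows of X and Y aligned because both are indexed with the same shuffled index array.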

x_train, y_train, x_test, y_test = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels

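Because the training set is large, it can be further divided into training, validation, and test subsets, here in a 6:3:2 ratio (30000, 15000, and 10000 samples):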

x_train_batch, y_train_batch = mnist.train.next_batch(30000)

x_cv_batch, y_cv_batch = mnist.train.next_batch(15000)

x_test_batch, y_test_batch = mnist.train.next_batch(10000)

print(x_train_batch.shape)

print(y_cv_batch.shape)

print(y_test_batch.shape)

(30000, 784)

(15000, 10)

(10000, 10)
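The split above relies on three successive `next_batch` calls. A more explicit alternative is to shuffle the indices once and cut them into consecutive parts, which makes it obvious that the subsets do not overlap. The sketch below is an assumption-laden illustration: `split_three_way`, its `seed` argument, and the small demo arrays are hypothetical and not part of the original notebook.

```python
import numpy as np

def split_three_way(X, Y, sizes, seed=0):
    """Shuffle once, then cut into consecutive, non-overlapping parts."""
    if sum(sizes) > X.shape[0]:
        raise ValueError("sizes add up to more samples than are available")
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    bounds = np.cumsum(sizes)                      # e.g. [300, 450, 550]
    parts = np.split(order, bounds)[:len(sizes)]   # drop the leftover tail
    return [(X[idx], Y[idx]) for idx in parts]

# Scaled-down stand-in for the 30000 / 15000 / 10000 split (same 6:3:2 ratio).
X_demo = np.arange(550, dtype=float).reshape(550, 1)
Y_demo = np.eye(10)[np.arange(550) % 10]
(train_x, train_y), (cv_x, cv_y), (test_x, test_y) = split_three_way(
    X_demo, Y_demo, sizes=(300, 150, 100))
print(train_x.shape, cv_y.shape, test_y.shape)
```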

Display some handwritten digits

nums = 6

for i in range(1,nums+1):

    plt.subplot(1,nums,i)

    plt.imshow(x_train[i].reshape(28,28), cmap="gray")

3 - The algorithm

3.1 Algorithm

For a single sample \(x^{(i)}\):

\[z^{(i)} = w^T x^{(i)} + b \tag{1}
\]

\[\hat{y}^{(i)} = a^{(i)} = softmax(z^{(i)})\tag{2}
\]

Loss function

Total loss over the training set
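The two losses can be written out so that they agree with the `propagation` code later in this post. This is a reconstruction from that code rather than a quotation: \(K = 10\) is the number of classes, \(m\) the number of samples, and \(c\) the L2 regularization coefficient passed to `propagation`.

\[L(a^{(i)}, y^{(i)}) = -\sum_{k=1}^{K} y_k^{(i)} \log a_k^{(i)} \tag{3}
\]

\[J = \frac{1}{m}\sum_{i=1}^{m} L(a^{(i)}, y^{(i)}) + \frac{c}{2}\sum_{j,k} w_{jk}^2 \tag{4}
\]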

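Note the term \(w^Tx^{(i)}\) in formula (1): its exact form depends on how the data is laid out. In this project, \(x \in R^{55000 \times 784}, w \in R^{784 \times 10}, y \in R^{55000 \times 10}\), so each sample is a row vector and \(z^{(i)} = x^{(i)}w + b\).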

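Key steps

Initialize the model parameters
Learn the parameters by minimizing the cost function
Use the learned parameters to make predictions
Analyze the results and summarize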

3.2 Initialize the model parameters

# Initialize the model parameters

def init_params(dim1, dim2):

    '''

    dim1, dim2: shape of the weight matrix w; dim1 should match x_train.shape[1] and dim2 should match y_train.shape[1]

    '''

    w = np.zeros((dim1,dim2))

    return w

w  = init_params(2,1)

print(w)

[[ 0.]

 [ 0.]]

3.3 Define the softmax function

Reference: Python - softmax implementation

def softmax(x):

    """

    Compute the softmax function for each row of the input x.

    Arguments:

    x -- A N dimensional vector or M x N dimensional numpy matrix.

    Return:

    x -- You are allowed to modify x in-place

    """

    orig_shape = x.shape

    if len(x.shape) > 1:

        # Matrix

        exp_minmax = lambda x: np.exp(x - np.max(x))

        denom = lambda x: 1.0 / np.sum(x)

        x = np.apply_along_axis(exp_minmax,1,x)

        denominator = np.apply_along_axis(denom,1,x) 

        if len(denominator.shape) == 1:

            denominator = denominator.reshape((denominator.shape[0],1))

        x = x * denominator

    else:

        # Vector

        x_max = np.max(x)

        x = x - x_max

        numerator = np.exp(x)

        denominator =  1.0 / np.sum(numerator)

        x = numerator.dot(denominator)

    assert x.shape == orig_shape

    return x

a = np.array([[1,2,3,4],[1,2,3,4]])

print(softmax(a))

np.sum(softmax(a))

[[ 0.0320586   0.08714432  0.23688282  0.64391426]

 [ 0.0320586   0.08714432  0.23688282  0.64391426]]

2.0
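The same row-wise softmax can also be written without `np.apply_along_axis`, using `keepdims=True` so that broadcasting handles both vectors and matrices. This is an alternative sketch, not the referenced implementation; the name `softmax_rows` is an assumption.

```python
import numpy as np

def softmax_rows(x):
    """Softmax over the last axis; works for 1-D vectors and 2-D matrices."""
    x = np.asarray(x, dtype=float)
    # Subtracting the row maximum leaves the result unchanged
    # but keeps np.exp from overflowing on large inputs.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

a = np.array([[1, 2, 3, 4], [1, 2, 3, 4]])
print(softmax_rows(a))
print(softmax_rows(np.array([1000.0, 1000.0])))  # large inputs stay finite
```

Each row of the result sums to 1, which is why `np.sum(softmax(a))` above gives 2.0 for a two-row input.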

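3.4 - Forward & backward propagation

Once the parameters are initialized, forward propagation (FP) and backward propagation (BP) can be implemented so that the parameters are learned from the data.

Forward Propagation:

Get the data X
Compute \(A = softmax(Xw + b) = (a^{(1)}, a^{(2)}, ..., a^{(m)})\); X stores one sample per row, and the code below leaves out the bias b
Compute the cost function: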

def propagation(w, c, X, Y):

    '''

    Forward pass: compute the cost J and its gradient dw

    '''

    m = X.shape[0]

    A = softmax(np.dot(X,w))

    J  = -1/m * np.sum(Y*np.log(A)) + 0.5*c*np.sum(w*w)

    dw = -1/m * np.dot(X.T, (Y-A)) + c*w

    update = {"dw":dw, "cost": J}

    return update
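A quick way to gain confidence in the `dw` expression is to compare it with a finite-difference estimate of the cost. The sketch below restates the cost and gradient from `propagation` in a self-contained form (`softmax_rows` and `cost_and_grad` are names chosen for this example, and the random data is synthetic) and checks every entry of the gradient numerically.

```python
import numpy as np

def softmax_rows(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cost_and_grad(w, c, X, Y):
    """Same cost and gradient formulas as propagation(), returned as a tuple."""
    m = X.shape[0]
    A = softmax_rows(X @ w)
    J = -np.sum(Y * np.log(A)) / m + 0.5 * c * np.sum(w * w)
    dw = -(X.T @ (Y - A)) / m + c * w
    return J, dw

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                      # 20 samples, 5 features
Y = np.eye(3)[rng.integers(0, 3, size=20)]        # one-hot labels, 3 classes
w = rng.normal(scale=0.1, size=(5, 3))
c = 0.01

_, dw = cost_and_grad(w, c, X, Y)

# Central differences: perturb one weight at a time.
eps = 1e-6
num = np.zeros_like(w)
for j in range(w.shape[0]):
    for k in range(w.shape[1]):
        step = np.zeros_like(w)
        step[j, k] = eps
        j_plus, _ = cost_and_grad(w + step, c, X, Y)
        j_minus, _ = cost_and_grad(w - step, c, X, Y)
        num[j, k] = (j_plus - j_minus) / (2 * eps)

max_abs_diff = np.max(np.abs(num - dw))
print(max_abs_diff)
```

A very small `max_abs_diff` indicates that the analytic gradient and the cost function agree with each other.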

def optimization(w, c, X, Y, learning_rate=0.1, iterations=1000, print_info=False):

    '''

    Gradient descent: update w in place for `iterations` steps

    '''

    costs = []

    for i in range(iterations):

        update = propagation(w, c, X, Y)

        w -= learning_rate * update['dw']

        if i %100==0:

            costs.append(update['cost'])

        if i%100==0 and print_info==True:

            print("Iteration " + str(i+1) + " Cost = " + str(update['cost']))

    results = {'w':w, 'costs': costs}

    return results

def predict(w, X):

    '''

    Predict class probabilities

    '''

    return softmax(np.dot(X, w))

def accuracy(y_hat, Y):

    '''

    Compute classification accuracy

    '''

    max_index = np.argmax(y_hat, axis=1)

    y_hat[np.arange(y_hat.shape[0]), max_index] = 1

    accuracy = np.sum(np.argmax(y_hat, axis=1)==np.argmax(Y, axis=1))

    accuracy = accuracy *1.0/Y.shape[0]

    return accuracy
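Note that `accuracy` above writes 1 into `y_hat` at each row's maximum, so the caller's array is changed; this is why the `y_hat` printed later in this post contains entries equal to 1.00000000e+00. If the probabilities should stay untouched, a variant that only compares `argmax` results is enough. The name `accuracy_no_mutation` and the small arrays are assumptions for illustration.

```python
import numpy as np

def accuracy_no_mutation(y_hat, Y):
    """Fraction of rows whose predicted class matches the one-hot label."""
    predicted = np.argmax(y_hat, axis=1)
    actual = np.argmax(Y, axis=1)
    return float(np.mean(predicted == actual))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.4, 0.5, 0.1]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0]])
print(accuracy_no_mutation(probs, labels))  # two of the three rows match
```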

def model(w, c, X, Y, learning_rate=0.1, iterations=1000, print_info=False):

    results = optimization(w, c, X, Y, learning_rate, iterations, print_info)

    w = results['w']

    costs = results['costs']

    y_hat = predict(w, X)

    acc = accuracy(y_hat, Y)

    print("After %d iterations,the total accuracy is %f"%(iterations, acc))

    results = {

        'w':w,

        'costs':costs,

        'accuracy':acc,

        'iterations':iterations,

        'learning_rate':learning_rate,

        'y_hat':y_hat,

        'c':c

    }

    return results

4 - Validate the model

w = init_params(x_train_batch.shape[1], y_train_batch.shape[1])

c = 0

results_train = model(w, c, x_train_batch, y_train_batch, learning_rate=0.3, iterations=1000, print_info=True)

print(results_train)

Iteration 1 Cost = 2.30258509299

Iteration 101 Cost = 0.444039646187

Iteration 201 Cost = 0.383446527394

Iteration 301 Cost = 0.357022940232

Iteration 401 Cost = 0.341184601147

Iteration 501 Cost = 0.330260258921

Iteration 601 Cost = 0.322097106964

Iteration 701 Cost = 0.315671301537

Iteration 801 Cost = 0.310423971361

Iteration 901 Cost = 0.306020145234

After 1000 iterations,the total accuracy is 0.915800

{'w': array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],

       [ 0.,  0.,  0., ...,  0.,  0.,  0.],

       [ 0.,  0.,  0., ...,  0.,  0.,  0.],

       ...,

       [ 0.,  0.,  0., ...,  0.,  0.,  0.],

       [ 0.,  0.,  0., ...,  0.,  0.,  0.],

       [ 0.,  0.,  0., ...,  0.,  0.,  0.]]), 'costs': [2.302585092994045, 0.44403964618714781, 0.38344652739376933, 0.35702294023246306, 0.34118460114650634, 0.33026025892089478, 0.32209710696427363, 0.31567130153696982, 0.31042397136133199, 0.30602014523405535], 'accuracy': 0.91579999999999995, 'iterations': 1000, 'learning_rate': 0.3, 'y_hat': array([[  1.15531353e-03,   1.72628369e-09,   2.24683134e-03, ...,

          4.06392375e-08,   1.19337142e-04,   2.07493343e-06],

       [  1.41786837e-01,   1.11756123e-03,   2.79188805e-02, ...,

          6.80002693e-03,   1.00000000e+00,   1.25721652e-01],

       [  9.52758112e-05,   1.41141596e-06,   2.04835561e-03, ...,

          1.21014773e-04,   2.50044218e-02,   1.00000000e+00],

       ...,

       [  1.79945865e-07,   6.74560778e-05,   1.53151951e-05, ...,

          2.44907396e-05,   1.71333912e-04,   1.08085629e-02],

       [  2.59724603e-05,   6.36785472e-10,   1.00000000e+00, ...,

          2.70273729e-08,   2.10287536e-06,   2.48876734e-08],

       [  1.00000000e+00,   9.96462215e-15,   5.55562364e-08, ...,

          2.01973615e-08,   1.57821049e-07,   3.37994451e-09]]), 'c': 0}
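The first cost printed above, 2.30258509299, is ln 10. With all weights at zero every logit is 0, so softmax assigns probability 1/10 to each of the 10 classes and the cross-entropy of every sample is -log(1/10). The short check below reproduces that value; the five labels are arbitrary and only serve the example.

```python
import numpy as np

num_classes = 10
# With w = 0, softmax(X @ w) is uniform regardless of X.
A = np.full((5, num_classes), 1.0 / num_classes)
Y = np.eye(num_classes)[[3, 1, 4, 1, 5]]      # arbitrary one-hot labels
initial_cost = -np.sum(Y * np.log(A)) / Y.shape[0]
print(initial_cost, np.log(num_classes))
```

Seeing this value at iteration 1 is a useful sanity check that the data, labels, and cost function are wired up as expected.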

plt.plot(results_train['costs'])

[<matplotlib.lines.Line2D at 0x283b1d75ef0>]

params = [[0, 0.3],[0,0.5],[5,0.3],[5,0.5]]

results_cv = {}

for i in range(len(params)):

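    # Note: params[i] is not used here; each round keeps training the same w in place with c=0 and learning_rate=0.5

    result = model(results_train['w'],0, x_cv_batch, y_cv_batch, learning_rate=0.5, iterations=1000, print_info=False)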

    print("{0} iteration done!".format(i))

    results_cv[i] = result

After 1000 iterations,the total accuracy is 0.931333

0 iteration done!

After 1000 iterations,the total accuracy is 0.936867

1 iteration done!

After 1000 iterations,the total accuracy is 0.940200

2 iteration done!

After 1000 iterations,the total accuracy is 0.942200

3 iteration done!

for i in range(len(params)):

    print("{0} iteration accuracy: {1} ".format(i+1, results_cv[i]['accuracy']))

for i in range(len(params)):

    plt.subplot(len(params), 1,i+1)

    plt.plot(results_cv[i]['costs'])

1 iteration accuracy: 0.9313333333333333

2 iteration accuracy: 0.9368666666666666

3 iteration accuracy: 0.9402

4 iteration accuracy: 0.9422
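The loop above always calls `model` with c=0 and learning_rate=0.5 and keeps updating the already trained weights, so the four accuracies reflect additional training rather than a comparison of the `params` entries. A comparison of settings would start from fresh weights for each (c, learning_rate) pair, fit on the training subset, and score on the validation subset. The sketch below shows that pattern on small synthetic data; `train`, `score`, the data, and the regularization values are all assumptions chosen for the example, not the original experiment.

```python
import numpy as np

def softmax_rows(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train(X, Y, c, learning_rate, iterations=200):
    """Softmax regression by gradient descent, starting from zero weights."""
    w = np.zeros((X.shape[1], Y.shape[1]))
    m = X.shape[0]
    for _ in range(iterations):
        A = softmax_rows(X @ w)
        w -= learning_rate * (-(X.T @ (Y - A)) / m + c * w)
    return w

def score(w, X, Y):
    return float(np.mean(np.argmax(X @ w, axis=1) == np.argmax(Y, axis=1)))

# Synthetic 3-class data: one Gaussian blob per class.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0]])
labels = rng.integers(0, 3, size=300)
X = centers[labels] + rng.normal(scale=0.5, size=(300, 2))
Y = np.eye(3)[labels]
X_train, Y_train, X_cv, Y_cv = X[:200], Y[:200], X[200:], Y[200:]

params = [[0, 0.3], [0, 0.5], [0.01, 0.3], [0.01, 0.5]]   # [c, learning_rate]
cv_scores = {}
for c, lr in params:
    w = train(X_train, Y_train, c, lr)         # fresh weights for every setting
    cv_scores[(c, lr)] = score(w, X_cv, Y_cv)  # evaluate on held-out data only
print(cv_scores)
```

Keeping training and scoring on separate subsets is what lets the validation accuracy say something about the hyperparameters rather than about extra fitting.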

Check accuracy on the test subset

y_hat_test = predict(w, x_test_batch)

accu = accuracy(y_hat_test, y_test_batch)

print(accu)

0.9111

5 - Test on real handwritten digits

Load the previously saved weights

# w = results_cv[3]['w']

# np.save('weights.npy',w)

w = np.load('weights.npy')

w.shape

(784, 10)

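For code that converts an image to a txt file, see "Converting an image to a readable file in Python"

# The images have already been converted to txt format

files = ['3.txt','31.txt','5.txt','8.txt','9.txt','6.txt','91.txt']

# Convert the txt data to an np.array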

def pic2np(file):

    with open(file, 'r') as f:

        x = f.readlines()

        data = []

        for i in range(len(x)):

            x[i] = x[i].split('\n')[0]

            for j in range(len(x[0])):

                data.append(int(x[i][j]))

        data = np.array(data)

        return data.reshape(-1,784)
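Judging from `pic2np`, each txt file is expected to hold 28 lines of 28 single-digit characters, one character per pixel. That layout is inferred from the code, not documented in the post. Under that assumption, a slightly more defensive parser can work on any iterable of lines and fail early when the pixel count is wrong; `lines_to_row` and the demo lines are hypothetical.

```python
import numpy as np

def lines_to_row(lines, side=28):
    """Parse lines of single-digit characters into a (1, side*side) array."""
    rows = [line.strip() for line in lines if line.strip()]   # skip blank lines
    digits = [int(ch) for row in rows for ch in row]
    data = np.array(digits, dtype=float)
    if data.size != side * side:
        raise ValueError("expected %d pixels, got %d" % (side * side, data.size))
    return data.reshape(1, side * side)

# Demo input: an 8x8 block of ones in the middle of a 28x28 grid of zeros.
blank = "0" * 28 + "\n"
band = "0" * 10 + "1" * 8 + "0" * 10 + "\n"
demo_lines = [blank] * 10 + [band] * 8 + [blank] * 10
x = lines_to_row(demo_lines)
print(x.shape, int(x.sum()))
```

With a real file, the same function could be called as `lines_to_row(open(path))`, since a file object iterates over its lines.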

# Check accuracy

i = 1

count = 0

for file in files:

    x = pic2np(file)

    y = np.argmax(predict(w, x))

    print("actual {0} - predicted {1}".format( int(file.split('.')[0][0]) , y) )

    if y == int(file.split('.')[0][0]):

        count += 1

    plt.subplot(2, len(files), i)

    plt.imshow(x.reshape(28,28))

    i += 1

print("Accuracy: {0}".format(count/len(files)))

actual 3 - predicted 6

actual 3 - predicted 3

actual 5 - predicted 3

actual 8 - predicted 3

actual 9 - predicted 3

actual 6 - predicted 6

actual 9 - predicted 7

Accuracy: 0.2857142857142857

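As the results above show, my own handwriting is apparently rather idiosyncratic: only 2 of the 7 digits were recognized correctly. The algorithm clearly still needs improvement.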

6 - Deriving the softmax gradient descent update

Derivation of the gradient of the softmax loss function
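For convenience, the main steps can be summarized as follows. This is a standard derivation written to match the `dw` line in `propagation`; it assumes one-hot labels (so \(\sum_k y_k = 1\)), one sample per row of \(X\), no bias term, and regularization coefficient \(c\).

For one sample with logits \(z = xw\), outputs \(a = softmax(z)\) and loss \(L = -\sum_k y_k \log a_k\), the softmax derivative is

\[\frac{\partial a_k}{\partial z_j} = a_k(\delta_{kj} - a_j)
\]

so that

\[\frac{\partial L}{\partial z_j} = -\sum_k \frac{y_k}{a_k}\, a_k(\delta_{kj} - a_j) = a_j \sum_k y_k - y_j = a_j - y_j
\]

Since \(z_j = \sum_i x_i w_{ij}\), the chain rule gives \(\partial L / \partial w_{ij} = x_i (a_j - y_j)\). Averaging over the \(m\) samples and adding the derivative of the regularization term yields

\[\frac{\partial J}{\partial w} = -\frac{1}{m} X^T (Y - A) + c\,w
\]

and the gradient descent update with learning rate \(\alpha\) is \(w := w - \alpha\, \partial J / \partial w\).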
