1 Overview

For the underlying theory, see the earlier post 线性SVM与Softmax分类器 (Linear SVM and Softmax Classifiers).

Implementation environment: Python 3.

2 Data Preprocessing

2.1 Loading the Data

Place the raw dataset under the "data/cifar10/" folder.
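
If the dataset is not on disk yet, a minimal sketch along these lines can fetch and unpack it (my addition, not part of the original walkthrough; it assumes the official python-version tarball, which extracts into a cifar-10-batches-py directory that we then rename):

    import os
    import tarfile
    import urllib.request

    # Download and unpack CIFAR-10 into data/cifar10/ if it is missing
    if not os.path.exists('data/cifar10'):
        os.makedirs('data', exist_ok=True)
        url = 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
        urllib.request.urlretrieve(url, 'data/cifar-10-python.tar.gz')
        with tarfile.open('data/cifar-10-python.tar.gz') as tar:
            tar.extractall('data')   # extracts to data/cifar-10-batches-py
        os.rename('data/cifar-10-batches-py', 'data/cifar10')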

    ### Load the CIFAR-10 dataset
    import os
    import pickle
    import random
    import numpy as np
    import matplotlib.pyplot as plt

    def load_CIFAR_batch(filename):
        """
        CIFAR-10 is stored batch by batch; this loads a single batch.
        @param filename: name of the CIFAR batch file
        @return: X, Y: the data and labels of the batch
        """
        with open(filename, 'rb') as f:
            datadict = pickle.load(f, encoding='bytes')
        X = datadict[b'data']
        Y = datadict[b'labels']
        X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
        Y = np.array(Y)
        return X, Y

    def load_CIFAR10(ROOT):
        """
        Load the whole CIFAR-10 dataset.
        @param ROOT: root directory of the dataset
        @return: X_train, Y_train: training data and labels
                 X_test, Y_test: test data and labels
        """
        xs = []
        ys = []
        for b in range(1, 6):
            f = os.path.join(ROOT, "data_batch_%d" % (b, ))
            X, Y = load_CIFAR_batch(f)
            xs.append(X)
            ys.append(Y)
        X_train = np.concatenate(xs)
        Y_train = np.concatenate(ys)
        del X, Y
        X_test, Y_test = load_CIFAR_batch(os.path.join(ROOT, "test_batch"))
        return X_train, Y_train, X_test, Y_test

    X_train, y_train, X_test, y_test = load_CIFAR10('data/cifar10/')
    print(X_train.shape)
    print(y_train.shape)
    print(X_test.shape)
    print(y_test.shape)

The output is as follows:

    (50000, 32, 32, 3)
    (50000,)
    (10000, 32, 32, 3)
    (10000,)
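
The random and matplotlib.pyplot imports above are not used by the loader itself; as an optional sanity check (a sketch I am adding, assuming the standard CIFAR-10 class ordering), you can display a few random training images per class before the data is flattened:

    classes = ['plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
    samples_per_class = 5
    for cls_idx, cls_name in enumerate(classes):
        # pick a few random examples of this class
        idxs = np.flatnonzero(y_train == cls_idx)
        idxs = random.sample(list(idxs), samples_per_class)
        for i, idx in enumerate(idxs):
            plt_idx = i * len(classes) + cls_idx + 1
            plt.subplot(samples_per_class, len(classes), plt_idx)
            plt.imshow(X_train[idx].astype('uint8'))
            plt.axis('off')
            if i == 0:
                plt.title(cls_name)
    plt.show()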

2.2 Splitting the Dataset

Split the loaded dataset into training, validation, and test sets, plus a small development set used for hyperparameter tuning.

    # Split into training, validation, test, and development sets
    num_train = 49000
    num_val = 1000
    num_test = 1000
    num_dev = 500   # a small subset of the training data, used for hyperparameter tuning
    # Validation set
    mask = range(num_train, num_train + num_val)
    X_val = X_train[mask]
    y_val = y_train[mask]
    # Training set
    mask = range(num_train)
    X_train = X_train[mask]
    y_train = y_train[mask]
    # Test set
    mask = range(num_test)
    X_test = X_test[mask]
    y_test = y_test[mask]
    # Development set
    mask = np.random.choice(num_train, num_dev, replace=False)
    X_dev = X_train[mask]
    y_dev = y_train[mask]
    # Reshape the image data into rows
    X_train = np.reshape(X_train, (X_train.shape[0], -1))
    X_val = np.reshape(X_val, (X_val.shape[0], -1))
    X_test = np.reshape(X_test, (X_test.shape[0], -1))
    X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))
    print('Train data shape: ', X_train.shape)
    print('Validation data shape: ', X_val.shape)
    print('Test data shape: ', X_test.shape)
    print('Development data shape: ', X_dev.shape)

The output is as follows:

    Train data shape:  (49000, 3072)
    Validation data shape:  (1000, 3072)
    Test data shape:  (1000, 3072)
    Development data shape:  (500, 3072)

2.3 Normalization

Normalize the splits by subtracting the mean image, computed over the training set, from every split.

    # Preprocessing: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train -= mean_image
    X_val -= mean_image
    X_test -= mean_image
    X_dev -= mean_image
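
As a quick check (my addition), the training split should now have numerically zero mean, and the mean image itself can be inspected by reshaping it back to 32×32×3:

    # The training data is now zero-centered (up to floating-point error)
    print(np.abs(np.mean(X_train, axis=0)).max())   # prints a value close to 0
    # Visualize the mean image
    plt.imshow(mean_image.reshape(32, 32, 3).astype('uint8'))
    plt.show()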

3 Linear Softmax Classifier

3.1 Defining the Linear Softmax Classifier
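
The vectorized loss below implements the standard softmax (cross-entropy) objective with L2 regularization. Writing the scores as $S = XW$ and the normalized probabilities as $p_{ij} = e^{s_{ij}} / \sum_k e^{s_{ik}}$, the loss and gradient that the code computes are

$$L = -\frac{1}{N}\sum_{i=1}^{N} \log p_{i,\,y_i} + \frac{\lambda}{2}\lVert W\rVert^2, \qquad \frac{\partial L}{\partial W} = \frac{1}{N}\, X^{\top}(P - Y) + \lambda W,$$

where $Y$ is the one-hot matrix of labels and $\lambda$ is reg. Subtracting the row-wise maximum from the scores before exponentiating changes nothing mathematically; it only prevents numerical overflow.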

    # Define a linear Softmax classifier
    class Softmax(object):
        def __init__(self):
            self.W = None

        def loss_vectorized(self, X, y, reg):
            """
            Softmax loss function, vectorized implementation (no explicit loops).
            Inputs:
            - X: A numpy array of shape (num_train, D) containing the training data,
              consisting of num_train samples each of dimension D
            - y: A numpy array of shape (num_train,) containing the training labels,
              where y[i] is the label of X[i]
            - reg: (float) regularization strength
            Returns:
            - loss: the loss value between the predictions and the ground truth
            - dW: gradient of the loss with respect to W
            """
            # Initialize loss and dW
            loss = 0.0
            dW = np.zeros(self.W.shape)
            # Compute the loss and dW
            num_train = X.shape[0]
            num_classes = self.W.shape[1]
            # loss
            scores = np.dot(X, self.W)
            scores -= np.max(scores, axis=1).reshape(-1, 1)  # for numerical stability
            softmax_output = np.exp(scores) / np.sum(np.exp(scores), axis=1).reshape(-1, 1)
            loss = np.sum(-np.log(softmax_output[range(softmax_output.shape[0]), list(y)]))
            loss /= num_train
            loss += 0.5 * reg * np.sum(self.W * self.W)
            # dW
            dS = softmax_output
            dS[range(dS.shape[0]), list(y)] += -1
            dW = np.dot(X.T, dS)
            dW /= num_train
            dW += reg * self.W
            return loss, dW

        def train(self, X, y, learning_rate=1e-3, reg=1e-5, num_iters=100,
                  batch_size=200, print_flag=False):
            """
            Train the Softmax classifier using SGD.
            Inputs:
            - X: A numpy array of shape (num_train, D) containing the training data,
              consisting of num_train samples each of dimension D
            - y: A numpy array of shape (num_train,) containing the training labels,
              where y[i] is the label of X[i], y[i] = c, 0 <= c < C
            - learning_rate: (float) learning rate for optimization
            - reg: (float) regularization strength
            - num_iters: (integer) number of steps to take during optimization
            - batch_size: (integer) number of training examples to use at each step
            - print_flag: (boolean) if True, print the progress during optimization
            Outputs:
            - loss_history: a list containing the loss at each training iteration
            """
            loss_history = []
            num_train = X.shape[0]
            dim = X.shape[1]
            num_classes = np.max(y) + 1
            # Initialize W
            if self.W is None:
                self.W = 0.001 * np.random.randn(dim, num_classes)
            # Iterate and optimize
            for t in range(num_iters):
                idx_batch = np.random.choice(num_train, batch_size, replace=True)
                X_batch = X[idx_batch]
                y_batch = y[idx_batch]
                loss, dW = self.loss_vectorized(X_batch, y_batch, reg)
                loss_history.append(loss)
                self.W += -learning_rate * dW
                if print_flag and t % 100 == 0:
                    print('iteration %d / %d: loss %f' % (t, num_iters, loss))
            return loss_history

        def predict(self, X):
            """
            Use the trained weights to predict labels for data.
            Inputs:
            - X: A numpy array of shape (num_test, D) containing the data to classify
            Outputs:
            - y_pred: A numpy array, the predicted labels for the data in X
            """
            y_pred = np.zeros(X.shape[0])
            scores = np.dot(X, self.W)
            y_pred = np.argmax(scores, axis=1)
            return y_pred
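
Before training on the full data, a numerical gradient check on the small development set is a cheap way to verify loss_vectorized (a sketch I am adding; the relative errors should come out around 1e-8 or smaller):

    # Numerically check the analytic gradient on a few random entries of W
    check = Softmax()
    check.W = 0.001 * np.random.randn(X_dev.shape[1], 10)
    _, dW = check.loss_vectorized(X_dev, y_dev, reg=0.0)
    h = 1e-5
    for _ in range(5):
        i = np.random.randint(check.W.shape[0])
        j = np.random.randint(check.W.shape[1])
        old = check.W[i, j]
        check.W[i, j] = old + h
        loss_plus, _ = check.loss_vectorized(X_dev, y_dev, reg=0.0)
        check.W[i, j] = old - h
        loss_minus, _ = check.loss_vectorized(X_dev, y_dev, reg=0.0)
        check.W[i, j] = old
        grad_num = (loss_plus - loss_minus) / (2 * h)
        rel_err = abs(grad_num - dW[i, j]) / max(abs(grad_num) + abs(dW[i, j]), 1e-12)
        print('numerical: %f analytic: %f relative error: %e' % (grad_num, dW[i, j], rel_err))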

3.2 Without Cross-Validation

3.2.1 Training the Model

    # Train
    softmax = Softmax()
    loss_history = softmax.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,
                                 num_iters=1500, batch_size=200, print_flag=True)

The output is as follows:

    iteration 0 / 1500: loss 386.819945
    iteration 100 / 1500: loss 233.345487
    iteration 200 / 1500: loss 141.912560
    iteration 300 / 1500: loss 86.616391
    iteration 400 / 1500: loss 53.114667
    iteration 500 / 1500: loss 32.912990
    iteration 600 / 1500: loss 20.637937
    iteration 700 / 1500: loss 13.341617
    iteration 800 / 1500: loss 8.934886
    iteration 900 / 1500: loss 6.200619
    iteration 1000 / 1500: loss 4.516009
    iteration 1100 / 1500: loss 3.514955
    iteration 1200 / 1500: loss 2.883086
    iteration 1300 / 1500: loss 2.538239
    iteration 1400 / 1500: loss 2.365773
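
Since train returns the loss at every iteration, plotting loss_history (my addition) gives a quick picture of convergence:

    plt.plot(loss_history)
    plt.xlabel('Iteration number')
    plt.ylabel('Loss value')
    plt.show()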

3.2.2 Prediction

    # Training set
    y_pred = softmax.predict(X_train)
    num_correct = np.sum(y_pred == y_train)
    accuracy = np.mean(y_pred == y_train)
    print('Training correct %d/%d: The accuracy is %f' % (num_correct, X_train.shape[0], accuracy))
    # Test set
    y_pred = softmax.predict(X_test)
    num_correct = np.sum(y_pred == y_test)
    accuracy = np.mean(y_pred == y_test)
    print('Test correct %d/%d: The accuracy is %f' % (num_correct, X_test.shape[0], accuracy))

The output is as follows:

    Training correct 17246/49000: The accuracy is 0.351959
    Test correct 358/1000: The accuracy is 0.358000
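
Each column of the learned W can be reshaped back into a 32×32×3 image to see the "template" the linear classifier has learned for each class (a sketch I am adding, again assuming the standard class ordering):

    # Visualize the learned weights as per-class templates
    classes = ['plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
    W = softmax.W.reshape(32, 32, 3, 10)
    w_min, w_max = np.min(W), np.max(W)
    for i in range(10):
        plt.subplot(2, 5, i + 1)
        # rescale the weights into the displayable range [0, 255]
        wimg = 255.0 * (W[:, :, :, i] - w_min) / (w_max - w_min)
        plt.imshow(wimg.astype('uint8'))
        plt.axis('off')
        plt.title(classes[i])
    plt.show()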

3.3 With Cross-Validation

3.3.1 Training the Model

    learning_rates = [1.4e-7, 1.5e-7, 1.6e-7]
    regularization_strengths = [8000.0, 9000.0, 10000.0, 11000.0, 18000.0, 19000.0, 20000.0, 21000.0]
    results = {}
    best_lr = None
    best_reg = None
    best_val = -1   # the highest validation accuracy seen so far
    best_softmax = None   # the Softmax object that achieved the highest validation accuracy
    for lr in learning_rates:
        for reg in regularization_strengths:
            softmax = Softmax()
            loss_history = softmax.train(X_train, y_train, learning_rate=lr, reg=reg, num_iters=3000)
            y_train_pred = softmax.predict(X_train)
            accuracy_train = np.mean(y_train_pred == y_train)
            y_val_pred = softmax.predict(X_val)
            accuracy_val = np.mean(y_val_pred == y_val)
            results[(lr, reg)] = accuracy_train, accuracy_val
            if accuracy_val > best_val:
                best_lr = lr
                best_reg = reg
                best_val = accuracy_val
                best_softmax = softmax
            print('lr: %e reg: %e train accuracy: %f val accuracy: %f' %
                  (lr, reg, results[(lr, reg)][0], results[(lr, reg)][1]))
    print('Best validation accuracy during cross-validation:\nlr = %e, reg = %e, best_val = %f' %
          (best_lr, best_reg, best_val))

The output is:

    lr: 1.400000e-07 reg: 8.000000e+03 train accuracy: 0.378184 val accuracy: 0.391000
    lr: 1.400000e-07 reg: 9.000000e+03 train accuracy: 0.374714 val accuracy: 0.387000
    lr: 1.400000e-07 reg: 1.000000e+04 train accuracy: 0.376000 val accuracy: 0.391000
    lr: 1.400000e-07 reg: 1.100000e+04 train accuracy: 0.373898 val accuracy: 0.387000
    lr: 1.400000e-07 reg: 1.800000e+04 train accuracy: 0.360347 val accuracy: 0.373000
    lr: 1.400000e-07 reg: 1.900000e+04 train accuracy: 0.354612 val accuracy: 0.379000
    lr: 1.400000e-07 reg: 2.000000e+04 train accuracy: 0.357184 val accuracy: 0.379000
    lr: 1.400000e-07 reg: 2.100000e+04 train accuracy: 0.357061 val accuracy: 0.380000
    lr: 1.500000e-07 reg: 8.000000e+03 train accuracy: 0.378633 val accuracy: 0.397000
    lr: 1.500000e-07 reg: 9.000000e+03 train accuracy: 0.377918 val accuracy: 0.399000
    lr: 1.500000e-07 reg: 1.000000e+04 train accuracy: 0.376347 val accuracy: 0.383000
    lr: 1.500000e-07 reg: 1.100000e+04 train accuracy: 0.374469 val accuracy: 0.391000
    lr: 1.500000e-07 reg: 1.800000e+04 train accuracy: 0.362714 val accuracy: 0.373000
    lr: 1.500000e-07 reg: 1.900000e+04 train accuracy: 0.358633 val accuracy: 0.370000
    lr: 1.500000e-07 reg: 2.000000e+04 train accuracy: 0.358939 val accuracy: 0.373000
    lr: 1.500000e-07 reg: 2.100000e+04 train accuracy: 0.360367 val accuracy: 0.379000
    lr: 1.600000e-07 reg: 8.000000e+03 train accuracy: 0.378143 val accuracy: 0.397000
    lr: 1.600000e-07 reg: 9.000000e+03 train accuracy: 0.372449 val accuracy: 0.386000
    lr: 1.600000e-07 reg: 1.000000e+04 train accuracy: 0.376184 val accuracy: 0.379000
    lr: 1.600000e-07 reg: 1.100000e+04 train accuracy: 0.369776 val accuracy: 0.377000
    lr: 1.600000e-07 reg: 1.800000e+04 train accuracy: 0.359735 val accuracy: 0.378000
    lr: 1.600000e-07 reg: 1.900000e+04 train accuracy: 0.359653 val accuracy: 0.374000
    lr: 1.600000e-07 reg: 2.000000e+04 train accuracy: 0.356041 val accuracy: 0.370000
    lr: 1.600000e-07 reg: 2.100000e+04 train accuracy: 0.353694 val accuracy: 0.370000
    Best validation accuracy during cross-validation:
    lr = 1.500000e-07, reg = 9.000000e+03, best_val = 0.399000
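
The results dictionary can also be visualized (my addition) to see how accuracy varies across the grid; each point is one (learning rate, regularization) pair, colored by validation accuracy:

    # Scatter plot of the hyperparameter search
    keys = sorted(results.keys())
    lr_log = [np.log10(lr) for lr, reg in keys]
    reg_log = [np.log10(reg) for lr, reg in keys]
    colors = [results[k][1] for k in keys]   # validation accuracy
    plt.scatter(lr_log, reg_log, c=colors, cmap=plt.cm.coolwarm)
    plt.colorbar()
    plt.xlabel('log10 learning rate')
    plt.ylabel('log10 regularization strength')
    plt.title('CIFAR-10 validation accuracy')
    plt.show()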

3.3.2 Prediction

    # Use the best softmax classifier to evaluate on the test set
    y_pred = best_softmax.predict(X_test)
    num_correct = np.sum(y_pred == y_test)
    accuracy = np.mean(y_pred == y_test)
    print('Test correct %d/%d: The accuracy is %f' % (num_correct, num_test, accuracy))

The output is as follows:

    Test correct 375/1000: The accuracy is 0.375000

Note: the linear SVM classifier and the linear Softmax classifier differ only in their loss functions!
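
To make that concrete, here is a minimal sketch of the corresponding vectorized hinge loss (my addition, using the same 0.5 * reg convention as above); swapping it in for loss_vectorized turns the classifier into a linear SVM, with everything else unchanged:

    def svm_loss_vectorized(W, X, y, reg):
        """Multiclass SVM (hinge) loss and gradient, vectorized."""
        num_train = X.shape[0]
        scores = np.dot(X, W)
        correct = scores[range(num_train), list(y)].reshape(-1, 1)
        margins = np.maximum(0, scores - correct + 1.0)
        margins[range(num_train), list(y)] = 0
        loss = np.sum(margins) / num_train + 0.5 * reg * np.sum(W * W)
        # Each positive margin contributes +x_i to its class column of dW
        # and -x_i to the correct class's column
        dS = (margins > 0).astype(float)
        dS[range(num_train), list(y)] -= np.sum(dS, axis=1)
        dW = np.dot(X.T, dS) / num_train + reg * W
        return loss, dW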
