Having already completed the SVM assignment, the Softmax assignment is comparatively light work.

Completing this assignment requires familiarity with the following:

cell 1 Set the default plotting parameters

import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

cell 2 Load the data and print the size of each split:

def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000, num_dev=500):
  """
  Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
  it for the linear classifier. These are the same steps as we used for the
  SVM, but condensed to a single function.
  """
  # Load the raw CIFAR-10 data
  cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
  X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

  # subsample the data
  mask = range(num_training, num_training + num_validation)
  X_val = X_train[mask]
  y_val = y_train[mask]
  mask = range(num_training)
  X_train = X_train[mask]
  y_train = y_train[mask]
  mask = range(num_test)
  X_test = X_test[mask]
  y_test = y_test[mask]
  mask = np.random.choice(num_training, num_dev, replace=False)
  X_dev = X_train[mask]
  y_dev = y_train[mask]

  # Preprocessing: reshape the image data into rows
  X_train = np.reshape(X_train, (X_train.shape[0], -1))
  X_val = np.reshape(X_val, (X_val.shape[0], -1))
  X_test = np.reshape(X_test, (X_test.shape[0], -1))
  X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))

  # Normalize the data: subtract the mean image
  mean_image = np.mean(X_train, axis=0)
  X_train -= mean_image
  X_val -= mean_image
  X_test -= mean_image
  X_dev -= mean_image

  # add bias dimension and transform into columns
  X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
  X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])
  X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
  X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])

  return X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev

# Invoke the above function to get our data.
X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev = get_CIFAR10_data()
print 'Train data shape: ', X_train.shape
print 'Train labels shape: ', y_train.shape
print 'Validation data shape: ', X_val.shape
print 'Validation labels shape: ', y_val.shape
print 'Test data shape: ', X_test.shape
print 'Test labels shape: ', y_test.shape
print 'dev data shape: ', X_dev.shape
print 'dev labels shape: ', y_dev.shape

The resulting data dimensions:
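
The original post shows the output as a screenshot; with the default sizes above (49,000 training, 1,000 validation, 1,000 test and 500 dev examples, each image flattened to 3072 pixels plus one bias dimension), the printed shapes work out to:

Train data shape:  (49000, 3073)
Train labels shape:  (49000,)
Validation data shape:  (1000, 3073)
Validation labels shape:  (1000,)
Test data shape:  (1000, 3073)
Test labels shape:  (1000,)
dev data shape:  (500, 3073)
dev labels shape:  (500,)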

cell 3 Implement the Softmax loss function and gradient with for loops:

# First implement the naive softmax loss function with nested loops.
# Open the file cs231n/classifiers/softmax.py and implement the
# softmax_loss_naive function.
from cs231n.classifiers.softmax import softmax_loss_naive
import time

# Generate a random softmax weight matrix and use it to compute the loss.
W = np.random.randn(3073, 10) * 0.0001
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)

# As a rough sanity check, our loss should be something close to -log(0.1).
print 'loss: %f' % loss
print 'sanity check: %f' % (-np.log(0.1))

The corresponding code in the .py file:

def softmax_loss_naive(W, X, y, reg):
  """
  Softmax loss function, naive implementation (with loops)

  Inputs have dimension D, there are C classes, and we operate on minibatches
  of N examples.

  Inputs:
  - W: A numpy array of shape (D, C) containing weights.
  - X: A numpy array of shape (N, D) containing a minibatch of data.
  - y: A numpy array of shape (N,) containing training labels; y[i] = c means
    that X[i] has label c, where 0 <= c < C.
  - reg: (float) regularization strength

  Returns a tuple of:
  - loss as single float
  - gradient with respect to weights W; an array of same shape as W
  """
  # Initialize the loss and gradient to zero.
  loss = 0.0
  dW = np.zeros_like(W)

  #############################################################################
  # TODO: Compute the softmax loss and its gradient using explicit loops.     #
  # Store the loss in loss and the gradient in dW. If you are not careful     #
  # here, it is easy to run into numeric instability. Don't forget the        #
  # regularization!                                                           #
  #############################################################################
  num_classes = W.shape[1]
  num_train = X.shape[0]
  buf_e = np.zeros(num_classes)
  for i in xrange(num_train):
    # scores for example i: (1, 3073) dot (3073, 1) gives one score per class
    for j in xrange(num_classes):
      buf_e[j] = np.dot(X[i, :], W[:, j])
    # shift by the max score for numeric stability, then take the softmax
    buf_e -= np.max(buf_e)
    buf_e = np.exp(buf_e)
    buf_sum = np.sum(buf_e)
    buf = buf_e / buf_sum
    loss -= np.log(buf[y[i]])
    for j in xrange(num_classes):
      dW[:, j] += (buf[j] - (j == y[i])) * X[i, :].T
  # average over the batch, then add regularization (elementwise product)
  loss /= num_train
  dW /= num_train
  loss += 0.5 * reg * np.sum(W * W)
  dW += reg * W
  #############################################################################
  #                          END OF YOUR CODE                                 #
  #############################################################################

  return loss, dW

The computed result:

The sanity check described in lecture is used here.

Question:
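
The inline question at this point asks (roughly) why we expect the initial loss to be close to -log(0.1). With near-zero random weights every class score is about the same, so softmax assigns each of the 10 classes a probability of roughly 1/10, and the cross-entropy of the correct class is about -log(0.1) = 2.302. A quick sketch of that intuition (illustration only, not part of the assignment code):

# Illustration only: with tiny random weights the scores are all close to
# zero, so softmax gives every class a probability of about 1/10 and the
# cross-entropy loss of the correct class is about -log(0.1).
import numpy as np

scores = np.zeros(10)                          # 10 nearly identical class scores
probs = np.exp(scores) / np.sum(np.exp(scores))
print probs                                    # every entry is 0.1
print -np.log(probs[3])                        # 2.302585..., whatever the label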

cell 4 Check the analytic gradient against a numerically computed gradient:

# Complete the implementation of softmax_loss_naive and implement a (naive)
# version of the gradient that uses nested loops.
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)

# As we did for the SVM, use numeric gradient checking as a debugging tool.
# The numeric gradient should be close to the analytic gradient.
from cs231n.gradient_check import grad_check_sparse
f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_numerical = grad_check_sparse(f, W, grad, 10)

# similar to SVM case, do another gradient check with regularization
loss, grad = softmax_loss_naive(W, X_dev, y_dev, 1e2)
f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 1e2)[0]
grad_numerical = grad_check_sparse(f, W, grad, 10)

The computed result:
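
For reference, grad_check_sparse compares the analytic gradient with a centered-difference estimate at a handful of randomly chosen entries of W and prints the relative error. A minimal sketch of that idea (my own reimplementation for illustration, not the course's gradient_check.py):

# Minimal sketch of a sparse numerical gradient check (same idea as
# cs231n.gradient_check.grad_check_sparse, but not the course's exact code).
import numpy as np

def check_grad_sparse(f, W, analytic_grad, num_checks=10, h=1e-5):
    for _ in xrange(num_checks):
        ix = tuple([np.random.randint(m) for m in W.shape])  # random entry of W
        oldval = W[ix]
        W[ix] = oldval + h
        fxph = f(W)                      # loss with W[ix] nudged up
        W[ix] = oldval - h
        fxmh = f(W)                      # loss with W[ix] nudged down
        W[ix] = oldval                   # restore the entry
        grad_numerical = (fxph - fxmh) / (2 * h)
        grad_analytic = analytic_grad[ix]
        rel_error = abs(grad_numerical - grad_analytic) / \
                    (abs(grad_numerical) + abs(grad_analytic))
        print 'numerical: %f analytic: %f, relative error: %e' % (
            grad_numerical, grad_analytic, rel_error)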

cell 5 Implement the loss function and gradient in vectorized form, and compare it with the loop version:

# Now that we have a naive implementation of the softmax loss function and its gradient,
# implement a vectorized version in softmax_loss_vectorized.
# The two versions should compute the same results, but the vectorized version should be
# much faster.
tic = time.time()
loss_naive, grad_naive = softmax_loss_naive(W, X_dev, y_dev, 0.00001)
toc = time.time()
print 'naive loss: %e computed in %fs' % (loss_naive, toc - tic)

from cs231n.classifiers.softmax import softmax_loss_vectorized
tic = time.time()
loss_vectorized, grad_vectorized = softmax_loss_vectorized(W, X_dev, y_dev, 0.00001)
toc = time.time()
print 'vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic)

# As we did for the SVM, we use the Frobenius norm to compare the two versions
# of the gradient.
grad_difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')
print 'Loss difference: %f' % np.abs(loss_naive - loss_vectorized)
print 'Gradient difference: %f' % grad_difference

The comparison results:

The vectorized implementation:

def softmax_loss_vectorized(W, X, y, reg):
  """
  Softmax loss function, vectorized version.

  Inputs and outputs are the same as softmax_loss_naive.
  """
  # Initialize the loss and gradient to zero.
  loss = 0.0
  dW = np.zeros_like(W)
  num_classes = W.shape[1]
  num_train = X.shape[0]
  #############################################################################
  # TODO: Compute the softmax loss and its gradient using no explicit loops.  #
  # Store the loss in loss and the gradient in dW. If you are not careful     #
  # here, it is easy to run into numeric instability. Don't forget the        #
  # regularization!                                                           #
  #############################################################################
  # scores: (500, 3073) dot (3073, 10) -> (500, 10)
  buf_e = np.dot(X, W)
  # subtract each row's max score for numeric stability
  buf_e = np.subtract(buf_e.T, np.max(buf_e, axis=1)).T
  buf_e = np.exp(buf_e)
  # normalize each row so it sums to one (softmax probabilities)
  buf_e = np.divide(buf_e.T, np.sum(buf_e, axis=1)).T
  # loss: sum of negative log-probabilities of the correct classes
  loss = -np.sum(np.log(buf_e[np.arange(num_train), y]))
  # gradient: subtract 1 from the probability of the correct class
  buf_e[np.arange(num_train), y] -= 1
  # average over the batch and add regularization
  loss = loss / num_train + 0.5 * reg * np.sum(W * W)
  # (3073, 500) dot (500, 10) -> (3073, 10)
  dW = np.dot(X.T, buf_e) / num_train + reg * W
  #############################################################################
  #                          END OF YOUR CODE                                 #
  #############################################################################

  return loss, dW
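
The step both versions share is the softmax gradient identity dL_i/ds_j = p_j - 1(j == y_i): that is exactly what buf[j] - (j == y[i]) computes in the loop version and what buf_e[np.arange(num_train), y] -= 1 does in the vectorized one. A small self-contained check of that identity on a single example (illustration only, independent of the assignment files):

# Illustration only: numerically verify dL/ds_j = p_j - 1{j == y} for the
# cross-entropy loss of a single example with 10 class scores.
import numpy as np

def xent(s, y):
    # cross-entropy loss of one score vector s with correct class y
    p = np.exp(s - np.max(s))
    p /= np.sum(p)
    return -np.log(p[y])

s = np.random.randn(10)        # random class scores
y = 3                          # arbitrary correct class
p = np.exp(s - np.max(s))
p /= np.sum(p)
analytic = p.copy()
analytic[y] -= 1               # p_j - 1{j == y}

h = 1e-6
numeric = np.zeros(10)
for j in xrange(10):
    sp, sm = s.copy(), s.copy()
    sp[j] += h
    sm[j] -= h
    numeric[j] = (xent(sp, y) - xent(sm, y)) / (2 * h)

print np.max(np.abs(numeric - analytic))   # should be roughly 1e-9 or smaller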

cell 6 Use the training and validation sets to select hyperparameters:

# Use the validation set to tune hyperparameters (regularization strength and
# learning rate). You should experiment with different ranges for the learning
# rates and regularization strengths; if you are careful you should be able to
# get a classification accuracy of over 0.35 on the validation set.
from cs231n.classifiers import Softmax
results = {}
best_val = -1
best_softmax = None
learning_rates = np.logspace(-10, 10, 10)          # [1e-7, 2e-7, 3e-7, 4e-7, 5e-7]
regularization_strengths = np.logspace(-3, 6, 10)  # [1e4, 5e4, 1e5, 5e5, 1e6, 5e6, 1e7, 5e7, 1e8]

################################################################################
# TODO:                                                                        #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save    #
# the best trained softmax classifer in best_softmax.                          #
################################################################################
iters = 1500
for lr in learning_rates:
    for rs in regularization_strengths:
        softmax = Softmax()
        softmax.train(X_train, y_train, learning_rate=lr, reg=rs, num_iters=iters)
        y_train_pred = softmax.predict(X_train)
        accu_train = np.mean(y_train == y_train_pred)
        y_val_pred = softmax.predict(X_val)
        accu_val = np.mean(y_val == y_val_pred)
        results[(lr, rs)] = (accu_train, accu_val)
        if best_val < accu_val:
            best_val = accu_val
            best_softmax = softmax
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print 'lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy)

print 'best validation accuracy achieved during cross-validation: %f' % best_val

This gives reasonably good results:

cell 7 Take the model with the best hyperparameters, evaluate it on the test set, and compute the accuracy:

# evaluate on test set
# Evaluate the best softmax on test set
y_test_pred = best_softmax.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print 'softmax on raw pixels final test set accuracy: %f' % (test_accuracy, )

Result: 0.378

cell 8 Visualize the learned W values:

# Visualize the learned weights for each class
w = best_softmax.W[:-1, :]  # strip out the bias
w = w.reshape(32, 32, 3, 10)
w_min, w_max = np.min(w), np.max(w)
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for i in xrange(10):
    plt.subplot(2, 5, i + 1)
    # Rescale the weights to be between 0 and 255
    wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)
    plt.imshow(wimg.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])

Result:

Note: the hyperparameter search for softmax and for the SVM uses a shared class, with each classifier supplying its own corresponding loss method; a sketch of how that file is organized follows.
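
The course file itself is not reproduced in this post, so the following is only a rough structural sketch (reconstructed from memory of the assignment skeleton, with the method bodies elided; treat the details as assumptions rather than a verbatim copy): a LinearClassifier base class owns train() (minibatch SGD) and predict(), while LinearSVM and Softmax override only loss().

# Rough sketch of cs231n/classifiers/linear_classifier.py (an assumption, not
# a verbatim copy): the SVM and Softmax classifiers share train()/predict()
# and differ only in which loss function loss() calls.
import numpy as np
from cs231n.classifiers.linear_svm import svm_loss_vectorized
from cs231n.classifiers.softmax import softmax_loss_vectorized

class LinearClassifier(object):
  def __init__(self):
    self.W = None

  def train(self, X, y, learning_rate=1e-3, reg=1e-5, num_iters=100,
            batch_size=200, verbose=False):
    # minibatch SGD: sample a batch, call self.loss, step along the negative
    # gradient (body omitted in this sketch)
    pass

  def predict(self, X):
    # label of the highest-scoring class for each row of X
    return np.argmax(X.dot(self.W), axis=1)

  def loss(self, X_batch, y_batch, reg):
    # provided by the subclasses below
    raise NotImplementedError

class LinearSVM(LinearClassifier):
  def loss(self, X_batch, y_batch, reg):
    return svm_loss_vectorized(self.W, X_batch, y_batch, reg)

class Softmax(LinearClassifier):
  def loss(self, X_batch, y_batch, reg):
    return softmax_loss_vectorized(self.W, X_batch, y_batch, reg)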

Appendix: QQ group for working through CS231n: 578975100, join verification: DL-CS231n
