Machine Learning Lab1

I plan to implement all six exercises from Professor Andrew Ng's Machine Learning course, one by one, and post them here.



Exercise topic: linear regression




Terminology: the identity matrix is the square matrix with ones on the main diagonal and zeros everywhere else.


Linear regression with one variable


In this part of this exercise, you will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities. You would like to use this data to help you select which city to expand to next.







function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = eye(5);

% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.

% Equivalent explicit loop, left commented out:
% for row = 1:5
%     for col = 1:5
%         if row == col
%             A(row,col) = 1;
%         else
%             A(row,col) = 0;
%         end
%     end
% end

% ===========================================

end


function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure; % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10); % training data as red crosses (call missing from the pasted code)
ylabel('Profit in $10,000s');
xlabel('Population of City in 10,000s');

% ============================================================

end


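This part explains how to compute the cost function.



Note that h here is a vector of the same size as y, while theta is a two-element column vector, initialized as theta = [0; 0] (to stress it again: a column vector).


Suppose an element of theta is too large. Then the predictions overshoot and h - y is greater than 0, so alpha (a positive constant, the learning rate, which sets the step size) times the (h - y) term is a positive number, and subtracting it from theta makes theta smaller.

Likewise, if theta is too small, h - y is less than 0, so a negative number is subtracted and theta grows.

(Note that "larger" and "smaller" here refer to an individual element of theta, not to the whole vector.)

So with enough iterations we arrive at a theta that makes the cost function sufficiently small. In our experiment we ran 1500 iterations, by which point the cost had levelled off near its minimum. (On real data that minimum is small but not exactly 0, because the points do not all lie on one line.)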
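For reference, the quantities discussed above can be written out as follows. This is the standard least-squares formulation that the code in this post computes; the superscript notation $x^{(i)}, y^{(i)}$ for the i-th training example is assumed here and does not appear in the post itself.

```latex
% Hypothesis for one variable (x_0 = 1 is the intercept term)
h_\theta(x) = \theta^{T} x = \theta_0 + \theta_1 x_1

% Cost function, summed form and vectorized form
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^2
          = \frac{1}{2m} (X\theta - y)^{T} (X\theta - y)

% Gradient descent update, applied to every \theta_j simultaneously
\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr) \, x_j^{(i)}

% The same update in vectorized form
\theta := \theta - \frac{\alpha}{m} \, X^{T} (X\theta - y)
```

The vectorized cost matches `(temp'*temp)./(2*m)` with `temp = X*theta - y`, and the vectorized update matches `theta - alpha*(X'*(X*theta - y))./m`.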


function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

% Implementation method one:
% temp = (X*theta-y).^2;
% J = sum(temp(:))./(2*m);

% Implementation method two:
temp = (X*theta - y);
J = (temp'*temp)./(2*m);

% =========================================================================

end


function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    temp = (X'*(X*theta - y))./m; % gradient of the cost with respect to theta
    theta = theta - (alpha*(temp));

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
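As a quick sanity check of the update rule, here is a minimal self-contained sketch that runs the same gradient step on a tiny synthetic data set. The data, the learning rate `alpha = 0.1`, and the iteration count are all assumptions chosen for illustration; they are not the values or the data file used in the exercise.

```octave
% Synthetic data lying exactly on y = 1 + 2*x (assumed, not ex1data1.txt)
x = [1; 2; 3; 4];
y = 1 + 2*x;
m = length(y);

X = [ones(m, 1), x];   % prepend the intercept column of ones
theta = zeros(2, 1);   % column vector, as stressed in the text
alpha = 0.1;           % learning rate (assumed value)
num_iters = 5000;      % iteration count (assumed value)

% Same cost expression as computeCost, written inline
cost = @(t) ((X*t - y)' * (X*t - y)) / (2*m);

J_start = cost(theta);
for iter = 1:num_iters
    grad = (X' * (X*theta - y)) / m;   % same expression as temp in gradientDescent
    theta = theta - alpha * grad;
end
J_end = cost(theta);

fprintf('theta = [%.4f; %.4f]\n', theta(1), theta(2));
fprintf('cost went from %.4f to %.2e\n', J_start, J_end);
```

Because this synthetic data lies exactly on a line, theta converges to the line's intercept and slope and the cost can fall to essentially zero. On real, noisy data the cost levels off at a value above zero instead.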

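OK! In the end we obtain the best-fit theta. The blue line is our result. Roughly speaking, we can conclude that the larger a city's population, the higher the profit.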


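Next, with the optimal theta known, the exercise examines how the cost function varies over a range of theta values: the farther theta is from the optimum, the larger the cost and the worse the fit.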
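The idea of scanning the cost over a range of theta values can be sketched as below. The synthetic data and the grid ranges are assumptions made for this illustration, chosen so that the optimum is known in advance.

```octave
% Synthetic data on the exact line y = 1 + 2*x, so the optimum is theta = [1; 2]
x = [1; 2; 3; 4];
y = 1 + 2*x;
m = length(y);
X = [ones(m, 1), x];

% Grid of candidate parameter values (ranges are assumed)
theta0_vals = linspace(-1, 3, 41);
theta1_vals = linspace(0, 4, 41);

J_vals = zeros(length(theta0_vals), length(theta1_vals));
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i, j) = ((X*t - y)' * (X*t - y)) / (2*m);
    end
end

% Locate the grid point with the smallest cost
[J_min, idx] = min(J_vals(:));
[i_min, j_min] = ind2sub(size(J_vals), idx);
best = [theta0_vals(i_min); theta1_vals(j_min)];

% To visualise the bowl shape: surf(theta0_vals, theta1_vals, J_vals')
```

The smallest cost sits at the grid point closest to the optimum, and every other grid point has a larger cost, which is the behaviour described above.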


Linear regression with multiple variables




The file ex1data2.txt contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house.

The data relates house size, number of bedrooms, and house price; it comes from Portland, Oregon.

x = [2104 3], y = 399900 (the house is 2104 square feet with 3 bedrooms, and its price is 399900; the other rows read the same way)

 x = [1600 3], y = 329900 

 x = [2400 3], y = 369000 

 x = [1416 2], y = 232000 

 x = [3000 4], y = 539900

...   ...


By looking at the values, note that house sizes are about 1000 times the number of bedrooms. When features differ by orders of magnitude, first performing feature scaling can make gradient descent converge much more quickly.


function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%

for temp = 1:size(X,2)   % temp is the index of the current feature column
    mu(temp) = mean(X(:,temp));
    sigma(temp) = std(X(:,temp));
    X_norm(:,temp) = (X_norm(:,temp) - mu(temp))./sigma(temp);
end

% ============================================================

end
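The per-column loop above can also be written without a loop. The sketch below is one possible vectorized variant, not the exercise's reference solution; it assumes an Octave (or MATLAB) version with automatic broadcasting, and it uses only the five sample rows listed earlier rather than the full ex1data2.txt.

```octave
% Five sample rows from the listing above: [size in sq ft, bedrooms]
X = [2104 3; 1600 3; 2400 3; 1416 2; 3000 4];

mu = mean(X);      % 1 x n row vector of column means
sigma = std(X);    % 1 x n row vector of column standard deviations

% Subtract the mean and divide by the standard deviation, column by column.
% This relies on automatic broadcasting of the 1 x n rows over the m rows of X.
X_norm = (X - mu) ./ sigma;

% A new example has to be scaled with the SAME mu and sigma before prediction,
% e.g. the 1650 sq-ft, 3-bedroom house mentioned at the end of this post.
x_new = ([1650 3] - mu) ./ sigma;
```

After scaling, each column of `X_norm` has mean 0 and standard deviation 1, so both features end up on a comparable scale. On older versions without broadcasting, `bsxfun(@minus, X, mu)` and `bsxfun(@rdivide, ..., sigma)` do the same job.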


function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

temp = (X*theta - y);
J = (temp'*temp)./(2*m);

% =========================================================================

end


function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %

    temp = X'*(X*theta - y)./m; % gradient of the cost with respect to theta
    theta = theta - alpha*temp;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end


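Finally, theta can be solved for directly from a formula. This method needs no iteration and still gives the least-squares solution (in essence it is just matrix algebra: start from X*theta = y, multiply both sides by X', and solve for theta).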
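The rearrangement can be spelled out in a few lines. This is the standard derivation of the normal equations, written here under the assumption that $X^{T}X$ is invertible; the notation follows the code, with $X$ the design matrix including the column of ones.

```latex
% Least-squares cost in vectorized form
J(\theta) = \frac{1}{2m} (X\theta - y)^{T} (X\theta - y)

% Set the gradient with respect to \theta to zero
\nabla_\theta J(\theta) = \frac{1}{m} X^{T} (X\theta - y) = 0

% Rearranged: the normal equations
X^{T} X \, \theta = X^{T} y

% Solve for \theta (assuming X^{T} X is invertible)
\theta = (X^{T} X)^{-1} X^{T} y
```

The last line is the expression that `normalEqn` evaluates as `(inv(X'*X))*X'*y`.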

function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%

% ---------------------- Sample Solution ----------------------

theta = (inv(X'*X))*X'*y;

% -------------------------------------------------------------

% ============================================================

end
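A hedged usage sketch of the closed-form solution follows. It runs on only the five sample rows listed earlier, so the resulting theta and price will not match a fit on the full ex1data2.txt. It also uses `pinv` in place of `inv`, a substitution made here for numerical robustness rather than something taken from the code above.

```octave
% Five sample rows from the listing above: [size in sq ft, bedrooms] and price
data_X = [2104 3; 1600 3; 2400 3; 1416 2; 3000 4];
y = [399900; 329900; 369000; 232000; 539900];
m = length(y);

% Add the intercept column. No feature scaling is needed for the normal equations.
X = [ones(m, 1), data_X];

% Closed-form least-squares solution (pinv swapped in for inv)
theta = pinv(X' * X) * X' * y;

% Predict the price of a 1650 sq-ft, 3-bedroom house with the fitted theta
price = [1, 1650, 3] * theta;

fprintf('Predicted price from the 5-row sample: %.2f\n', price);
```

Unlike the gradient descent path, the new example is fed in unscaled here, because theta was fitted on unscaled features.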

Solving with normal equations...

Theta computed from the normal equations: 




Predicted price of a 1650 sq-ft, 3 br house (using normal equations):



