Machine Learning from Scratch: Linear Regression in Practice

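A while ago I took Andrew Ng's Machine Learning course on Coursera. I liked it, and it gave me a basic grounding in the subject.

This time I plan to use the course's programming assignments as the main thread for summarizing the basics of machine learning. I am still a beginner, so if you spot any mistakes, please point them out.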

The original problem statement:

You will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities.

In short, the task is to predict the profit a food truck can earn from the population of the city it operates in.

The dataset looks roughly like this:

[Image: excerpt of the dataset]

Each row is one training example. The first column is the population and the second column is the profit.

First, visualize the data.

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure; % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10); % Plot the data
ylabel('Profit in $10,000s'); % Set the y label
xlabel('Population of City in 10,000s'); % Set the x label

% ============================================================
end

[Figure: scatter plot of the training data. x-axis: Population of City in 10,000s; y-axis: Profit in $10,000s]

Computing the cost function

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

H = X*theta;
diff = H - y;
%J = sum(diff.^2)/(2*m);
J = sum(diff.*diff)/(2*m);

% =========================================================================
end
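As a quick sanity check, the cost can be worked out by hand on a tiny dataset. The sketch below is only an illustration: the three data points are made up (they are not from ex1data1.txt), the anonymous function `cost` is a stand-in that mirrors the body of computeCost, and X is assumed to already carry a leading column of ones, which the two-element theta used later in this post implies.

```octave
% Toy data (hypothetical, not from ex1data1.txt): three points on y = x.
X = [1 1; 1 2; 1 3];   % first column of ones is the intercept term
y = [1; 2; 3];

% Same formula as computeCost: J = sum((X*theta - y).^2) / (2*m)
cost = @(X, y, theta) sum((X*theta - y).^2) / (2*length(y));

J_zero = cost(X, y, [0; 0]);   % every prediction is 0: (1 + 4 + 9) / 6
J_fit  = cost(X, y, [0; 1]);   % the line y = x fits exactly, so the cost is 0

fprintf('J at theta = [0; 0]: %.4f\n', J_zero);
fprintf('J at theta = [0; 1]: %.4f\n', J_fit);
```

With theta = [0; 0] the squared errors are 1, 4 and 9, so J = 14/6; with theta = [0; 1] every prediction matches its target and J is 0.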

To make the code above easier to follow, here is roughly what each variable looks like.

[Image: the variables used in computeCost]
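The shapes of these variables can also be inspected directly in Octave. The following is a minimal sketch on toy data (the values are hypothetical), assuming X has a leading column of ones followed by the single feature; `err` plays the role of the variable called diff in computeCost.

```octave
% Toy stand-in for the exercise data (hypothetical values).
X = [ones(4, 1), [1; 2; 3; 4]];   % m x 2: column of ones, then the feature
y = [2; 4; 6; 8];                  % m x 1: targets
theta = [0; 1];                    % 2 x 1: intercept and slope

H   = X * theta;                   % m x 1: one prediction per example
err = H - y;                       % m x 1: per-example error
J   = sum(err .* err) / (2 * length(y));   % scalar cost

fprintf('X:     %s\n', mat2str(size(X)));
fprintf('theta: %s\n', mat2str(size(theta)));
fprintf('H:     %s\n', mat2str(size(H)));
fprintf('err:   %s\n', mat2str(size(err)));
fprintf('J:     %s\n', mat2str(size(J)));
```

The key point is that X*theta multiplies an m x 2 matrix by a 2 x 1 vector, giving all m predictions in one step.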

Computing the parameters theta with gradient descent

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    H = X*theta-y; % prediction error, computed once so both updates use the old theta
    theta(1) = theta(1) - sum(H.* X(:,1))*alpha/m; % this element-by-element form feels clumsy
    theta(2) = theta(2) - sum(H.* X(:,2))*alpha/m;
    % Vectorized equivalent:
    %theta = theta - alpha * (X' * (X * theta - y)) / m;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
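To see the loop at work without the exercise files, the same update can be run inline on a toy problem. This is a sketch under stated assumptions: the data are made up (points on y = 1 + 2x), and the learning rate and iteration count were picked for this toy data rather than taken from the assignment.

```octave
% Toy data (hypothetical): five points on the line y = 1 + 2x.
x = [1; 2; 3; 4; 5];
y = 1 + 2 * x;
m = length(y);
X = [ones(m, 1), x];           % prepend the intercept column
theta = zeros(2, 1);
alpha = 0.05;                  % learning rate chosen for this toy data
num_iters = 2000;

J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    % One vectorized gradient step, then record the cost.
    theta = theta - alpha * (X' * (X * theta - y)) / m;
    J_history(iter) = sum((X * theta - y).^2) / (2 * m);
end

fprintf('theta = [%.3f; %.3f]\n', theta(1), theta(2));
fprintf('first cost = %.4g, final cost = %.4g\n', J_history(1), J_history(end));
```

Because the toy data lie exactly on a line, theta should approach [1; 2] and the recorded cost should fall toward zero; if the cost grows instead, the learning rate is too large for the data.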

The part that is hard to grasp is the vectorized form: theta = theta - alpha * (X' * (X * theta - y)) / m;

First, look at how theta is actually computed.
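Written out, the batch gradient descent update for the hypothesis $h_\theta(x) = \theta^T x$ takes the form below. The indexing is 1-based to match theta(1) and theta(2) in the code; $x^{(i)}$ denotes the $i$-th row of X written as a column vector, and $x_j^{(i)}$ its $j$-th entry. The second line shows why the sum over examples is just the $j$-th entry of a matrix-vector product, which gives the vectorized update.

```latex
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}, \qquad j = 1, 2

\sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
  = \left[ X^T (X\theta - y) \right]_j
\quad\Longrightarrow\quad
\theta := \theta - \frac{\alpha}{m} \, X^T (X\theta - y)
```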

Then look at what each variable looks like.
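One way to convince yourself that the two forms agree is to run a single step of each on the same inputs. The sketch below uses made-up data and parameter values (all hypothetical) and compares the element-by-element update with the vectorized one, noting the shape of each intermediate.

```octave
% Toy data (hypothetical values, not the exercise data).
X = [ones(3, 1), [1; 2; 4]];   % m x 2
y = [2; 3; 7];                 % m x 1
theta = [0.5; 0.5];            % 2 x 1
alpha = 0.1;
m = length(y);

err = X * theta - y;           % m x 1: error of every example under the old theta

% Element-by-element update, as in the loop body of gradientDescent.
theta_loop = theta;
theta_loop(1) = theta(1) - sum(err .* X(:, 1)) * alpha / m;
theta_loop(2) = theta(2) - sum(err .* X(:, 2)) * alpha / m;

% Vectorized update: X' is 2 x m and err is m x 1, so X' * err is 2 x 1.
% Row j of X' dotted with err is exactly sum(err .* X(:, j)).
theta_vec = theta - alpha * (X' * err) / m;

fprintf('max difference: %g\n', max(abs(theta_loop - theta_vec)));
```

The vectorized form also generalizes unchanged to any number of features, whereas the element-by-element version needs one line per parameter.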

Once theta has been computed, the fitted line can be plotted.

[Figure: the training data with the fitted regression line]
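A plot like this can be produced by drawing X*theta on top of the scatter plot. The following is a minimal sketch, not the post's actual plotting code: X, y and theta are toy stand-ins (in the exercise they would come from ex1data1.txt and gradientDescent), and X is assumed to have the column of ones first and the population second.

```octave
% Toy stand-ins (hypothetical values).
X = [ones(4, 1), [5; 10; 15; 20]];
y = [2; 8; 13; 20];
theta = [-4; 1.2];

y_fit = X * theta;                          % predicted profit for each city

figure;
plot(X(:, 2), y, 'rx', 'MarkerSize', 10);   % training data as red crosses
hold on;                                    % keep the scatter plot visible
plot(X(:, 2), y_fit, '-');                  % fitted line
legend('Training data', 'Linear regression');
xlabel('Population of City in 10,000s');
ylabel('Profit in $10,000s');
hold off;
```

Only the second column of X is used for the x-axis, since the first column is the constant intercept term.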

Note: this article was written by linger. If you repost it, please credit http://blog.csdn.net/lingerlanlan.

Article link: http://blog.csdn.net/lingerlanlan/article/details/32162559
