Regularization —— linear regression

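This section practices how to use a regularization term. In some machine learning models, when the model has too many parameters and the training set is too small, the trained model easily overfits. A penalty on the parameters is therefore added to the model's loss function so that the parameters do not grow too large. Smaller parameters correspond to a simpler model, and a simpler model is less prone to overfitting.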

Regularized linear regression

From looking at this plot, it seems that fitting a straight line might be too simple of an approximation. Instead, we will try fitting a higher-order polynomial to the data to capture more of the variations in the points.

Let's try a fifth-order polynomial. Our hypothesis will be

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4 + \theta_5 x^5$$

This means that we have a hypothesis of six features, because $x^0, x^1, \ldots, x^5$ are now all features of our regression. Notice that even though we are producing a polynomial fit, we still have a linear regression problem because the hypothesis is linear in each feature.
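As a minimal sketch of the point above, the six polynomial features can be stacked into a matrix so that the hypothesis becomes a single matrix-vector product. The sample inputs `x_raw` and the parameter vector `theta` below are hypothetical values chosen for illustration; they are not the exercise data.

```matlab
% Hypothetical sample inputs (not the exercise data), for illustration only.
x_raw = [-1; 0; 0.5; 2];
degree = 5;

% Column j+1 holds x.^j for j = 0..degree, so each row is [1, x, x^2, ..., x^5].
X = zeros(numel(x_raw), degree + 1);
for j = 0:degree
    X(:, j + 1) = x_raw .^ j;
end

% Hypothetical parameters: h(x) = 1 + 2*x^5.
theta = [1; 0; 0; 0; 0; 2];

% The hypothesis is linear in theta even though it is a polynomial in x.
h = X * theta;
disp(size(X));
```

Each row of `X` is one training example expanded into six features, which is why ordinary linear regression machinery still applies.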

Since we are fitting a 5th-order polynomial to a data set of only 7 points, over-fitting is likely to occur. To guard against this, we will use regularization in our model.

Recall that in regularization problems, the goal is to minimize the following cost function with respect to $\theta$:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

The regularization parameter $\lambda$ is a control on your fitting parameters. As the magnitudes of the fitting parameters increase, there will be an increasing penalty on the cost function. This penalty is dependent on the squares of the parameters as well as the magnitude of $\lambda$. Also, notice that the summation after $\lambda$ does not include $\theta_0^2$.

The larger $\lambda$ is, the simpler the trained model, because the penalty contributed by the second term is larger.
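The regularized cost can be sketched as a one-line function, assuming the first entry of the parameter vector is the intercept and is left out of the penalty. The helper name `reg_cost` and the toy data are hypothetical and only serve to show how the penalty term grows with the regularization parameter.

```matlab
% Regularized cost; theta(1) is the intercept theta_0 and is not penalized.
reg_cost = @(X, y, theta, lambda) (sum((X * theta - y) .^ 2) + lambda * sum(theta(2:end) .^ 2)) / (2 * size(X, 1));

% Hypothetical toy data: an intercept column plus one feature, three samples.
X = [1 0; 1 1; 1 2];
y = [1; 3; 5];
theta = [1; 2];   % fits the toy data exactly, so the data term is zero

J0  = reg_cost(X, y, theta, 0);    % only the data term, which is zero here
J10 = reg_cost(X, y, theta, 10);   % only the penalty: 10 * 2^2 / (2 * 3)
disp([J0, J10]);
```

Because the fit is exact, the whole cost at a regularization parameter of 10 comes from the penalty on the slope, while the intercept contributes nothing.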

Normal equations

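Now we will find the best parameters of our model using the normal equations. Recall that the normal equations solution to regularized linear regression is

$$\theta = \left(X^T X + \lambda \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}\right)^{-1} X^T y$$

The matrix following $\lambda$ is an $(n+1)\times(n+1)$ diagonal matrix with a zero in the upper left and ones down the other diagonal entries. (Remember that $n$ is the number of features, not counting the intercept term.) The vector $y$ and the matrix $X$ have the same definition they had for unregularized regression:

$$X = \begin{bmatrix} (x^{(1)})^T \\ (x^{(2)})^T \\ \vdots \\ (x^{(m)})^T \end{bmatrix}, \qquad y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{bmatrix}$$

Using this equation, find values for $\theta$ using the three regularization parameters below: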

a. $\lambda = 0$ (this is the same case as non-regularized linear regression)

b. $\lambda = 1$

c. $\lambda = 10$
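Before running this on the exercise data, the effect of the three regularization parameters can be checked on a small problem. The sketch below uses hypothetical toy data with quadratic features (not the exercise files) and solves the regularized normal equations with the backslash operator, which is a substitution for an explicit inverse rather than a different method.

```matlab
% Hypothetical toy data (not the exercise files): 4 samples, quadratic features.
x_raw = [0; 1; 2; 3];
y = [1; 2; 0; 5];
X = [ones(4, 1), x_raw, x_raw .^ 2];
n = size(X, 2) - 1;   % number of features, not counting the intercept

% Diagonal matrix with 0 for the intercept and 1 for every other parameter.
D = diag([0; ones(n, 1)]);

lambdas = [0 1 10];
theta = zeros(n + 1, numel(lambdas));
for k = 1:numel(lambdas)
    % Backslash solves the linear system without forming an explicit inverse.
    theta(:, k) = (X' * X + lambdas(k) * D) \ (X' * y);
end

% 2-norm of the penalized parameters (intercept excluded) for each lambda.
penalized_norms = sqrt(sum(theta(2:end, :) .^ 2, 1));
disp(penalized_norms);
```

The first column of `theta` coincides with the ordinary least-squares solution, and the norm of the penalized parameters does not increase as the regularization parameter grows.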

Code

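clc, clear

% Load the data
x = load('ex5Linx.dat');
y = load('ex5Liny.dat');

% Plot the raw data
plot(x, y, 'o', 'MarkerEdgeColor', 'b', 'MarkerFaceColor', 'r')

% Turn the single feature into the training matrix: columns are x.^0 through x.^5
x = [ones(length(x),1) x x.^2 x.^3 x.^4 x.^5];
[m, n] = size(x);
n = n - 1;   % number of features, not counting the intercept term

% Compute the parameters sida (theta) and draw the fitted curves
rm = diag([0; ones(n,1)]);   % the matrix that follows lamda
lamda = [0 1 10]';
colortype = {'g', 'b', 'r'};
sida = zeros(n+1, 3);   % initialize the parameters sida, one column per lamda
xrange = linspace(min(x(:,2)), max(x(:,2)))';
hold on;
for i = 1:3
    sida(:,i) = (x'*x + lamda(i).*rm) \ (x'*y);   % solve the regularized normal equations
    norm_sida = norm(sida(:,i))   % 2-norm of sida for this lamda (no semicolon, so it is printed)
    yrange = [ones(size(xrange)) xrange xrange.^2 xrange.^3, ...
        xrange.^4 xrange.^5]*sida(:,i);
    plot(xrange', yrange, char(colortype(i)))
end
legend('training data', '\lambda=0', '\lambda=1', '\lambda=10')   % note the TeX escape \lambda in the labels
hold off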
