the least-squares criterion|Sxx|Sxy|Syy|Regression Equation|Outliers|Influential Observations|curvilinear regression|linear regression
4.2 The Regression Equation
Because we could draw many different lines through the cluster of data points, we need a method to choose the “best” line. The method, called the least-squares criterion, is based on an analysis of the errors made in using a line to fifit the data points.
存在有限个可能的的模型(可以使用之后的方法得到模型),从中取出最有可能的2个:并用最小二乘法计算error:
比如(a)中的e
最后得到:
计算,最后确定模型为b,这只是对模型的评价,生成模型可以使用以下方法:
推导:
Suppose that a scatterplot indicates a linear relationship between two variables. Then,within the range of the observed values of the predictor variable, we can reasonably use the regression equation to make predictions for the response variable. However,to do so outside that range, which is called extrapolation,
比如减价趋势下的产品价格,离开观测值范围后,价格可能会处于负值状态,所以线性关系必须注明自变量range
In the context of regression, an outlier is a data point that lies far from the regression line
Outliers and Influential Observations
Outliers是偏离直线太远的值
influential observation : a data point whose removal causes the regression equation (and line) to change considerably
Eg.在加入(2,169)前后的直线发生了巨大变化,所以(2,169)是一个influential observation
解决办法:
1.缩小x的range
2.添加influential observation 周围的点
Nonetheless, we may need either to remove it—thus limiting the analysis to Orions between 4 and 7 years old—or to obtain additional data on 2- and 3-year-old Orions so that the regression analysis is not so dependent on one data point
outlier和influential observation实际上很难分清:An outlier may or may not be an inflfluential observation, and an inflfluential observation may or may not be an outlier. Many statistical software packages identify potential outliers and inflfluential observations.
否则会出现:
该分布实际上应该为curvilinear regression
多重线性回归:
曲线回归:
the least-squares criterion|Sxx|Sxy|Syy|Regression Equation|Outliers|Influential Observations|curvilinear regression|linear regression的更多相关文章
- Regularized Linear Regression with scikit-learn
Regularized Linear Regression with scikit-learn Earlier we covered Ordinary Least Squares regression ...
- [UFLDL] Linear Regression & Classification
博客内容取材于:http://www.cnblogs.com/tornadomeet/archive/2012/06/24/2560261.html Deep learning:六(regulariz ...
- CheeseZH: Stanford University: Machine Learning Ex5:Regularized Linear Regression and Bias v.s. Variance
源码:https://github.com/cheesezhe/Coursera-Machine-Learning-Exercise/tree/master/ex5 Introduction: In ...
- 机器学习 (一) 单变量线性回归 Linear Regression with One Variable
文章内容均来自斯坦福大学的Andrew Ng教授讲解的Machine Learning课程,本文是针对该课程的个人学习笔记,如有疏漏,请以原课程所讲述内容为准.感谢博主Rachel Zhang的个人笔 ...
- 机器学习 (二) 多变量线性回归 Linear Regression with Multiple Variables
文章内容均来自斯坦福大学的Andrew Ng教授讲解的Machine Learning课程,本文是针对该课程的个人学习笔记,如有疏漏,请以原课程所讲述内容为准.感谢博主Rachel Zhang 的个人 ...
- Linear regression with one variable - Model representation
摘要: 本文是吴恩达 (Andrew Ng)老师<机器学习>课程,第二章<单变量线性回归>中第6课时<模型概述>的视频原文字幕.为本人在视频学习过程中逐字逐句记录下 ...
- 机器学习---最小二乘线性回归模型的5个基本假设(Machine Learning Least Squares Linear Regression Assumptions)
在之前的文章<机器学习---线性回归(Machine Learning Linear Regression)>中说到,使用最小二乘回归模型需要满足一些假设条件.但是这些假设条件却往往是人们 ...
- 机器学习---用python实现最小二乘线性回归算法并用随机梯度下降法求解 (Machine Learning Least Squares Linear Regression Application SGD)
在<机器学习---线性回归(Machine Learning Linear Regression)>一文中,我们主要介绍了最小二乘线性回归算法以及简单地介绍了梯度下降法.现在,让我们来实践 ...
- 机器学习笔记1——Linear Regression with One Variable
Linear Regression with One Variable Model Representation Recall that in *regression problems*, we ar ...
随机推荐
- 51nod 1267:4个数和为0 哈希
1267 4个数和为0 基准时间限制:1 秒 空间限制:131072 KB 分值: 20 难度:3级算法题 收藏 关注 给出N个整数,你来判断一下是否能够选出4个数,他们的和为0,可以则输出&qu ...
- CTF -攻防世界-misc新手区
此题flag题目已经告诉格式,答案很简单. 将附件下载后,将光盘挂到虚拟机启动 使用 strings linux|grep flag会找到一个O7avZhikgKgbF/flag.txt然后root下 ...
- java 继承条件下的构造方法调用
运行 TestInherits.java示例,观察输出,注意总结父类与子类之间构造方法的调用关系修改Parent构造方法的代码,显式调用GrandParent的另一个构造函数,注意这句调用代码是否是第 ...
- java将HSSFWorkbook生成的excel压缩到zip中
思路:1.写入输入流中. 2.将输入流加到ZipOutputStream压缩流中 List<DocumentModel> list = null; try { list = documen ...
- CMake命令之install
CMAKE_INSTALL_PREFIX Install directory used by install(). if make install is invoked or INSTALL is b ...
- InnoDB和MyISAM区别总结
原来是MyISAM类型不支持事务处理等高级处理,而InnoDB类型支持. MyISAM类型的表强调的是性能,其执行数度比InnoDB类型更快,但是不提供事务支持,而InnoDB提供事务支持已经外部键等 ...
- 遇到屏蔽selenium的站点如何突破
访问某团外卖,查看下一页商家信息,正常浏览器可以打开, selenium打开就404, 分析请求参数,生成方法最后定位到 rohr*.js 而且有判断selenium特征 抓耳挠腮搞了半天没把这个j ...
- Mybatis配置文件无故报错、无自动完成提示的解决方法,及自动生成主要配置项
1.引子 Mybatis配置文件显示红叉有错误,而实际检查又没有错误,这是因为开发环境不能识别这种类型的xml文件.要解决这个问题,就要让IDE开发环境能够“认识”这个文件类型,我们要让IDE环境将这 ...
- 数据分析-Numpy-Pandas
补充上一篇未完待续的Numpy知识点 索引和切片 数组和标量(数字)之间运算 li1 = [ [1,2,3], [4,5,6] ] a = np.array(li1) a * 2 运行结果: arra ...
- Est数据库
Est--编码序列,gene 片段且具有标签 其中,est数据库中是类似测序1.测序2.测序3这样的序列.实验室测得的序列是cDNA,通过上图方法拼接,电脑克隆(dbest).如果有overlap则认 ...