Regression analysis
Source: http://wenku.baidu.com/link?url=9KrZhWmkIDHrqNHiXCGfkJVQWGFKOzaeiB7SslSdW_JnXCkVHsHsXJyvGbDva4V5A-uuOl84mg5zkTECichHX_AsN0mZalfI9BzDFOeNe-G###
❤ Simple linear regression
1. Y = β0 + β1*X + e
where:
Y - dependent variable (response)
X - independent variable (predictor/explanatory)
β0 - intercept
β1 - slope of the regression line
e - random error
2. Y' = b0 + b1*X
where: Y' - predicted value of Y
e = Y - Y'
3. Least squares regression minimizes the sum of the squared errors, sum(e^2), and is used to estimate b0 and b1.
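As a minimal sketch of item 3, the closed-form least squares estimates for one predictor are b1 = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2) and b0 = mean(Y) - b1 * mean(X). The function name `fit_line` and the five data points are illustrative assumptions, not part of the original notes.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimize sum((y - (b0 + b1*x))**2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Illustrative data (made up for this example)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(f"Y' = {b0:.2f} + {b1:.2f}*X")  # prints: Y' = 2.20 + 0.60*X
```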
4. Measuring the fit of the estimated model.
- The variability of Y
SST (Sum of Squares Total): total variability about the mean, SST = sum((Y - mean(Y))^2);
SSE (Sum of Squared Errors): variability about the regression line, SSE = sum(e^2) = sum((Y - Y')^2); SSE is the unexplained variability;
SSR (Sum of Squares due to Regression): variability that is explained, SSR = sum((Y' - mean(Y))^2); SSR is the explained variability.
Note that SST = SSE + SSR.
- Coefficient of determination
r^2: proportion of the variability in Y that is explained by the regression equation.
0 <= r^2 = 1 - SSE/SST = SSR/SST <= 1
- Correlation coefficient
r: strength and direction of the linear relationship between X and Y. In simple linear regression, r = ±sqrt(r^2), with the same sign as b1.
-1 <= r <= 1
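The quantities in item 4 can be checked numerically. The sketch below is one possible way to do it, assuming a least squares fit with a single predictor; `fit_measures` and the data are hypothetical names and values chosen for illustration.

```python
def fit_measures(xs, ys):
    """Compute SST, SSE, SSR, r^2 and r for a simple least squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    preds = [b0 + b1 * x for x in xs]

    sst = sum((y - mean_y) ** 2 for y in ys)             # total variability
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))   # unexplained
    ssr = sum((p - mean_y) ** 2 for p in preds)          # explained
    r2 = 1 - sse / sst
    r = sxy / (sxx * sst) ** 0.5                         # sample correlation
    return {"SST": sst, "SSE": sse, "SSR": ssr, "r2": r2, "r": r}


# Illustrative data (made up for this example)
m = fit_measures([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in m.items():
    print(f"{name} = {value:.4f}")
```

For this data SST = 6, SSE = 2.4 and SSR = 3.6, so SST = SSE + SSR holds and r^2 = 0.6; squaring r gives the same value.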
5. Assumptions in the regression model
The errors are independent and normally distributed, with a mean of zero and a constant variance.
These assumptions can be checked with residual analysis.
6. MSE (Mean Squared Error)
An estimate of the error variance of the regression equation.
s^2 = MSE = SSE / (n - k - 1)
where:
n - number of observations in the sample
k - number of independent variables
The standard deviation of the regression, s = sqrt(MSE), is also frequently used.
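A short sketch of item 6, assuming the residuals e = Y - Y' are already available. The helper name `error_variance` is hypothetical, and the residuals are those of the line Y' = 2.2 + 0.6*X fitted to the made-up points (1, 2), (2, 4), (3, 5), (4, 4), (5, 5).

```python
def error_variance(residuals, k):
    """Return (MSE, s) given residuals and the number of independent variables k."""
    n = len(residuals)
    sse = sum(e ** 2 for e in residuals)
    mse = sse / (n - k - 1)   # divide by the degrees of freedom, not by n
    return mse, mse ** 0.5


# Residuals of an illustrative simple regression (k = 1, n = 5)
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
mse, s = error_variance(residuals, k=1)
print(f"MSE = {mse:.3f}, s = {s:.3f}")
```

Dividing by n - k - 1 rather than n accounts for the k + 1 coefficients estimated from the same sample.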
❤ Test the model for significance: F-test
Used to statistically test the null hypothesis H0: there is no linear relationship between Y and X (i.e. β1 = 0).
The test statistic is:
F = MSR / MSE
where: MSR = SSR / k
If the p value of F is low (below the chosen significance level), we reject H0 and conclude that there is a linear relationship.
A good regression model should have a significant F value and a high r^2 value.
Statistical tests can also be performed on the individual regression coefficients, with H0: the coefficient β is 0.
For simple linear regression, the test on the regression coefficient gives the same information as the F-test.
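To illustrate why the two tests agree in simple linear regression, the sketch below computes F = MSR / MSE and the usual t statistic for the slope, t = b1 / sqrt(MSE / sum((X - mean(X))^2)), and shows that F equals t^2. The function name and data are assumptions made for this example.

```python
def simple_regression_tests(xs, ys):
    """Return (F, t) for a simple linear regression fitted by least squares."""
    n, k = len(xs), 1
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    preds = [b0 + b1 * x for x in xs]

    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ssr = sum((p - mean_y) ** 2 for p in preds)
    msr = ssr / k
    mse = sse / (n - k - 1)

    f_stat = msr / mse
    t_stat = b1 / (mse / sxx) ** 0.5   # slope divided by its standard error
    return f_stat, t_stat


# Illustrative data (made up for this example)
f_stat, t_stat = simple_regression_tests([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"F = {f_stat:.3f}, t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
```

The p value comes from the F distribution with (k, n - k - 1) degrees of freedom; if SciPy is installed, `scipy.stats.f.sf(f_stat, 1, 3)` would give it for this example.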
❤ ANOVA tables
The general form of the ANOVA table is helpful for understanding how the sums of squares (SSR, SSE, SST), their degrees of freedom, the mean squares, and the F value are related.
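The sketch below builds such a table from the definitions given earlier (MSR = SSR / k, MSE = SSE / (n - k - 1), F = MSR / MSE), using n - 1 as the total degrees of freedom. The layout, the function name `anova_table`, and the input values are illustrative assumptions.

```python
def anova_table(ssr, sse, n, k):
    """Return ANOVA rows as (source, sum of squares, df, mean square, F)."""
    df_reg, df_err = k, n - k - 1
    msr = ssr / df_reg
    mse = sse / df_err
    return [
        ("Regression", ssr, df_reg, msr, msr / mse),
        ("Error", sse, df_err, mse, None),
        ("Total", ssr + sse, n - 1, None, None),   # SST = SSR + SSE
    ]


# Illustrative values: SSR = 3.6, SSE = 2.4 from a fit with n = 5, k = 1
rows = anova_table(ssr=3.6, sse=2.4, n=5, k=1)
print(f"{'Source':<12}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f in rows:
    ms_text = "" if ms is None else f"{ms:.2f}"
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:<12}{ss:>8.2f}{df:>4}{ms_text:>8}{f_text:>8}")
```

The degrees of freedom add up the same way the sums of squares do: k + (n - k - 1) = n - 1.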
❤ Multiple regression
Similar to the simple regression model, but with more than one independent variable X.
Y' = b0 + b1*X1 + b2*X2 + ... + bk*Xk
Note that if the independent variables are correlated with each other, collinearity (multicollinearity) occurs. This causes problems when interpreting the coefficients of individual variables, although the overall model estimate may still be good.
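A minimal sketch of a multiple regression fit, assuming two predictors and solving the least squares normal equations with a small hand-written elimination routine. All function names and the data are hypothetical; Y is generated exactly as 1 + 2*X1 + 3*X2 so the fit can be checked. In practice a library routine such as `numpy.linalg.lstsq` is usually preferred for numerical stability.

```python
def solve(a, b):
    """Solve the square system a*x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(columns, ys):
    """Return [b0, b1, ..., bk] for Y' = b0 + b1*X1 + ... + bk*Xk."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]   # leading 1 = intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def correlation(u, v):
    """Sample correlation coefficient between two lists of numbers."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


# Illustrative data (made up): Y is exactly 1 + 2*X1 + 3*X2
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
ys = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

coefs = fit_multiple([x1, x2], ys)
r12 = correlation(x1, x2)
print("coefficients:", [round(c, 6) for c in coefs])
print(f"correlation between X1 and X2: {r12:.3f}")
```

A correlation between predictors that is close to 1 or -1 is a simple warning sign of collinearity: the individual coefficients become hard to interpret even when the overall fit is good.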