r squared
multiple r squared
adjusted r squared
http://web.maths.unsw.edu.au/~adelle/Garvan/Assays/GoodnessOfFit.html
Goodness-of-Fit Statistics
Sum of Squares Due to Error
This statistic measures the total deviation of the response values from the fit to the response values. It is also called the summed square of residuals and is usually labelled as SSE.
- SSE = Sum
(i=1 to n)
- {
wi
- (
yi - fi
- )
2
- }
Here yi is the observed data value and fi is the predicted value from the fit. wi is the weighting applied to each data point, usually wi = 1.
A value closer to 0 indicates that the model has a smaller random error component, and that the fit will be more useful for prediction.
R-Square
This statistic measures how successful the fit is in explaining the variation of the data. Put another way, R-square is the square of the correlation between the response values and the predicted response values. It is also called the square of the multiple correlation coefficient and the coefficient of multiple determination.
R-square is defined as
- R-square = 1 - [Sum
(i=1 to n)
- {
wi
- (
yi - fi
- )
2
- }] /[Sum
(i=1 to n)
- {
wi
- (
yi - yav
- )
2
- }] = 1 - SSE/SST
Here fi is the predicted value from the fit, yav is the mean of the observed data yi is the observed data value. wi is the weighting applied to each data point, usually wi=1. SSE is the sum of squares due to error and SST is the total sum of squares.
R-square can take on any value between 0 and 1, with a value closer to 1 indicating that a greater proportion of variance is accounted for by the model. For example, an R-square value of 0.8234 means that the fit explains 82.34% of the total variation in the data about the average.
If you increase the number of fitted coefficients in your model, R-square will increase although the fit may not improve in a practical sense. To avoid this situation, you should use the degrees of freedom adjusted R-square statistic described below.
Note that it is possible to get a negative R-square for equations that do not contain a constant term. Because R-square is defined as the proportion of variance explained by the fit, if the fit is actually worse than just fitting a horizontal line then R-square is negative. In this case, R-square cannot be interpreted as the square of a correlation. Such situations indicate that a constant term should be added to the model.
Degrees of Freedom Adjusted R-Square
This statistic uses the R-square statistic defined above, and adjusts it based on the residual degrees of freedom. The residual degrees of freedom is defined as the number of response values nminus the number of fitted coefficients m estimated from the response values.
v = n-m
v indicates the number of independent pieces of information involving the n data points that are required to calculate the sum of squares. Note that if parameters are bounded and one or more of the estimates are at their bounds, then those estimates are regarded as fixed. The degrees of freedom is increased by the number of such parameters.
The adjusted R-square statistic is generally the best indicator of the fit quality when you compare two models that are nested – that is, a series of models each of which adds additional coefficients to the previous model.
- adjusted R-square = 1 - SSE(
n
- -1)/SST(
v
- )
The adjusted R-square statistic can take on any value less than or equal to 1, with a value closer to 1 indicating a better fit. Negative values can occur when the model contains terms that do not help to predict the response.
Root Mean Squared Error
This statistic is also known as the fit standard error and the standard error of the regression. It is an estimate of the standard deviation of the random component in the data, and is defined as
- RMSE =
s
- = (MSE)
½
where MSE is the mean square error or the residual mean square
- MSE=SSE/
v
Just as with SSE, an MSE value closer to 0 indicates a fit that is more useful for prediction.
r squared的更多相关文章
- 机器学习:衡量线性回归法的指标(MSE、RMSE、MAE、R Squared)
一.MSE.RMSE.MAE 思路:测试数据集中的点,距离模型的平均距离越小,该模型越精确 # 注:使用平均距离,而不是所有测试样本的距离和,因为距离和受样本数量的影响 1)公式: MSE:均方误差 ...
- 线性函数拟合R语言示例
线性函数拟合(y=a+bx) 1. R运行实例 R语言运行代码如下:绿色为要提供的数据,黄色标识信息为需要保存的. x<-c(0.10,0.11, 0.12, 0.13, 0.14, ...
- R语言︱非结构化数据处理神器——rlist包
本文作者:任坤,厦门大学王亚南经济研究院金融硕士生,研究兴趣为计算统计和金融量化交易,pipeR,learnR,rlist等项目的作者. 近年来,非关系型数据逐渐获得了更广泛的关注和使用.下面分别列举 ...
- R语言命令汇总
> qqplot(spear,fastrankweight)> qqplot(spear,fastrankweight,main="title")> qqplot ...
- R ggplot2 线性回归
摘自 http://f.dataguru.cn/thread-278300-1-1.html library(ggplot2) x=1:10y=rnorm(10)a=data.frame(x= x, ...
- r语言与dataframe
什么是DataFrame 引用 r-tutor上的定义: DataFrame 是一个表格或者类似二维数组的结构,它的各行表示一个实例,各列表示一个变量. 没错,DataFrame就是类似于Excel表 ...
- R语言学习笔记(二十四):plyr包的用法
plyr 这个包,提供了一组规范的数据结构转换形式. Input/Output list data frame array list llply() ldply() laply() data fram ...
- a note of R software write Function
Functionals “To become significantly more reliable, code must become more transparent. In particular ...
- Advanced R之构造子集
转发请声明出处:http://www.cnblogs.com/lizichao/p/4794733.html 构造子集 R构造子集的操作功能强大而且速度快.精通构造子集者可以用简洁的方式表达复杂的操作 ...
随机推荐
- linux内核中打印栈回溯信息 - dump_stack()函数分析【转】
转自:http://blog.csdn.net/jasonchen_gbd/article/details/45585133 版权声明:本文为博主原创文章,转载请附上原博链接. 目录(?)[-] ...
- 标准C程序设计七---61
Linux应用 编程深入 语言编程 标准C程序设计七---经典C11程序设计 以下内容为阅读: <标准C程序设计>(第7版) 作者 ...
- 标准C程序设计七---54
Linux应用 编程深入 语言编程 标准C程序设计七---经典C11程序设计 以下内容为阅读: <标准C程序设计>(第7版) 作者 ...
- IOS 使用DSYM文件定位Bug 的具体位置
在项目的开发中,我们通常需要排查和修改测试中和发布后线上的一些bug,现在有一些第三方的bug分享和查找工具SDK,如腾讯的Bugly和听云等,包括苹果的开发工具xcode也自带 bug查找工具.那么 ...
- iOS应用内跳转百度高德苹果地图
移动开发经常用到基于位置的一些导航功能,但是对于对导航功能依赖性不强的的应用,我们直接采用应用外跳转地图APP即可. 但是应用间跳转,首先需要设置白名单, 在iOS 9 下涉及到平台客户端跳转,系统会 ...
- 富文本ZSSRichTextEditor之趟坑集锦
富文本ZSSRichTextEditor是iOS原生与网页交互的集大成者,各种交互.自然问题也是多多,这篇文文章陆续更新遇到的奇葩问题. 1.问题1:从头条这种文章里头复制粘贴的文章,里边有图片,我们 ...
- 小程序-TabBar点击切换
这种页面的布局会经常用到,所以在此做个笔记,之后遇到可以节省很多时间 WXML: <view class='listTitle_tab'> <view class='scr ...
- HDU 3068 Manacher
题目链接:http://hdu.hustoj.com/showproblem.php?pid=3068 今天学习一下马拉车算法,虽然mg讲过,但是没有系统去学. 算法学习:参考博客 马拉车模板题. # ...
- sudo apt-get upgrade 不成功遇到问题
一. sudo apt-get update 和 sudo apt-get upgrade 出错:(Ubuntu更新过程被中断后的问题) Ubuntu的更新过程是先下载完源里的文件就开始执行升级,如果 ...
- k8s之nginx-ingress、 Daemonset实现生产案例
上一篇中用node ip + 非80端口,访问k8s集群内部的服务.实际生产中更希望用node ip + 80端口的方式,访问k8s集群内的服务. # 修改mandatory.yaml中创建控制器部分 ...