In statistics and applications of statistics, normalization can have a range of meanings.[1] In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of adjusted values into alignment. In the case of normalization of scores in educational assessment, there may be an intention to align distributions to a normal distribution. A different approach to normalization of probability distributions is quantile normalization, where the quantiles of the different measures are brought into alignment.
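
As a rough illustration of quantile normalization, the sketch below aligns the quantiles of several equal-length samples by replacing each value with the mean of the values that share its rank. The function name `quantile_normalize` and the sample data are hypothetical, and ties are not treated specially; this is one simple variant, not a definitive implementation.

```python
from statistics import mean


def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (simple variant, no tie handling)."""
    n = len(columns[0])
    # Sort each sample, then average the values that share a rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[i] for col in sorted_cols) for i in range(n)]

    result = []
    for col in columns:
        # order[k] is the index of the k-th smallest value in col.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result


# Hypothetical measurements on three different scales.
a = [5.0, 2.0, 3.0, 4.0]
b = [4.0, 1.0, 4.5, 2.0]
c = [3.0, 4.0, 6.0, 8.0]
qa, qb, qc = quantile_normalize([a, b, c])
```

After the call, the three outputs contain the same set of values, so their empirical quantiles coincide, while each output keeps the rank order of its original sample.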

In another usage in statistics, normalization refers to the creation of shifted and scaled versions of statistics, where the intention is that these normalized values allow the comparison of corresponding normalized values for different datasets in a way that eliminates the effects of certain gross influences, as in an anomaly time series. Some types of normalization involve only a rescaling, to arrive at values relative to some size variable. In terms of levels of measurement, such ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios).
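
The shift-and-scale idea can be sketched as follows. This minimal example assumes that each series is shifted by its own mean and scaled by its own population standard deviation; in practice an anomaly series is often computed against a separate reference period. The function name and the site data are hypothetical.

```python
from statistics import mean, pstdev


def standardized_anomalies(series):
    """Shift by the series mean and scale by its population standard deviation."""
    m = mean(series)
    s = pstdev(series)
    return [(x - m) / s for x in series]


# Hypothetical readings from two sites with different baselines and spreads.
site_a = [10.0, 12.0, 14.0, 12.0]
site_b = [100.0, 120.0, 140.0, 120.0]

anom_a = standardized_anomalies(site_a)
anom_b = standardized_anomalies(site_b)
```

Although the raw readings differ by an order of magnitude, the two normalized series agree up to floating-point rounding, which is what makes them directly comparable.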

In theoretical statistics, parametric normalization can often lead to pivotal quantities – functions whose sampling distribution does not depend on the parameters – and to ancillary statistics – pivotal quantities that can be computed from observations, without knowing parameters.
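
A standard textbook illustration, assuming an independent sample from a normal distribution (the symbols below are not taken from the text above): normalizing the sample mean gives quantities whose distributions are free of the parameters.

```latex
X_1, \dots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2), \qquad
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2

Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1), \qquad
T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}, \qquad
A = \frac{X_1 - \bar{X}}{S}
```

Here Z and T are pivotal: each involves the unknown parameters, but its sampling distribution does not depend on them. The quantity A is computed from the observations alone and its distribution depends on neither μ nor σ, so it is ancillary in the sense described above.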

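In statistics and applied statistics, normalization has a broad range of meanings. In the simplest reading, such as the normalization of ratings, it means adjusting data measured on different scales to a notionally common scale, usually before averaging. In more complicated cases, normalization refers to more sophisticated adjustments whose aim is to bring the probability distributions of the adjusted data into alignment. For example, in educational assessment, subjects differ in difficulty, different students choose different subjects, and they receive different scores, so how should their performance be compared? To make scores from different subjects comparable, the normal distribution is used as the reference for comparison. A different approach to normalizing probability distributions is quantile normalization, which brings the quantiles of the different measures into alignment (the translator's guess is that this is loosely analogous to weight classes such as lightweight and heavyweight in weightlifting or boxing).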

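In another usage in statistics, normalization refers specifically to shifted and scaled versions of statistics, the aim being that normalized values from different datasets can be compared with one another in a way that removes gross overall effects, as in an anomaly time series. Some types of normalization involve only a scaling factor, giving values relative to some size variable. In terms of levels of measurement, such ratios are meaningful only for ratio measurements (where ratios of measurements are meaningful), not for interval measurements (where only distances are meaningful, not ratios).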

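In theoretical statistics, parametric normalization often leads to pivotal quantities, functions whose sampling distribution does not depend on the parameters, and to ancillary statistics, pivotal quantities that can be computed from the observed data without knowing the parameters.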

Examples

There are various normalizations in statistics – nondimensional ratios of errors, residuals, means and standard deviations, which are hence scale invariant – some of which may be summarized as follows. Note that in terms of levels of measurement, these ratios only make sense for ratio measurements (where ratios of measurements are meaningful), not interval measurements (where only distances are meaningful, but not ratios). See also Category:Statistical ratios...

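There are many different normalizations in statistics, such as nondimensional ratios of errors, residuals, means, and standard deviations. Because they are nondimensional ratios, they are scale invariant. Some of these ratios are summarized below. Note that, in terms of levels of measurement, these ratios are meaningful only for ratio measurements, where ratios of measurements are meaningful. See also Category:Statistical ratios...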

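| Name | Formula | Use |
| --- | --- | --- |
| Standard score | {\displaystyle {\frac {X-\mu }{\sigma }}} | Normalizing errors when population parameters are known. Works well for populations that are normally distributed. |
| Student's t-statistic | {\displaystyle {\frac {X-{\overline {X}}}{s}}} | Normalizing residuals when population parameters are unknown (estimated). |
| Studentized residual | {\displaystyle {\frac {{\hat {\varepsilon }}_{i}}{{\hat {\sigma }}_{i}}}={\frac {X_{i}-{\hat {\mu }}_{i}}{{\hat {\sigma }}_{i}}}} | Normalizing residuals when parameters are estimated, particularly across different data points in regression analysis. |
| Standardized moment | {\displaystyle {\frac {\mu _{k}}{\sigma ^{k}}}} | Normalizing moments, using the standard deviation {\displaystyle \sigma } as a measure of scale. |
| Coefficient of variation | {\displaystyle {\frac {\sigma }{\mu }}} | Normalizing dispersion, using the mean {\displaystyle \mu } as a measure of scale, particularly for positive distributions such as the exponential distribution and Poisson distribution. |
| Feature scaling | {\displaystyle X'={\frac {X-X_{\min }}{X_{\max }-X_{\min }}}} | Feature scaling is used to bring all values into the range [0,1]. This can be generalized to restrict the range of values in the dataset between any arbitrary points a and b using {\displaystyle X'=a+{\frac {\left(X-X_{\min }\right)\left(b-a\right)}{X_{\max }-X_{\min }}}}. |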
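
Several of the ratios listed above can be sketched in a few lines of Python. The function names are hypothetical, and the formulas assume the usual definitions: standard score (X − μ)/σ, coefficient of variation σ/μ, k-th standardized moment μ_k/σ^k, and feature scaling a + (X − X_min)(b − a)/(X_max − X_min). Population (not sample) statistics are used throughout for simplicity.

```python
from statistics import mean, pstdev


def standard_score(x, mu, sigma):
    """Standard score of x given known population mean and standard deviation."""
    return (x - mu) / sigma


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def standardized_moment(values, k):
    """k-th central moment divided by the k-th power of the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return mean((x - mu) ** k for x in values) / sigma ** k


def feature_scale(values, a=0.0, b=1.0):
    """Linearly rescale values so the minimum maps to a and the maximum to b."""
    lo, hi = min(values), max(values)
    return [a + (x - lo) * (b - a) / (hi - lo) for x in values]


data = [2.0, 4.0, 6.0, 8.0]
unit_range = feature_scale(data)            # minimum -> 0, maximum -> 1
symmetric_range = feature_scale(data, -1.0, 1.0)
```

Note that `feature_scale` divides by the range of the data, so it is undefined when all values are equal; a real implementation would need to decide how to handle that case.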

Note that some other ratios, such as the variance-to-mean ratio {\displaystyle \left({\frac {\sigma ^{2}}{\mu }}\right)}, are also used for normalization but are not nondimensional: the units do not cancel, so the ratio has units and is not scale invariant.
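
The difference in scale invariance can be checked numerically. In this small sketch (hypothetical helper names and data), the same lengths are expressed in metres and in centimetres: the coefficient of variation is unchanged, while the variance-to-mean ratio is multiplied by the conversion factor.

```python
from statistics import mean, pstdev, pvariance


def cv(values):
    """Coefficient of variation: standard deviation over mean (dimensionless)."""
    return pstdev(values) / mean(values)


def vmr(values):
    """Variance-to-mean ratio: carries the units of the data."""
    return pvariance(values) / mean(values)


metres = [1.0, 2.0, 3.0, 6.0]
centimetres = [100.0 * x for x in metres]

cv_m, cv_cm = cv(metres), cv(centimetres)
vmr_m, vmr_cm = vmr(metres), vmr(centimetres)
```

Rescaling the data by a factor of 100 multiplies the variance by 100² and the mean by 100, so the variance-to-mean ratio grows by a factor of 100, whereas the factor cancels in the coefficient of variation.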

Other types

Other non-dimensional normalizations that can be used with no assumptions on the distribution include:

  • Assignment of percentiles. This is common on standardized tests. See also quantile normalization.
  • Normalization by adding and/or multiplying by constants so values fall between 0 and 1. This is used for probability density functions, with applications in fields such as physical chemistry in assigning probabilities to |ψ|².
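
Both items in the list above can be sketched without any distributional assumption. The percentile convention used here (percentage of scores strictly below a given score) is only one of several in common use, and the second function shows the discrete analogue of normalizing a density: dividing non-negative weights by their total so they sum to 1. All names and data are hypothetical.

```python
def percentile_ranks(scores):
    """Percentage of scores strictly below each score (one common convention)."""
    n = len(scores)
    return [100.0 * sum(1 for s in scores if s < x) / n for x in scores]


def normalize_weights(weights):
    """Divide non-negative weights by their total so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


scores = [55, 70, 70, 90]
ranks = percentile_ranks(scores)             # [0.0, 25.0, 25.0, 75.0]
probs = normalize_weights([1.0, 3.0, 4.0])   # [0.125, 0.375, 0.5]
```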

References

  1. Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9 (entry for normalization of scores).
