Generalized normal distribution and Skew normal distribution
Density Function
The Generalized Gaussian density has the following form:
$$p(x;\rho) = \frac{1}{2\,\Gamma(1+1/\rho)} \exp\left(-\lvert x\rvert^{\rho}\right)$$
where $\rho$ (rho) is the "shape parameter". The density is plotted in the following figure:
Matlab code used to generate this figure is available here: ggplot.m.
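Adding an arbitrary location parameter, $\mu$, and inverse scale parameter, $\beta$, the density has the form,
$$p(x;\mu,\beta,\rho) = \frac{\beta}{2\,\Gamma(1+1/\rho)} \exp\left(-\lvert \beta (x-\mu)\rvert^{\rho}\right)$$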
Matlab code used to generate this figure is available here: ggplot2.m.
Generating Random Samples
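Samples from the Generalized Gaussian can be generated by a transformation of Gamma random samples, using the fact that if $Y$ is a $\mathrm{Gamma}(1/\rho, 1)$ distributed random variable, and $S$ is an independent random variable taking the value -1 or +1 with equal probability, then,
$$X = S\,Y^{1/\rho}$$
is distributed as a Generalized Gaussian with shape $\rho$ (zero location, unit scale). That is,
$$p_Y(y) = \frac{1}{\Gamma(1/\rho)} \left(y^{1/\rho}\right)^{1-\rho} \exp\left(-\left(y^{1/\rho}\right)^{\rho}\right) \quad\Longrightarrow\quad p_X(x) = \frac{\rho}{2\,\Gamma(1/\rho)} \exp\left(-\lvert x\rvert^{\rho}\right) = \frac{1}{2\,\Gamma(1+1/\rho)} \exp\left(-\lvert x\rvert^{\rho}\right)$$
where the density of $Y$ is written in a non-standard but suggestive form.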
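The transformation above can be sketched in Python using only the standard library. This is an illustrative sketch, not the Matlab code the page links to: the function names are hypothetical, and the check at the end relies on the fact that the unit-scale density with shape $\rho$ has variance $\Gamma(3/\rho)/\Gamma(1/\rho)$.

```python
import math
import random


def sample_generalized_gaussian(rho, n, rng=None):
    """Draw n samples with density proportional to exp(-|x|**rho)."""
    rng = rng or random.Random()
    samples = []
    for _ in range(n):
        # Y ~ Gamma(shape=1/rho, scale=1)
        y = rng.gammavariate(1.0 / rho, 1.0)
        # S is -1 or +1 with equal probability
        s = 1.0 if rng.random() < 0.5 else -1.0
        samples.append(s * y ** (1.0 / rho))
    return samples


def theoretical_variance(rho):
    """Variance of the unit-scale density: Gamma(3/rho) / Gamma(1/rho)."""
    return math.gamma(3.0 / rho) / math.gamma(1.0 / rho)


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = sample_generalized_gaussian(2.0, 20000, rng)
sample_var = sum(x * x for x in xs) / len(xs)  # the density has zero mean
```

With `rho = 2` the density is proportional to `exp(-x**2)`, a normal density with variance 1/2, so `sample_var` should land close to `theoretical_variance(2.0)`.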
Matlab Code
Matlab code to generate random variates from the Generalized Gaussian density with parameters as described here is here:
As an example, we generate random samples from the example Generalized Gaussian densities shown above.
Matlab code used to generate this figure is available here: ggplot3.m.
Mixture Densities
A more general family of densities can be constructed from mixtures of Generalized Gaussians. A mixture density, $p_M(x)$, is made up of $m$ constituent densities $p_j(x)$, $j = 1, \ldots, m$, together with probabilities $w_j$ associated with each constituent density:
$$p_M(x) = \sum_{j=1}^{m} w_j\, p_j(x), \qquad w_j \ge 0, \quad \sum_{j=1}^{m} w_j = 1.$$
The densities $p_j$ have different forms, or parameter values. A random variable with a mixture density can be thought of as being generated by a two-part process: first a decision is made as to which constituent density to draw from, where the density $p_j$ is chosen with probability $w_j$, then the value of the random variable is drawn from the chosen density. Independent repetitions of this process result in a sample having the mixture density $p_M$.
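The two-part process can be sketched as follows. This is a minimal illustration under assumed values: the weights, locations, scales and shapes are made up for the example, each component is parameterized by a plain scale (not an inverse scale), and the helper names are hypothetical.

```python
import random


def gg_sampler(mu, scale, rho):
    """Return a sampler for mu + scale * S * Y**(1/rho), with Y ~ Gamma(1/rho, 1)."""
    def draw(rng):
        y = rng.gammavariate(1.0 / rho, 1.0)
        s = 1.0 if rng.random() < 0.5 else -1.0
        return mu + scale * s * y ** (1.0 / rho)
    return draw


def sample_mixture(weights, samplers, n, rng):
    """Two-step sampling: pick a component, then draw from it."""
    draws = []
    for _ in range(n):
        # Step 1: choose component j with probability weights[j].
        j = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        # Step 2: draw the value from the chosen constituent density.
        draws.append(samplers[j](rng))
    return draws


rng = random.Random(1)
weights = [0.3, 0.7]                      # assumed mixing probabilities
samplers = [gg_sampler(-2.0, 0.5, 2.0),   # Gaussian-shaped component
            gg_sampler(3.0, 0.5, 1.0)]    # Laplace-shaped component
xs = sample_mixture(weights, samplers, 10000, rng)
mixture_mean = sum(xs) / len(xs)
```

Because each component is symmetric about its location, the mixture mean is the weighted average of the locations, here 0.3 * (-2) + 0.7 * 3 = 1.5, which gives a quick sanity check on the sampler.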
As an example consider the density,
Matlab code used to generate these figures is available here: ggplot4.m.
The generalized normal distribution or generalized Gaussian distribution (GGD) is either of two families of parametric continuous probability distributions on the real line. Both families add a shape parameter to the normal distribution. To distinguish the two families, they are referred to below as "version 1" and "version 2". However this is not a standard nomenclature.
Version 1
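Figures (not reproduced): probability density function; cumulative distribution function.
Parameters | $\mu$ location (real), $\alpha$ scale (positive, real), $\beta$ shape (positive, real) |
---|---|
Support | $x \in (-\infty, +\infty)$ |
PDF | $\frac{\beta}{2\alpha\,\Gamma(1/\beta)} \exp\left(-\left(\lvert x-\mu\rvert/\alpha\right)^{\beta}\right)$, where $\Gamma$ denotes the gamma function |
CDF | $\frac{1}{2} + \frac{\operatorname{sign}(x-\mu)}{2\,\Gamma(1/\beta)}\, \gamma\left(\frac{1}{\beta}, \left\lvert\frac{x-\mu}{\alpha}\right\rvert^{\beta}\right)$, where $\gamma$ denotes the lower incomplete gamma function |
Mean | $\mu$ |
Median | $\mu$ |
Mode | $\mu$ |
Variance | $\frac{\alpha^{2}\,\Gamma(3/\beta)}{\Gamma(1/\beta)}$ |
Skewness | 0 |
Ex. kurtosis | $\frac{\Gamma(5/\beta)\,\Gamma(1/\beta)}{\Gamma(3/\beta)^{2}} - 3$ |
Entropy | $\frac{1}{\beta} - \log\left[\frac{\beta}{2\alpha\,\Gamma(1/\beta)}\right]$ [1] |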
Known also as the exponential power distribution, or the generalized error distribution, this is a parametric family of symmetric distributions. It includes all normal and Laplace distributions, and as limiting cases it includes all continuous uniform distributions on bounded intervals of the real line.
This family includes the normal distribution when $\beta = 2$ (with mean $\mu$ and variance $\alpha^{2}/2$) and it includes the Laplace distribution when $\beta = 1$. As $\beta \to \infty$, the density converges pointwise to a uniform density on $(\mu - \alpha, \mu + \alpha)$.
This family allows for tails that are either heavier than normal (when $\beta < 2$) or lighter than normal (when $\beta > 2$). It is a useful way to parametrize a continuum of symmetric, platykurtic densities spanning from the normal ($\beta = 2$) to the uniform density ($\beta = \infty$), and a continuum of symmetric, leptokurtic densities spanning from the Laplace ($\beta = 1$) to the normal density ($\beta = 2$).
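These special cases are easy to check numerically. The sketch below assumes the usual version 1 density, $\frac{\beta}{2\alpha\Gamma(1/\beta)} \exp(-(\lvert x-\mu\rvert/\alpha)^{\beta})$; the function names are chosen for illustration only.

```python
import math


def gennorm_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Version 1 density with location mu, scale alpha and shape beta."""
    z = abs(x - mu) / alpha
    return beta / (2.0 * alpha * math.gamma(1.0 / beta)) * math.exp(-(z ** beta))


def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def laplace_pdf(x, mean, scale):
    """Laplace density with the given mean and scale."""
    return math.exp(-abs(x - mean) / scale) / (2.0 * scale)


# beta = 2 matches a normal density with variance alpha**2 / 2.
diff_normal = gennorm_pdf(0.7, 1.0, 1.5, 2.0) - normal_pdf(0.7, 1.0, 1.5 ** 2 / 2.0)
# beta = 1 matches a Laplace density with scale alpha.
diff_laplace = gennorm_pdf(0.7, 1.0, 1.5, 1.0) - laplace_pdf(0.7, 1.0, 1.5)
# A large beta approaches the uniform density 1 / (2 * alpha) inside (mu - alpha, mu + alpha).
near_uniform = gennorm_pdf(0.5, 0.0, 1.0, 200.0)
```

Both differences should be zero up to rounding, and `near_uniform` should be close to 0.5, the height of the uniform density on (-1, 1).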
Parameter estimation
Parameter estimation via maximum likelihood and the method of moments has been studied.[2] The estimates do not have a closed form and must be obtained numerically. Estimators that do not require numerical calculation have also been proposed.[3]
The generalized normal log-likelihood function has infinitely many continuous derivatives (i.e. it belongs to the class $C^{\infty}$ of smooth functions) only if $\beta$ is a positive, even integer. Otherwise, the function has $\lfloor \beta \rfloor$ continuous derivatives. As a result, the standard results for consistency and asymptotic normality of maximum likelihood estimates of $\beta$ only apply when $\beta \ge 2$.
Maximum likelihood estimator
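It is possible to fit the generalized normal distribution adopting an approximate maximum likelihood method.[4][5] With $\mu$ initially set to the sample first moment $m_1$, $\beta$ is estimated by using a Newton–Raphson iterative procedure, starting from an initial guess of $\beta = \beta_0$,
$$\beta_0 = \frac{m_1}{\sqrt{m_2}},$$
where
$$m_1 = \frac{1}{N} \sum_{i=1}^{N} \lvert x_i \rvert$$
is the first statistical moment of the absolute values and $m_2$ is the second statistical moment. The iteration is
$$\beta_{i+1} = \beta_i - \frac{g(\beta_i)}{g'(\beta_i)},$$
where
$$g(\beta) = 1 + \frac{\psi(1/\beta)}{\beta} - \frac{\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta} \log\lvert x_i-\mu\rvert}{\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}} + \frac{\log\left(\frac{\beta}{N} \sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}\right)}{\beta},$$
and
$$g'(\beta) = -\frac{\psi(1/\beta)}{\beta^{2}} - \frac{\psi'(1/\beta)}{\beta^{3}} + \frac{1}{\beta^{2}} - \frac{\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta} \left(\log\lvert x_i-\mu\rvert\right)^{2}}{\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}} + \frac{\left(\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta} \log\lvert x_i-\mu\rvert\right)^{2}}{\left(\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}\right)^{2}} + \frac{\sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta} \log\lvert x_i-\mu\rvert}{\beta \sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}} - \frac{\log\left(\frac{\beta}{N} \sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}\right)}{\beta^{2}},$$
and where $\psi$ and $\psi'$ are the digamma function and trigamma function.
Given a value for $\beta$, it is possible to estimate $\mu$ by finding the minimum of:
$$\min_{\mu} \sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}.$$
Finally $\alpha$ is evaluated as
$$\alpha = \left(\frac{\beta}{N} \sum_{i=1}^{N} \lvert x_i-\mu\rvert^{\beta}\right)^{1/\beta}.$$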
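A simplified version of this fit can be sketched in Python. Several assumptions are made and should be read as such: $\mu$ is held at the sample mean and not refined; the root of $g(\beta) = 1 + \psi(1/\beta)/\beta - S_1/S_0 + \log(\beta S_0/N)/\beta$, with $S_0 = \sum \lvert x_i-\mu\rvert^{\beta}$ and $S_1 = \sum \lvert x_i-\mu\rvert^{\beta}\log\lvert x_i-\mu\rvert$, is located by bisection on an assumed bracket [0.3, 10] rather than by Newton–Raphson (a deliberate swap that avoids needing $g'$ and cannot diverge); and the digamma function is a hand-written approximation because the standard library does not provide one. All function names are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma via psi(x) = psi(x + 1) - 1/x and an asymptotic series for x >= 6."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))


def g(beta, d):
    """Function of beta whose root is the shape estimate; d holds |x_i - mu| > 0."""
    s0 = sum(v ** beta for v in d)
    s1 = sum(v ** beta * math.log(v) for v in d)
    return (1.0 + digamma(1.0 / beta) / beta - s1 / s0
            + math.log(beta * s0 / len(d)) / beta)


def fit_generalized_normal(xs, lo=0.3, hi=10.0, tol=1e-8):
    """Return (mu, alpha, beta) with mu fixed at the sample mean."""
    mu = sum(xs) / len(xs)
    d = [abs(x - mu) for x in xs if x != mu]  # skip exact zeros: log(0) is undefined
    if not (g(lo, d) > 0.0 > g(hi, d)):
        raise ValueError("root of g is not bracketed by [lo, hi]")
    while hi - lo > tol:                      # bisection on the sign of g
        mid = 0.5 * (lo + hi)
        if g(mid, d) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = (beta / len(d) * sum(v ** beta for v in d)) ** (1.0 / beta)
    return mu, alpha, beta


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(4000)]
mu_hat, alpha_hat, beta_hat = fit_generalized_normal(data)
```

For standard normal data the fitted shape should come out near 2 and the fitted scale near $\sqrt{2}$, since a normal distribution with variance $\sigma^{2}$ corresponds to $\beta = 2$ and $\alpha = \sqrt{2}\,\sigma$.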
Applications
This version of the generalized normal distribution has been used in modeling when the concentration of values around the mean and the tail behavior are of particular interest.[6][7] Other families of distributions can be used if the focus is on other deviations from normality. If the symmetry of the distribution is the main interest, the skew normal family or version 2 of the generalized normal family discussed below can be used. If the tail behavior is the main interest, the Student's t family can be used, which approximates the normal distribution as the degrees of freedom grow to infinity. The t distribution, unlike this generalized normal distribution, obtains heavier than normal tails without acquiring a cusp at the origin.
Properties
The multivariate generalized normal distribution, i.e. the product of $n$ exponential power distributions with the same $\beta$ and $\alpha$ parameters, is the only probability density that can be written in the form $p(\mathbf{x}) = g(\lVert \mathbf{x} \rVert_{\beta})$ and has independent marginals.[8] The result for the special case of the multivariate normal distribution is originally attributed to Maxwell.[9]
Version 2
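Figures (not reproduced): probability density function; cumulative distribution function.
Parameters | $\xi$ location (real), $\alpha$ scale (positive, real), $\kappa$ shape (real) |
---|---|
Support | $x \in (-\infty, \xi + \alpha/\kappa)$ if $\kappa > 0$; $x \in (-\infty, +\infty)$ if $\kappa = 0$; $x \in (\xi + \alpha/\kappa, +\infty)$ if $\kappa < 0$ |
PDF | $\frac{\phi(y)}{\alpha - \kappa(x-\xi)}$, where $y = -\frac{1}{\kappa}\log\left[1 - \frac{\kappa(x-\xi)}{\alpha}\right]$ if $\kappa \ne 0$, $y = \frac{x-\xi}{\alpha}$ if $\kappa = 0$, and $\phi$ is the standard normal pdf |
CDF | $\Phi(y)$, where $y$ is defined as for the PDF and $\Phi$ is the standard normal CDF |
Mean | $\xi - \frac{\alpha}{\kappa}\left(e^{\kappa^{2}/2} - 1\right)$ |
Median | $\xi$ |
Variance | $\frac{\alpha^{2}}{\kappa^{2}}\, e^{\kappa^{2}}\left(e^{\kappa^{2}} - 1\right)$ |
Skewness | $\frac{3e^{\kappa^{2}} - e^{3\kappa^{2}} - 2}{\left(e^{\kappa^{2}} - 1\right)^{3/2}}\, \operatorname{sign}(\kappa)$ |
Ex. kurtosis | $e^{4\kappa^{2}} + 2e^{3\kappa^{2}} + 3e^{2\kappa^{2}} - 6$ |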
This is a family of continuous probability distributions in which the shape parameter can be used to introduce skew.[10][11] When the shape parameter is zero, the normal distribution results. Positive values of the shape parameter yield left-skewed distributions bounded to the right, and negative values of the shape parameter yield right-skewed distributions bounded to the left. Only when the shape parameter is zero is the density function for this distribution positive over the whole real line: in this case the distribution is a normal distribution, otherwise the distributions are shifted and possibly reversed log-normal distributions.
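The bounded support and the role of the shape parameter can be illustrated with a short sketch. It assumes the usual version 2 definitions: with $y = -\frac{1}{\kappa}\log[1 - \kappa(x-\xi)/\alpha]$ for $\kappa \ne 0$ and $y = (x-\xi)/\alpha$ for $\kappa = 0$, the density is $\phi(y)/(\alpha - \kappa(x-\xi))$ and the CDF is $\Phi(y)$. The function names are illustrative.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transformed(x, xi, alpha, kappa):
    """Return y for a point inside the support, or None outside it."""
    if kappa == 0.0:
        return (x - xi) / alpha
    t = 1.0 - kappa * (x - xi) / alpha
    if t <= 0.0:  # x lies at or beyond the bound xi + alpha / kappa
        return None
    return -math.log(t) / kappa


def gennorm2_pdf(x, xi, alpha, kappa):
    y = transformed(x, xi, alpha, kappa)
    if y is None:
        return 0.0
    return std_normal_pdf(y) / (alpha - kappa * (x - xi))


def gennorm2_cdf(x, xi, alpha, kappa):
    y = transformed(x, xi, alpha, kappa)
    if y is None:
        # Beyond the bound all mass lies below (kappa > 0) or above (kappa < 0).
        return 1.0 if kappa > 0.0 else 0.0
    return std_normal_cdf(y)
```

With `xi = 1`, `alpha = 2` and `kappa = 0.5` the support ends at `xi + alpha / kappa = 5`, so the density is zero at `x = 6`, while the CDF at `x = xi` is one half because `xi` is the median.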
Parameter estimation
Parameters can be estimated via maximum likelihood estimation or the method of moments. The parameter estimates do not have a closed form, so numerical calculations must be used to compute the estimates. Since the sample space (the set of real numbers where the density is non-zero) depends on the true value of the parameter, some standard results about the performance of parameter estimates will not automatically apply when working with this family.
Applications
This family of distributions can be used to model values that may be normally distributed, or that may be either right-skewed or left-skewed relative to the normal distribution. The skew normal distribution is another distribution that is useful for modeling deviations from normality due to skew. Other distributions used to model skewed data include the gamma, lognormal, and Weibull distributions, but these do not include the normal distributions as special cases.
Other distributions related to the normal
The two generalized normal families described here, like the skew normal family, are parametric families that extend the normal distribution by adding a shape parameter. Due to the central role of the normal distribution in probability and statistics, many distributions can be characterized in terms of their relationship to the normal distribution. For example, the lognormal, folded normal, and inverse normal distributions are defined as transformations of a normally-distributed value, but unlike the generalized normal and skew-normal families, these do not include the normal distributions as special cases.
Actually all distributions with finite variance are in the limit highly related to the normal distribution. The Student-t distribution, the Irwin–Hall distribution and the Bates distribution also extend the normal distribution, and include in the limit the normal distribution. So there is no strong reason to prefer the "generalized" normal distribution of type 1, e.g. over a combination of Student-t and a normalized extended Irwin–Hall – this would include e.g. the triangular distribution (which cannot be modeled by the generalized Gaussian type 1).
A symmetric distribution which can model both tail (long and short) and center behavior (like flat, triangular or Gaussian) completely independently could be derived e.g. by using X = IH/chi.
Skew normal distribution
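Figures (not reproduced): probability density function; cumulative distribution function.
Parameters | $\xi$ location (real), $\omega$ scale (positive, real), $\alpha$ shape (real) |
---|---|
Support | $x \in (-\infty, +\infty)$ |
PDF | $\frac{2}{\omega\sqrt{2\pi}}\, e^{-\frac{(x-\xi)^{2}}{2\omega^{2}}} \int_{-\infty}^{\alpha\left(\frac{x-\xi}{\omega}\right)} \frac{1}{\sqrt{2\pi}}\, e^{-t^{2}/2}\, dt$ |
CDF | $\Phi\left(\frac{x-\xi}{\omega}\right) - 2T\left(\frac{x-\xi}{\omega}, \alpha\right)$, where $T(h, a)$ is Owen's T function |
Mean | $\xi + \omega\delta\sqrt{\frac{2}{\pi}}$, where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$ |
Variance | $\omega^{2}\left(1 - \frac{2\delta^{2}}{\pi}\right)$ |
Skewness | $\frac{4-\pi}{2}\, \frac{\left(\delta\sqrt{2/\pi}\right)^{3}}{\left(1 - 2\delta^{2}/\pi\right)^{3/2}}$ |
Ex. kurtosis | $2(\pi - 3)\, \frac{\left(\delta\sqrt{2/\pi}\right)^{4}}{\left(1 - 2\delta^{2}/\pi\right)^{2}}$ |
MGF | $M_X(t) = 2\exp\left(\xi t + \frac{\omega^{2}t^{2}}{2}\right)\Phi(\omega\delta t)$ |
CF | $e^{it\xi - t^{2}\omega^{2}/2}\left(1 + i\,\operatorname{erfi}\left(\frac{\delta\omega t}{\sqrt{2}}\right)\right)$ |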
In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.
Definition
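Let $\phi(x)$ denote the standard normal probability density function
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$$
with the cumulative distribution function given by
$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right],$$
where erf is the error function. Then the probability density function (pdf) of the skew-normal distribution with parameter $\alpha$ is given by
$$f(x) = 2\,\phi(x)\,\Phi(\alpha x).$$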
This distribution was first introduced by O'Hagan and Leonard (1976). A popular alternative parameterization is due to Mudholkar and Hutson (2000), which has a form of the c.d.f. that is easily inverted such that there is a closed form solution to the quantile function.
A stochastic process that underpins the distribution was described by Andel, Netuka and Zvara (1984).[1] Both the distribution and its stochastic process underpinnings were consequences of the symmetry argument developed in Chan and Tong (1986), which applies to multivariate cases beyond normality, e.g. skew multivariate t distribution and others. The distribution is a particular case of a general class of distributions with probability density functions of the form f(x)=2 φ(x) Φ(x) where φ() is any PDF symmetric about zero and Φ() is any CDF whose PDF is symmetric about zero.[2]
To add location and scale parameters to this, one makes the usual transform $x \to \frac{x-\xi}{\omega}$. One can verify that the normal distribution is recovered when $\alpha = 0$, and that the absolute value of the skewness increases as the absolute value of $\alpha$ increases. The distribution is right skewed if $\alpha > 0$ and is left skewed if $\alpha < 0$. The probability density function with location $\xi$, scale $\omega$, and parameter $\alpha$ becomes
$$f(x) = \frac{2}{\omega}\, \phi\left(\frac{x-\xi}{\omega}\right) \Phi\left(\alpha\, \frac{x-\xi}{\omega}\right).$$
Note, however, that the skewness of the distribution is limited to the interval $(-1, 1)$.
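A short sketch makes these properties concrete. It assumes the density $f(x) = \frac{2}{\omega}\phi(z)\Phi(\alpha z)$ with $z = (x-\xi)/\omega$, and the standard skewness expression $\frac{4-\pi}{2}\,\frac{(\delta\sqrt{2/\pi})^{3}}{(1-2\delta^{2}/\pi)^{3/2}}$ with $\delta = \alpha/\sqrt{1+\alpha^{2}}$; the function names are illustrative.

```python
import math


def skew_normal_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew normal density with location xi, scale omega and shape alpha."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * big_phi


def skewness(alpha):
    """Theoretical skewness as a function of the shape parameter alpha."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u = delta * math.sqrt(2.0 / math.pi)
    return (4.0 - math.pi) / 2.0 * u ** 3 / (1.0 - u * u) ** 1.5


# alpha = 0 gives back the standard normal density.
normal_at_04 = math.exp(-0.5 * 0.4 ** 2) / math.sqrt(2.0 * math.pi)
diff_normal = skew_normal_pdf(0.4) - normal_at_04
# The density still integrates to one for a non-zero shape (Riemann sum on [-10, 10]).
total = sum(skew_normal_pdf(-10.0 + 0.01 * i, alpha=3.0) for i in range(2001)) * 0.01
# The skewness is odd in alpha and stays below 1 in absolute value.
skew_values = [skewness(a) for a in (-3.0, 0.0, 1.0, 3.0, 1e6)]
```

The last entry of `skew_values` uses a very large `alpha` as a stand-in for the limit, which shows how close to 1 the skewness can get without reaching it.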
Estimation
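Maximum likelihood estimates for $\xi$, $\omega$, and $\alpha$ can be computed numerically, but no closed-form expression for the estimates is available unless $\alpha = 0$. If a closed-form expression is needed, the method of moments can be applied to estimate $\alpha$ from the sample skew, by inverting the skewness equation. This yields the estimate
$$\lvert\delta\rvert = \sqrt{\frac{\pi}{2}\, \frac{\lvert\hat{\gamma}_1\rvert^{2/3}}{\lvert\hat{\gamma}_1\rvert^{2/3} + \left((4-\pi)/2\right)^{2/3}}}$$
where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$, and $\hat{\gamma}_1$ is the sample skew. The sign of $\delta$ is the same as the sign of $\hat{\gamma}_1$. Consequently, $\hat{\alpha} = \delta/\sqrt{1-\delta^{2}}$.
The maximum (theoretical) skewness is obtained by setting $\delta = 1$ in the skewness equation, giving $\gamma_1 \approx 0.9952717$. However it is possible that the sample skewness is larger, and then $\alpha$ cannot be determined from these equations. When using the method of moments in an automatic fashion, for example to give starting values for maximum likelihood iteration, one should therefore let (for example) $\lvert\hat{\gamma}_1\rvert = \min(0.99, \lvert\text{sample skewness}\rvert)$.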
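The inversion and the clipping step can be sketched as follows. The cap of 0.99 follows the suggestion in the text; the function name and the `cap` argument are hypothetical.

```python
import math


def alpha_from_sample_skew(sample_skew, cap=0.99):
    """Method-of-moments estimate of alpha from a sample skewness."""
    g = min(cap, abs(sample_skew))              # clip so that |delta| stays below 1
    c = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    g23 = g ** (2.0 / 3.0)
    delta = math.sqrt(math.pi / 2.0 * g23 / (g23 + c))
    delta = math.copysign(delta, sample_skew)   # delta takes the sign of the skew
    return delta / math.sqrt(1.0 - delta * delta)


alpha_small = alpha_from_sample_skew(0.2)
alpha_negative = alpha_from_sample_skew(-0.2)
alpha_clipped = alpha_from_sample_skew(1.7)     # larger than the theoretical maximum
```

A sample skewness above the theoretical maximum, such as 1.7 here, is clipped to the cap and still yields a finite starting value for `alpha`.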
Concern has been expressed about the impact of skew normal methods on the reliability of inferences based upon them.[3]
Differential equation
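The differential equation leading to the pdf of the skew normal distribution is
$$\omega^{4} f''(x) + \left(\alpha^{2}+2\right)\omega^{2}(x-\xi)\, f'(x) + f(x)\left[\left(\alpha^{2}+1\right)(x-\xi)^{2} + \omega^{2}\right] = 0,$$
with initial conditions
$$f(0) = \frac{1}{\sqrt{2\pi}\,\omega} \exp\left(-\frac{\xi^{2}}{2\omega^{2}}\right) \operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right), \qquad f'(0) = \frac{1}{\omega^{2}}\left[\xi f(0) + \frac{\alpha}{\pi} \exp\left(-\frac{\left(1+\alpha^{2}\right)\xi^{2}}{2\omega^{2}}\right)\right].$$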
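Generalized Gaussian distribution: sub-Gaussian, Gaussian, and super-Gaussian signals
The Gaussianity of a signal is defined through its kurtosis. For a signal x with zero mean, the kurtosis is defined as kurt(x) = E{x^4} - 3[E{x^2}]^2, and:
- kurt(x) < 0: sub-Gaussian signal
- kurt(x) = 0: Gaussian signal
- kurt(x) > 0: super-Gaussian signal
Given a sample of an arbitrary signal x, its kurtosis can be computed as follows to judge its Gaussianity. Assuming x is a 1*N row vector (Matlab):
    x = x - mean(x)*ones(1,N);              % remove the mean
    KurtX = mean(x.^4) - 3*(mean(x.^2))^2;  % compute the kurtosis
A uniformly distributed signal is sub-Gaussian, and a Laplace-distributed signal is super-Gaussian. Speech signals are super-Gaussian. By the central limit theorem, the sum of N independent signals with different distributions tends toward a Gaussian distribution, so the non-Gaussianity of a signal is a good optimization criterion for blind source separation. Compared with a Gaussian signal, a sub-Gaussian signal is flatter and may be multimodal, while a super-Gaussian signal is more sharply peaked and has longer tails.
For a Gaussian signal, second-order statistics are enough to describe its properties. Typical signals in communication systems, however, are usually sub-Gaussian, so second-order statistics are not enough and higher-order statistics are needed.
A non-stationary signal can be understood simply as one whose distribution parameters or distribution law change over time. A Gaussian signal is one whose distribution follows the normal distribution. A non-stationary Gaussian signal is one whose distribution law does not change over time (it is always Gaussian) but whose distribution parameters (mean and variance) do. Non-stationary signals are mainly handled with time-frequency analysis and wavelet analysis.
Addendum: a Gaussian signal is a signal whose amplitude values occur with frequencies that follow a Gaussian distribution. From the point of view of ICA, the drawback of a Gaussian signal is that it looks like a pile of corn (its probability density curve really does resemble one): pour a second pile of corn onto the first and the result is still a pile of corn, with nothing to show that it was mixed from two piles, so in theory the sources cannot be separated.
A super-Gaussian distribution is more concentrated than the Gaussian distribution, and a sub-Gaussian distribution is flatter. Super-Gaussian: fourth-order cumulant greater than 0. Sub-Gaussian: fourth-order cumulant less than 0.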
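The kurtosis test described above can be sketched in Python using only the standard library. The three synthetic signals and their sample sizes are assumptions chosen for illustration; the Laplace sample is built as an exponential magnitude with a random sign.

```python
import random


def kurt(xs):
    """kurt(x) = E{x^4} - 3 (E{x^2})^2, computed after removing the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    m2 = sum(v ** 2 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m4 - 3.0 * m2 ** 2


rng = random.Random(0)
uniform_signal = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
gauss_signal = [rng.gauss(0.0, 1.0) for _ in range(20000)]
laplace_signal = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(20000)]

kurt_uniform = kurt(uniform_signal)   # expected to be negative (sub-Gaussian)
kurt_gauss = kurt(gauss_signal)       # expected to be close to zero
kurt_laplace = kurt(laplace_signal)   # expected to be positive (super-Gaussian)
```

Note that this kurtosis is not normalized by the variance, so its magnitude depends on the signal's scale; only its sign is used to classify the signal.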