The goal of whitening is to make the input less redundant; more formally, our desiderata are that our learning algorithms sees a training input where (i) the features are less correlated with each other, and (ii) the features all have the same variance.

example

How can we make our input features uncorrelated with each other? We had already done this when computing . Repeating our previous figure, our plot for was:

The covariance matrix of this data is given by:

It is no accident that the diagonal values are and . Further, the off-diagonal entries are zero; thus, and are uncorrelated, satisfying one of our desiderata for whitened data (that the features be less correlated).

To make each of our input features have unit variance, we can simply rescale each feature by . Concretely, we define our whitened data as follows:

Plotting , we get:

This data now has covariance equal to the identity matrix . We say that is our PCA whitened version of the data: The different components of are uncorrelated and have unit variance.

ZCA Whitening

Finally, it turns out that this way of getting the data to have covariance identity isn't unique. Concretely, if is any orthogonal matrix, so that it satisfies (less formally, if is a rotation/reflection matrix), then will also have identity covariance. In ZCA whitening, we choose . We define

Plotting , we get:

It can be shown that out of all possible choices for , this choice of rotation causes to be as close as possible to the original input data .

When using ZCA whitening (unlike PCA whitening), we usually keep all dimensions of the data, and do not try to reduce its dimension.

Regularizaton

When implementing PCA whitening or ZCA whitening in practice, sometimes some of the eigenvalues will be numerically close to 0, and thus the scaling step where we divide by would involve dividing by a value close to zero; this may cause the data to blow up (take on large values) or otherwise be numerically unstable. In practice, we therefore implement this scaling step using a small amount of regularization, and add a small constant to the eigenvalues before taking their square root and inverse:

When takes values around , a value of might be typical.

For the case of images, adding here also has the effect of slightly smoothing (or low-pass filtering) the input image. This also has a desirable effect of removing aliasing artifacts caused by the way pixels are laid out in an image, and can improve the features learned (details are beyond the scope of these notes).

ZCA whitening is a form of pre-processing of the data that maps it from to . It turns out that this is also a rough model of how the biological eye (the retina) processes images. Specifically, as your eye perceives images, most adjacent "pixels" in your eye will perceive very similar values, since adjacent parts of an image tend to be highly correlated in intensity. It is thus wasteful for your eye to have to transmit every pixel separately (via your optic nerve) to your brain. Instead, your retina performs a decorrelation operation (this is done via retinal neurons that compute a function called "on center, off surround/off center, on surround") which is similar to that performed by ZCA. This results in a less redundant representation of the input image, which is then transmitted to your brain.

Whitening的更多相关文章

  1. (六)6.8 Neurons Networks implements of PCA ZCA and whitening

    PCA 给定一组二维数据,每列十一组样本,共45个样本点 -6.7644914e-01  -6.3089308e-01  -4.8915202e-01 ... -4.4722050e-01  -7.4 ...

  2. (六)6.7 Neurons Networks whitening

    PCA的过程结束后,还有一个与之相关的预处理步骤,白化(whitening) 对于输入数据之间有很强的相关性,所以用于训练数据是有很大冗余的,白化的作用就是降低输入数据的冗余,通过白化可以达到(1)降 ...

  3. UFLDL教程之(三)PCA and Whitening exercise

    Exercise:PCA and Whitening 第0步:数据准备 UFLDL下载的文件中,包含数据集IMAGES_RAW,它是一个512*512*10的矩阵,也就是10幅512*512的图像 ( ...

  4. Deep Learning学习随记(二)Vectorized、PCA和Whitening

    接着上次的记,前面看了稀疏自编码.按照讲义,接下来是Vectorized, 翻译成向量化?暂且这么认为吧. Vectorized: 这节是老师教我们编程技巧了,这个向量化的意思说白了就是利用已经被优化 ...

  5. Modeling Filters and Whitening Filters

    Colored and White Process White Process White Process,又称为White Noise(白噪声),其中white来源于白光,寓意着PSD的平坦分布,w ...

  6. 白化(Whitening): PCA 与 ZCA (转)

    转自:findbill 本文讨论白化(Whitening),以及白化与 PCA(Principal Component Analysis) 和 ZCA(Zero-phase Component Ana ...

  7. CS229 6.8 Neurons Networks implements of PCA ZCA and whitening

    PCA 给定一组二维数据,每列十一组样本,共45个样本点 -6.7644914e-01  -6.3089308e-01  -4.8915202e-01 ... -4.4722050e-01  -7.4 ...

  8. CS229 6.7 Neurons Networks whitening

    PCA的过程结束后,还有一个与之相关的预处理步骤,白化(whitening) 对于输入数据之间有很强的相关性,所以用于训练数据是有很大冗余的,白化的作用就是降低输入数据的冗余,通过白化可以达到(1)降 ...

  9. PCA和Whitening

    PCA: PCA的具有2个功能,一是维数约简(可以加快算法的训练速度,减小内存消耗等),一是数据的可视化. PCA并不是线性回归,因为线性回归是保证得到的函数是y值方面误差最小,而PCA是保证得到的函 ...

  10. 【DeepLearning】Exercise:PCA and Whitening

    Exercise:PCA and Whitening 习题链接:Exercise:PCA and Whitening pca_gen.m %%============================= ...

随机推荐

  1. jquery easyui ajax data属性传值方式

    $.ajax({   url:url,   type:'post',   data:data,   dataType:'json',   contentType: "application/ ...

  2. Host status showing red icon in chronograph, Chronograf主机列表页显示主机状态为红色标志

    刚开始全部装好的时候主机显示的状态是绿色的,过了些日子我再打开看的时候就变成红色的了,点击主机进去查看的时候没有了图表数据,大概是这样子的, 在influxdb数据库主机上执行命令curl " ...

  3. HDU-1789 Doing Homework again 贪心问题 有时间限制的最小化惩罚问题

    题目链接:https://cn.vjudge.net/problem/HDU-1789 题意 小明有一大堆作业没写,且做一个作业就要花一天时间 给出所有作业的时间限制,和不写作业后要扣的分数 问如何安 ...

  4. [POI2009]KON-Ticket Inspector(二维前缀和+DP)

    题意 有n个车站,现在有一辆火车从1到n驶过,给出aij代表从i站上车j站下车的人的个数.列车行驶过程中你有K次检票机会,所有当前在车上的人会被检票,问最多能检多少个不同的人的票 (n<=600 ...

  5. 入门python:《Python编程从入门到实践》中文PDF+英文PDF+代码学习

    入门python推荐学习久负盛名的python入门书籍<Python编程从入门到实践>. 书中涵盖的内容是比较精简的,没有艰深晦涩的概念,最重要的是每个小结都附带有"动手试一试& ...

  6. python学习---【__all__】

    python包中都会有一个__init__.py的模块,这个模块是区分该父文件是一个python包,或是一个文件目录.这个__init__.py可以是空,也可以添加内容,最常见的就是其中的__all_ ...

  7. 洛谷P1164 小A点菜 && caioj 1410 动态规划1:点菜(背包方案问题)

    方程很简单 f[0] = 1 f[j] += f[j-w[i]] #include<cstdio> #define REP(i, a, b) for(int i = (a); i < ...

  8. ECNUOJ 2855 贪吃蛇

    贪吃蛇 Time Limit:1000MS Memory Limit:65536KBTotal Submit:480 Accepted:109 Description  相信很多人都玩过这个游戏,当然 ...

  9. Testin实验室:陌陌APP通过率为94.92% 基本满足移动社交需求

    Testin实验室:陌陌APP通过率为94.92% 基本满足移动社交需求 2014/11/10 · Testin · 独家评測 11月8日,国内移动社交应用陌陌公开向美国证券交易委员会提交了IPO申请 ...

  10. Cocos2d-x学习笔记(十四)CCAutoreleasePool具体解释

    原创文章,转载请注明出处:http://blog.csdn.net/sfh366958228/article/details/38964637 前言 之前学了那么多的内容.差点儿全部的控件都要涉及内存 ...