7.1 Singular values and Singular vectors

The SVD separates any matrix into simple pieces.

A is any m by n matrix, square or rectangular. Its rank is r.

Choices from the SVD

\[AA^Tu_i = \sigma_i^{2}u_i \\
A^TAv_i = \sigma_i^{2}v_i \\
Av_i = \sigma_i u_i
\]

\(u_i\)— the left singular vectors (unit eigenvectors of \(AA^T\))

\(v_i\)— the right singular vectors (unit eigenvectors of \(A^TA\))

\(\sigma_i\)— singular values (square roots of the equal eigenvalues of \(AA^T\) and \(A^TA\))

The rank of A equals the number of nonzero singular values \(\sigma _i\).

example:

\[A = \left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right] \\
\Downarrow \\
AA^T =
\left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right]
\left [ \begin{matrix} 1&1 \\ 0&1 \end{matrix}\right]
=\left [ \begin{matrix} 1&1 \\ 1&2 \end{matrix}\right]
\\
A^TA =
\left [ \begin{matrix} 1&1 \\ 0&1 \end{matrix}\right]
\left [ \begin{matrix} 1&0 \\ 1&1 \end{matrix}\right]
=\left [ \begin{matrix} 2&1 \\ 1&1 \end{matrix}\right]
\\
\Downarrow \\
\det(AA^T - \lambda I) = 0 \ \quad \ \det(A^TA - \lambda I) = 0 \\
\lambda_1 = \frac{3+\sqrt{5}}{2} , \sigma_1=\frac{1+\sqrt{5}}{2},
u_1= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} 1 \\ \sigma_1 \end{matrix}\right],
v_1= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} \sigma_1 \\ 1 \end{matrix}\right]
\\
\lambda_2 = \frac{3-\sqrt{5}}{2} , \sigma_2=\frac{\sqrt{5}-1}{2},
u_2= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} \sigma_1 \\ -1 \end{matrix}\right],
v_2= \frac{1}{\sqrt{1+\sigma_1^2}}\left [ \begin{matrix} 1 \\ -\sigma_1 \end{matrix}\right]\\
\Downarrow \\
A =
\left [ \begin{matrix} u_1&u_2 \end{matrix}\right]
\left [ \begin{matrix} \sigma_1&\\&\sigma_2 \end{matrix}\right]
\left [ \begin{matrix} v_1^T\\v_2^T \end{matrix}\right]
\\
A\left [ \begin{matrix} v_1&v_2 \end{matrix}\right] =
\left [ \begin{matrix} u_1&u_2 \end{matrix}\right]
\left [ \begin{matrix} \sigma_1&\\&\sigma_2 \end{matrix}\right]
\]
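
The hand computation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative. It compares the eigenvalues of \(AA^T\) and \(A^TA\) with the squared singular values and measures how well \(Av_i=\sigma_i u_i\) holds. The signs of the computed singular vectors may differ from the hand-derived ones, so the check works with residuals rather than with the vectors themselves.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0]])

# Eigenvalues of the symmetric matrices A A^T and A^T A (ascending order).
eig_AAt = np.linalg.eigvalsh(A @ A.T)
eig_AtA = np.linalg.eigvalsh(A.T @ A)

# Singular values come back in descending order; the rows of Vt are the v_i.
U, s, Vt = np.linalg.svd(A)

# sigma_1 is the golden ratio and sigma_2 = sigma_1 - 1 = 1 / sigma_1.
golden = (1 + np.sqrt(5)) / 2

print("eigenvalues of AA^T:", eig_AAt)
print("eigenvalues of A^TA:", eig_AtA)
print("singular values:    ", s)
print("hand-derived values:", [golden, golden - 1])

# Residuals of A v_i = sigma_i u_i for each singular pair.
residuals = [np.linalg.norm(A @ Vt[i] - s[i] * U[:, i]) for i in range(2)]
print("residuals:", residuals)

# Rank = number of nonzero singular values (tolerance is an arbitrary choice).
rank = int(np.sum(s > 1e-12))
print("rank:", rank)
```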

7.2 Bases and Matrices in the SVD

Keys:

  1. The SVD produces orthonormal bases of \(u\)'s and \(v\)'s for the four fundamental subspaces.

    • \(u_1,u_2,...,u_r\) is an orthonormal basis of the column space (in \(R^m\)).
    • \(u_{r+1},...,u_{m}\) is an orthonormal basis of the left nullspace (in \(R^m\)).
    • \(v_1,v_2,...,v_r\) is an orthonormal basis of the row space (in \(R^n\)).
    • \(v_{r+1},...,v_{n}\) is an orthonormal basis of the nullspace (in \(R^n\)).
  2. Using those bases, A can be diagonalized:

    Reduced SVD: only with bases for the row space and column space.

    \[A = U_r \Sigma_r V_r^T \\
    U_r = \left [ \begin{matrix} u_1&\cdots&u_r\\ \end{matrix}\right] ,
    \Sigma_r = \left [ \begin{matrix} \sigma_1&&\\&\ddots&\\&&\sigma_r \end{matrix}\right],
    V_r^T=\left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \end{matrix}\right] \\
    \Downarrow \\
    A = \left [ \begin{matrix} u_1&\cdots&u_r\\ \end{matrix}\right]
    \left [ \begin{matrix} \sigma_1&&\\&\ddots&\\&&\sigma_r \end{matrix}\right]
    \left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \end{matrix}\right] \\
    = u_1\sigma_1v_{1}^T + u_2\sigma_2v_{2}^T + \cdots + u_r\sigma_rv_r^T
    \]

    Full SVD: includes bases for all four subspaces. U is m by m, \(\Sigma\) is m by n, and V is n by n. Every entry of \(\Sigma\) outside the block \(\Sigma_r\) is zero, so the extra u's and v's contribute nothing to the sum.

    \[A = U \Sigma V^T \\
    U = \left [ \begin{matrix} u_1&\cdots&u_r&\cdots&u_m\\ \end{matrix}\right] ,
    \Sigma = \left [ \begin{matrix} \sigma_1&&&\\&\ddots&&\\&&\sigma_r& \\ &&&0 \end{matrix}\right],
    V^T=\left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \\ \vdots \\ v_n^T \end{matrix}\right] \\
    \Downarrow \\
    A = \left [ \begin{matrix} u_1&\cdots&u_r&\cdots&u_m\\ \end{matrix}\right]
    \left [ \begin{matrix} \sigma_1&&&\\&\ddots&&\\&&\sigma_r& \\ &&&0 \end{matrix}\right]
    \left [ \begin{matrix} v_1^T\\ \vdots \\ v_r^T \\ \vdots \\ v_n^T \end{matrix}\right] \\
    = u_1\sigma_1v_{1}^T + u_2\sigma_2v_{2}^T + \cdots + u_r\sigma_rv_r^T
    \]

    example: \(A=\left [ \begin{matrix} 3&0 \\ 4&5 \end{matrix}\right]\), r=2

    \[A^TA =\left [ \begin{matrix} 25&20 \\ 20&25 \end{matrix}\right],
    AA^T =\left [ \begin{matrix} 9&12 \\ 12&41 \end{matrix}\right] \\
    \lambda_1 = 45, \sigma_1 = \sqrt{45},
    v_1 = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} 1 \\ 1 \end{matrix}\right],
    u_1 = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} 1 \\ 3 \end{matrix}\right]\\
    \lambda_2 = 5, \sigma_2 = \sqrt{5} ,
    v_2 = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} -1 \\ 1 \end{matrix}\right],
    u_2 = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} -3 \\ 1 \end{matrix}\right]\\
    \Downarrow \\
    U = \frac{1}{\sqrt{10}}
    \left [ \begin{matrix} 1&-3 \\ 3&1 \end{matrix}\right],
    \Sigma = \left [ \begin{matrix} \sqrt{45}& \\ &\sqrt{5} \end{matrix}\right],
    V = \frac{1}{\sqrt{2}}
    \left [ \begin{matrix} 1&-1 \\ 1&1 \end{matrix}\right]
    \]
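
As a rough numerical check of this section, the sketch below (assuming NumPy) recomputes the SVD of the example matrix, rebuilds A from its rank-one pieces \(u_i\sigma_iv_i^T\), and then contrasts the full and reduced SVD on a rank-one 3 by 2 matrix B. The matrix B and all variable names are illustrative assumptions, not part of the notes. Computed singular vectors are only determined up to sign, so comparisons with the hand values use absolute inner products.

```python
import numpy as np

# The 2 by 2 example: singular values should be sqrt(45) and sqrt(5).
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])
U, s, Vt = np.linalg.svd(A)

# Rebuild A as a sum of rank-one pieces u_i * sigma_i * v_i^T.
pieces = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]
A_rebuilt = sum(pieces)

# Hand-derived first singular vectors (defined up to a common sign).
u1 = np.array([1.0, 3.0]) / np.sqrt(10)
v1 = np.array([1.0, 1.0]) / np.sqrt(2)
print("sigma:", s)
print("|u1 . U[:,0]| =", abs(U[:, 0] @ u1))
print("|v1 . V[:,0]| =", abs(Vt[0] @ v1))

# Illustrative rank-one matrix (m = 3, n = 2) to show all four subspaces.
B = np.outer([1.0, 2.0, 2.0], [3.0, 4.0])
Uf, sf, Vtf = np.linalg.svd(B, full_matrices=True)   # Uf is 3x3, Vtf is 2x2

# Numerical rank: count singular values above a relative tolerance.
r = int(np.sum(sf > 1e-10 * sf[0]))

# Reduced SVD keeps only the first r columns of U and rows of V^T.
Ur, Sr, Vtr = Uf[:, :r], np.diag(sf[:r]), Vtf[:r]

# Remaining vectors span the nullspace and the left nullspace.
null_basis = Vtf[r:].T        # columns v_{r+1}, ..., v_n
left_null_basis = Uf[:, r:]   # columns u_{r+1}, ..., u_m

print("rank of B:", r)
print("B @ null_basis:\n", B @ null_basis)
print("left_null_basis^T @ B:\n", left_null_basis.T @ B)
```

Passing `full_matrices=False` to `np.linalg.svd` returns min(m, n) columns rather than r, which is why the reduced factors are sliced by the numerical rank here.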

7.3 The geometry of the SVD

  1. \(A = U\Sigma V^T\) factors into (rotation)(stretching)(rotation). The geometry shows how A transforms vectors x on a circle to vectors Ax on an ellipse.

  2. Polar decomposition factors A into QS: rotation \(Q=UV^T\) times stretching \(S=V \Sigma V^T\).

    \[V^TV = I \\
    A = U\Sigma V^T = (UV^T)(V\Sigma V^T) = (Q)(S)
    \]

    Q is orthogonal and includes both rotations U and \(V^T\). S is symmetric positive semidefinite and gives the stretching directions.

    If A is invertible, S is positive definite.

  3. The pseudoinverse \(A^{+}\): \(AA^{+}=I\) holds only when A has full row rank. In general \(AA^{+}\) is the projection onto the column space of A.

    • \(Av_i=\sigma_iu_i\) : A multiplies \(v_i\) in the row space of A to give \(\sigma_i u_i\) in the column space of A.

    • If \(A^{-1}\) exists, \(A^{-1}u_i=\frac{v_i}{\sigma_i}\) : \(A^{-1}\) multiplies \(u_i\) in the row space of \(A^{-1}\) to give \(v_i/\sigma_i\) in the column space of \(A^{-1}\). The numbers \(1/\sigma_i\) are the singular values of \(A^{-1}\).

    • Pseudoinverse of A: if \(A^{-1}\) exists, then \(A^{+}\) is the same as \(A^{-1}\)

      \[A^{+} = V \Sigma^{+}U^{T} = \left [ \begin{matrix} v_1&\cdots&v_r&\cdots&v_n\\ \end{matrix}\right]
      \left [ \begin{matrix} \sigma_1^{-1}&&&\\&\ddots&&\\&&\sigma_r^{-1}& \\ &&&0 \end{matrix}\right]
      \left [ \begin{matrix} u_1^T\\ \vdots \\ u_r^T \\ \vdots \\ u_m^T \end{matrix}\right]
      \]
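
The polar decomposition and the pseudoinverse can both be assembled directly from the SVD factors. The sketch below assumes NumPy. It reuses the matrix \(A=\left [ \begin{matrix} 3&0 \\ 4&5 \end{matrix}\right]\) from section 7.2 for the polar factors, and a rank-one matrix B (an illustrative choice, not from the notes) for the pseudoinverse. The tolerance that decides which singular values count as zero is also an assumption.

```python
import numpy as np

# --- Polar decomposition A = Q S from the SVD ---
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])
U, s, Vt = np.linalg.svd(A)

Q = U @ Vt                      # orthogonal factor Q = U V^T
S = Vt.T @ np.diag(s) @ Vt      # symmetric factor  S = V Sigma V^T

print("Q =\n", Q)
print("S =\n", S)
print("Q S =\n", Q @ S)

# --- Pseudoinverse A^+ = V Sigma^+ U^T for a singular (rank-one) matrix ---
B = np.array([[2.0, 2.0],
              [1.0, 1.0]])
Ub, sb, Vtb = np.linalg.svd(B)

# Invert only the nonzero singular values; zeros stay zero in Sigma^+.
tol = 1e-12 * sb[0]
s_plus = np.array([1.0 / x if x > tol else 0.0 for x in sb])
B_plus = Vtb.T @ np.diag(s_plus) @ Ub.T

print("B^+ =\n", B_plus)
# B B^+ is the projection onto the column space of B, not the identity.
print("B B^+ =\n", B @ B_plus)
```

Because B is square here, \(\Sigma^{+}\) has the same shape as \(\Sigma\). For a rectangular m by n matrix, \(\Sigma^{+}\) would be n by m, and `np.linalg.pinv` handles that case directly.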

7.4 Principal Component Analysis (PCA by the SVD)

PCA gives a way to understand a data plot in dimension m. Its applications include human genetics, face recognition, finance, and model order reduction (computation).

The sample covariance matrix is \(S=AA^T/(n-1)\), where A is the m by n matrix of centered data: n samples of m variables, with the mean subtracted from each row.

The crucial connection to linear algebra is in the singular values and singular vectors of A. They come from the eigenvalues \(\lambda=\sigma^2\) and the eigenvectors u of the sample covariance matrix S. With this convention the \(\sigma_i\) are the singular values of \(A/\sqrt{n-1}\).

  1. The total variance in the data is the sum of all eigenvalues of S, which equals the sum of the sample variances \(s^2\):

    \[T = \sigma_1^2 + \cdots + \sigma_m^2 = s_1^2 + \cdots + s_m^2 = \text{trace}(S) \ \ (\text{diagonal sum})
    \]
  2. The first eigenvector \(u_1\) of S points in the most significant direction of the data. That direction accounts for a fraction \(\sigma_1^2/T\) of the total variance.

  3. The next eigenvector \(u_2\) (orthogonal to \(u_1\)) accounts for a smaller fraction \(\sigma_2^2/T\).

  4. Stop when those fractions are small. You have the R directions that explain most of the data. The n data points are very near an R-dimensional subspace with basis \(u_1, \cdots, u_R\), which are the principal components.

  5. R is the "effective rank" of A. The true rank r is probably m or n: a full rank matrix.
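
The five steps above can be sketched as a small function. This is a minimal illustration assuming NumPy; the function name `pca_by_svd`, the 95% cutoff for choosing R, and the synthetic data are all hypothetical choices made for the example, not something prescribed by the notes.

```python
import numpy as np

def pca_by_svd(data, threshold=0.95):
    """PCA for an m x n array: rows are variables, columns are samples.

    Returns the principal directions U, the variances sigma_i^2
    (eigenvalues of S), their fractions of the total variance T, and
    the effective rank R needed to reach `threshold` of T.
    """
    m, n = data.shape
    A = data - data.mean(axis=1, keepdims=True)      # center each row
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    variances = s**2 / (n - 1)                       # eigenvalues of S = A A^T / (n-1)
    T = variances.sum()                              # total variance = trace(S)
    fractions = variances / T
    # Smallest R whose cumulative fraction reaches the threshold.
    R = int(np.searchsorted(np.cumsum(fractions), threshold) + 1)
    R = min(R, len(fractions))
    return U, variances, fractions, R

# Synthetic data: 200 points in R^3 lying close to a single line.
rng = np.random.default_rng(0)
direction = np.array([1.0, 2.0, 2.0]) / 3.0          # unit vector
t = rng.normal(size=200)
data = np.outer(direction, t) + 0.01 * rng.normal(size=(3, 200))

U, variances, fractions, R = pca_by_svd(data)
print("fractions of total variance:", fractions)
print("effective rank R:", R)
print("|u_1 . direction| =", abs(U[:, 0] @ direction))
```

The true rank of the centered data matrix is 3 here, yet one direction carries nearly all the variance, which is the distinction between rank and effective rank made in step 5.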

example: \(A = \left[ \begin{matrix} 3&-4&7&1&-4&-3 \\ 7&-6&8&-1&-1&-7 \end{matrix} \right]\) (each row adds to zero, so the data is centered) has sample covariance \(S=AA^T/5 = \left [ \begin{matrix} 20&25 \\ 25&40 \end{matrix}\right]\)

The eigenvalues of S are near 57 and 3, so the first rank one piece \(\sqrt{57}u_1v_1^T\) of \(A/\sqrt{5}\) is much larger than the second piece \(\sqrt{3}u_2v_2^T\).

The leading eigenvector \(u_1 \approx (0.6,0.8)\) shows the direction that you see in the scatter graph.

The SVD of A (centered data) shows the dominant direction in the scatter plot.

The second eigenvector \(u_2\) is perpendicular to \(u_1\). The second singular value \(\sigma_2 \approx \sqrt{3}\) measures the spread across the dominant line.
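
The numbers in this example can be reproduced in a few lines. The sketch below assumes NumPy and takes the fourth entry of the first row of A to be 1, so that each row sums to zero and \(AA^T/5\) matches the covariance matrix S stated above. It computes the exact eigenvalues of S, for which 57 and 3 are rounded values, and the leading eigenvector.

```python
import numpy as np

# Centered data: 2 variables (rows), 6 samples (columns).
A = np.array([[3, -4, 7,  1, -4, -3],
              [7, -6, 8, -1, -1, -7]], dtype=float)
n = A.shape[1]
row_sums = A.sum(axis=1)            # both zero for centered data

# Sample covariance matrix S = A A^T / (n - 1).
S = A @ A.T / (n - 1)

# eigh returns eigenvalues in ascending order; reverse to get largest first.
eigvals, eigvecs = np.linalg.eigh(S)
lam = eigvals[::-1]
u1 = eigvecs[:, -1]
if u1[0] < 0:                       # fix the sign for readability
    u1 = -u1

# The same variances from the singular values of A.
s = np.linalg.svd(A, compute_uv=False)
variances_from_svd = s**2 / (n - 1)

print("row sums:", row_sums)
print("S =\n", S)
print("eigenvalues of S:", lam)
print("leading eigenvector u_1:", u1)
print("fraction explained by u_1:", lam[0] / lam.sum())
```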
