Definition of matrix norms
In my previous post, I introduced several definitions of matrix norms on \(\mathbb{R}^{n \times n}\) based on the corresponding vector norms on \(\mathbb{R}^n\). The equivalence of different vector norms, together with the metrics and topologies they induce on \(\mathbb{R}^n\), carries over to \(\mathbb{R}^{n \times n}\). In this article, we show why the matrix norms defined there are valid.
In general, a matrix norm on \(\mathbb{R}^{n \times n}\) must satisfy the following four conditions:
- Positive definiteness: for all \(A \in \mathbb{R}^{n \times n}\), \(\norm{A} \geq 0\). \(\norm{A} = 0\) if and only if \(A = 0\).
- Absolute homogeneity: for all \(\alpha \in \mathbb{R}\) and \(A \in \mathbb{R}^{n \times n}\), \(\norm{\alpha A} = \abs{\alpha} \norm{A}\).
- Triangle inequality: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{A + B} \leq \norm{A} + \norm{B}\).
- Submultiplicativity: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{AB} \leq \norm{A} \norm{B}\).
We therefore need to prove the following theorem, which shows that the induced norm meets these requirements.
Theorem Let \(\norm{\cdot}\) be a norm on \(\mathbb{R}^n\). Then the function \(\zeta: \mathbb{R}^{n \times n} \rightarrow \mathbb{R}\) defined by
\[
\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A \vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \norm{\vect{x}}=1} \norm{A \vect{x}}
\]
is a matrix norm, called the matrix norm induced by \(\norm{\cdot}\).
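Before the proof, here is a minimal NumPy sketch (an illustration, not part of the argument): sampling nonzero vectors only produces a lower bound on the supremum, but for the 2-norm it converges toward the value NumPy reports. The matrix \(A\) below is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, -2.0],
              [3.0,  4.0]])    # arbitrary example matrix

# Sample random nonzero vectors and take the best ratio ||Ax|| / ||x||.
xs = rng.standard_normal((100_000, 2))
ratios = np.linalg.norm(xs @ A.T, axis=1) / np.linalg.norm(xs, axis=1)

print(ratios.max())            # sampled lower bound on zeta(A)
print(np.linalg.norm(A, 2))    # NumPy's induced 2-norm (largest singular value)
```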
Proof a) Positive definiteness and absolute homogeneity follow directly from the corresponding properties of the vector norm.
b) The triangle inequality can be proved as follows.
\[
\begin{aligned}
\zeta(A + B) &= \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{(A + B) \vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x} + B\vect{x}}}{\norm{\vect{x}}} \\
& \leq \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}} + \norm{B\vect{x}}}{\norm{\vect{x}}} \leq \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} + \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} \\
&= \zeta(A) + \zeta(B).
\end{aligned}
\]
c) For submultiplicativity: if \(B = 0\), then \(AB = 0\) and the inequality is trivial, so assume \(B \neq 0\). Any \(\vect{x} \neq 0\) with \(B\vect{x} = 0\) contributes \(\frac{\norm{AB\vect{x}}}{\norm{\vect{x}}} = 0\) and does not affect the supremum, so we may restrict to \(B\vect{x} \neq 0\) and write
\[
\begin{aligned}
\zeta(AB) &= \sup_{\vect{x} \neq 0, B\vect{x} \neq 0} \frac{\norm{AB\vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \neq 0, B\vect{x} \neq 0} \frac{\norm{AB\vect{x}}}{\norm{B\vect{x}}} \cdot \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} \\
&\leq \sup_{\vect{y} \in \mathbb{R}^n, \vect{y} \neq 0} \frac{\norm{A\vect{y}}}{\norm{\vect{y}}} \cdot \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} = \zeta(A) \cdot \zeta(B).
\end{aligned}
\]
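Both inequalities are easy to spot-check numerically. A small NumPy sketch for the induced 2-norm, with \(A\) and \(B\) arbitrary example matrices (an illustration, not a proof):

```python
import numpy as np

A = np.array([[1.0,  2.0], [0.0, -1.0]])   # arbitrary examples
B = np.array([[3.0, -1.0], [2.0,  5.0]])

def zeta(M):
    # Induced 2-norm: the largest singular value of M.
    return np.linalg.norm(M, 2)

# Triangle inequality: zeta(A + B) <= zeta(A) + zeta(B)
assert zeta(A + B) <= zeta(A) + zeta(B) + 1e-12

# Submultiplicativity: zeta(AB) <= zeta(A) * zeta(B)
assert zeta(A @ B) <= zeta(A) * zeta(B) + 1e-12
```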
d) Finally, we prove \(\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \norm{\vect{x}} = 1} \norm{A\vect{x}}\).
Note that \(\frac{1}{\norm{\vect{x}}}\) is a positive scalar, so by the absolute homogeneity of the vector norm we have
\[
\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \left\Vert A \cdot \frac{\vect{x}}{\norm{\vect{x}}} \right\Vert.
\]
Letting \(\vect{x}' = \frac{\vect{x}}{\norm{\vect{x}}}\), every such \(\vect{x}'\) is a unit vector, and conversely every unit vector arises this way; hence the two suprema coincide and this part is proved.
Summarizing a) to d), \(\zeta\) is indeed a matrix norm, induced by the corresponding vector norm.
Next, we verify the explicit formulas for the induced matrix norms, i.e.
- 1-norm: \(\norm{A}_1 = \max_{1 \leq j \leq n} \sum_{i=1}^n \abs{a_{ij}}\), which is the maximum column sum;
- 2-norm: \(\norm{A}_2 = \sqrt{\rho(A^T A)}\), where \(\rho\) denotes the spectral radius; since all eigenvalues of \(A^TA\) are real and nonnegative, this is simply the largest eigenvalue of \(A^TA\);
- \(\infty\)-norm: \(\norm{A}_{\infty} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\), which is the maximum row sum.
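These closed forms can be cross-checked against NumPy, whose matrix norms follow the same conventions. A quick sanity check on an arbitrary example matrix (not a proof):

```python
import numpy as np

A = np.array([[ 1.0, -2.0,  0.5],
              [ 3.0,  4.0, -1.0],
              [-0.5,  2.0,  6.0]])   # arbitrary example matrix

max_col_sum = np.abs(A).sum(axis=0).max()                 # 1-norm formula
spectral    = np.sqrt(np.linalg.eigvalsh(A.T @ A).max())  # 2-norm formula
max_row_sum = np.abs(A).sum(axis=1).max()                 # inf-norm formula

assert np.isclose(max_col_sum, np.linalg.norm(A, 1))
assert np.isclose(spectral,    np.linalg.norm(A, 2))
assert np.isclose(max_row_sum, np.linalg.norm(A, np.inf))
```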
a) 1-norm: Because
\[
\begin{aligned}
\norm{A\vect{x}}_1 &= \sum_{i=1}^n \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \leq \sum_{i=1}^n \sum_{j=1}^n \abs{a_{ij} x_j} = \sum_{j=1}^n \left( \abs{x_j} \sum_{i=1}^n \abs{a_{ij}} \right) \\
&\leq \left( \sum_{j=1}^n \abs{x_j} \right) \cdot \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right),
\end{aligned}
\]
we have \(\frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} \leq \max_{1 \leq j \leq n} \sum_{i=1}^n \abs{a_{ij}}\) for every \(\vect{x} \neq 0\), and therefore
\[
\norm{A}_1 = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} \leq \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right).
\]
Then, we need to show that this upper bound is attained.
Assume the maximum of \(\sum_{i=1}^n \abs{a_{ij}}\) is achieved at \(j = j_0\). If this value is zero, then \(A\) is the zero matrix and the formula holds trivially. If it is not zero, let \(\vect{x} = (\delta_{ij_0})_{i=1}^n\), i.e. the \(j_0\)-th standard basis vector, with \(\delta_{ij_0}\) being the Kronecker delta; then we have
\[
\frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} = \frac{\sum_{i=1}^n \abs{a_{ij_0}}}{1} = \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right).
\]
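A one-line check of this attainment argument in NumPy (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1.0, -4.0],
              [2.0,  3.0]])          # arbitrary example matrix

col_sums = np.abs(A).sum(axis=0)
j0 = col_sums.argmax()
x = np.zeros(A.shape[1])
x[j0] = 1.0                          # x = e_{j0}, the Kronecker-delta vector

# The ratio ||Ax||_1 / ||x||_1 attains the maximum column sum exactly.
assert np.isclose(np.abs(A @ x).sum() / np.abs(x).sum(), col_sums.max())
```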
b) 2-norm: This part relies on the inner product \(\langle \cdot, \cdot \rangle\) on \(\mathbb{R}^n\), which induces the vector 2-norm via \(\norm{\vect{x}}_2 = \sqrt{\langle \vect{x}, \vect{x} \rangle}\). Then we have
\[
\norm{A}_2 = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A\vect{x}, A\vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A^*A\vect{x}, \vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}},
\]
where \(A^*\) is the adjoint of \(A\), i.e. its transpose \(A^T\). Hence \(A^*A\) is a real symmetric matrix; it is also positive semi-definite, since \(\langle A^*A\vect{x}, \vect{x} \rangle = \norm{A\vect{x}}_2^2 \geq 0\). Therefore it has \(n\) real eigenvalues \(\{\lambda_i\}_{i=1}^n\) with \(0 \leq \lambda_1 \leq \cdots \leq \lambda_n\) (possibly with repetitions) and \(n\) corresponding orthonormal eigenvectors \(\{\vect{v}_i\}_{i=1}^n\). Every \(\vect{x} \in \mathbb{R}^n\) can be expanded as \(\vect{x} = \sum_{i=1}^n a_i \vect{v}_i\), so \(A^*A\vect{x} = \sum_{i=1}^n a_i A^*A \vect{v}_i = \sum_{i=1}^n \lambda_i a_i \vect{v}_i\). Then we have
\[
\begin{aligned}
\langle A^*A\vect{x}, \vect{x} \rangle &= \left\langle \sum_{i=1}^n \lambda_i a_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle \\
&= \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \delta_{ij} = \sum_{i=1}^n \lambda_i a_i^2.
\end{aligned}
\]
Meanwhile,
\[
\langle \vect{x}, \vect{x} \rangle = \left\langle \sum_{i=1}^n a_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle = \sum_{i=1}^n a_i^2.
\]
Therefore,
\[
\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sqrt{\frac{\sum_{i=1}^n \lambda_i a_i^2}{\sum_{i=1}^n a_i^2}} \leq \sqrt{\frac{\lambda_n \sum_{i=1}^n a_i^2}{\sum_{i=1}^n a_i^2}} = \sqrt{\lambda_n}.
\]
By letting \(a_1 = a_2 = \cdots = a_{n-1} = 0\) and \(a_n = 1\), i.e. \(\vect{x} = \vect{v}_n\), we get \(\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sqrt{\lambda_n}\). Hence,
\[
\norm{A}_2 = \sqrt{\lambda_n} = \sqrt{\rho(A^*A)}
\]
and the definition of matrix 2-norm is valid.
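The attainment at \(\vect{x} = \vect{v}_n\) is easy to confirm numerically (arbitrary example matrix assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])           # arbitrary example matrix

lam, V = np.linalg.eigh(A.T @ A)     # eigenvalues in ascending order
v_n = V[:, -1]                       # orthonormal eigenvector for lambda_n

# x = v_n attains the bound: ||A v_n||_2 / ||v_n||_2 = sqrt(lambda_n).
ratio = np.linalg.norm(A @ v_n) / np.linalg.norm(v_n)
assert np.isclose(ratio, np.sqrt(lam[-1]))
assert np.isclose(ratio, np.linalg.norm(A, 2))
```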
c) \(\infty\)-norm:
\[
\begin{aligned}
\norm{A\vect{x}}_{\infty} &= \max_{1 \leq i \leq n} \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \leq \max_{1 \leq i \leq n} \left( \sum_{j=1}^n \abs{a_{ij}} \cdot \abs{x_j} \right) \\
&\leq \max_{1 \leq i \leq n} \left( \left( \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right) \right) = \left( \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right), \\
\norm{\vect{x}}_{\infty} &= \max_{1 \leq j \leq n} \abs{x_j}.
\end{aligned}
\]
Therefore, \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} \leq \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). Then, we need to prove this maximum value is achievable.
Assume the maximum of \(\sum_{j=1}^n \abs{a_{ij}}\) is achieved at \(i = i_0\). If this value is zero, \(A\) is the zero matrix and the formula holds trivially. If it is not zero, let \(\vect{x} = (\sgn(a_{i_0 1}), \cdots, \sgn(a_{i_0 n}))^{\rm T}\). Then \(\norm{\vect{x}}_{\infty} = 1\) (row \(i_0\) has at least one nonzero entry), and the \(i_0\)-th component of \(A\vect{x}\) is \(\sum_{j=1}^n a_{i_0 j} \sgn(a_{i_0 j}) = \sum_{j=1}^n \abs{a_{i_0 j}}\), so \(\norm{A\vect{x}}_{\infty} = \sum_{j=1}^n \abs{a_{i_0 j}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). Hence \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\) and the formula for the \(\infty\)-norm is valid.
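The sign-vector construction can likewise be verified in NumPy (arbitrary example matrix):

```python
import numpy as np

A = np.array([[1.0, -4.0],
              [2.0,  3.0]])          # arbitrary example matrix

row_sums = np.abs(A).sum(axis=1)
i0 = row_sums.argmax()
x = np.sign(A[i0])                   # the sign vector from the proof

# ||x||_inf = 1 and row i0 of Ax equals the maximum row sum.
assert np.isclose(np.abs(x).max(), 1.0)
assert np.isclose(np.abs(A @ x).max(), row_sums.max())
```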