Speex is based on CELP, which stands for Code Excited Linear Prediction. This section attempts to introduce the principles behind CELP, so if you are already familiar with CELP, you can safely skip to section 7. The CELP technique is based on three ideas:

  1. The use of a linear prediction (LP) model to model the vocal tract
  2. The use of (adaptive and fixed) codebook entries as input (excitation) of the LP model
  3. The search performed in closed-loop in a ``perceptually weighted domain''

This section describes the basic ideas behind CELP. Note that it's still incomplete.

Linear Prediction (LPC)

Linear prediction is at the base of many speech coding techniques, including CELP. The idea behind it is to predict the signal  using a linear combination of its past samples:

where  is the linear prediction of . The prediction error is thus given by:

The goal of the LPC analysis is to find the best prediction coefficients  which minimize the quadratic error function:

That can be done by making all derivatives  equal to zero:

The  filter coefficients are computed using the Levinson-Durbin algorithm, which starts from the auto-correlation  of the signal .

For an order  filter, we have:

The filter coefficients  are found by solving the system . What the Levinson-Durbin algorithm does here is making the solution to the problem instead of  by exploiting the fact that matrix  is toeplitz hermitian. Also, it can be proven that all the roots of  are within the unit circle, which means that  is always stable. This is in theory; in practice because of finite precision, there are two commonly used techniques to make sure we have a stable filter. First, we multiply  by a number slightly above one (such as 1.0001), which is equivalent to adding noise to the signal. Also, we can apply a window to the auto-correlation, which is equivalent to filtering in the frequency domain, reducing sharp resonances.

The linear prediction model represents each speech sample as a linear combination of past samples, plus an error signal called the excitation (or residual).

In the z-domain, this can be expressed as

where  is defined as

We usually refer to  as the analysis filter and  as the synthesis filter. The whole process is called short-term prediction as it predicts the signal using a prediction using only the  past samples, where  is usually around 10.

Because LPC coefficients have very little robustness to quantization, they are converted to Line Spectral Pair (LSP) coefficients which have a much better behaviour with quantization, one of them being that it's easy to keep the filter stable.

Pitch Prediction

During voiced segments, the speech signal is periodic, so it is possible to take advantage of that property by approximating the excitation signal  by a gain times the past of the excitation:

where  is the pitch period,  is the pitch gain. We call that long-term prediction since the excitation is predicted from  with .

Innovation Codebook

The final excitation  will be the sum of the pitch prediction and an innovation signal  taken from a fixed codebook, hence the name Code Excited Linear Prediction. The final excitation is given by:

The quantization of  is where most of the bits in a CELP codec are allocated. It represents the information that couldn't be obtained either from linear prediction or pitch prediction. In the z-domain we can represent the final signal  as

Analysis-by-Synthesis and Error Weighting

Most (if not all) modern audio codecs attempt to ``shape'' the noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant to noise in parts of the spectrum that are louder and vice versa. That's why instead of minimizing the simple quadratic error

where  is the encoder signal, we minimize the error for the perceptually weighted signal

where  is the weighting filter, usually of the form

(1)

with control parameters . If the noise is white in the perceptually weighted domain, then in the signal domain its spectral shape will be of the form

If a filter  has (complex) poles at  in the -plane, the filter  will have its poles at , making it a flatter version of .

Analysis-by-synthesis refers to the fact that when trying to find the best pitch parameters () and innovation signal , we do not work by making the excitation  as close as the original one (which would be simpler), but apply the synthesis (and weighting) filter and try making  as close to the original as possible.

参考资料:

1 百科总结: https://zh.wikipedia.org/wiki/%E7%A0%81%E6%BF%80%E5%8A%B1%E7%BA%BF%E6%80%A7%E9%A2%84%E6%B5%8B
2 详细介绍: http://ntools.net/arc/Documents/speex/manual/node8.html

Introduction to CELP Coding的更多相关文章

  1. Spark 大数据平台 Introduction part 2 coding

    Basic Functions sc.parallelize(List(1,2,3,4,5,6)).map(_ * 2).filter(_ > 5).collect() *** res: Arr ...

  2. 算术编码Arithmetic Coding-高质量代码实现详解

    关于算术编码的具体讲解我不多细说,本文按照下述三个部分构成. 两个例子分别说明怎么用算数编码进行编码以及解码(来源:ARITHMETIC CODING FOR DATA COIUPRESSION): ...

  3. Zen Coding in Visual Studio 2012

    http://www.johnpapa.net/zen-coding-in-visual-studio-2012 Zen Coding is a faster way to write HTML us ...

  4. Introduction to ASP.NET Web Programming Using the Razor Syntax (C#)

    1, http://www.asp.net/web-pages/overview/getting-started/introducing-razor-syntax-c 2, Introduction ...

  5. Top 10 Algorithms for Coding Interview--reference

    By X Wang Update History:Web Version latest update: 4/6/2014PDF Version latest update: 1/16/2014 The ...

  6. 转:Top 10 Algorithms for Coding Interview

    The following are top 10 algorithms related concepts in coding interview. I will try to illustrate t ...

  7. Github Coding Developer Book For LiuGuiLinAndroid

    Github Coding Developer Book For LiuGuiLinAndroid 收集了这么多开源的PDF,也许会帮到一些人,现在里面的书籍还不是很多,我也在一点点的上传,才上传不到 ...

  8. 使用Travis CI自动部署博客到github pages和coding pages

    每次换系统或换电脑之后重新部署博客总是很苦恼?想像jekyll那样,一次性部署完成后,以后本地不用安装环境直接 git push 就能生成博客?那推荐你应该使用使用 Travis CI了. 这篇文章我 ...

  9. Introduction to Parallel Computing

    Copied From:https://computing.llnl.gov/tutorials/parallel_comp/ Author: Blaise Barney, Lawrence Live ...

随机推荐

  1. 微信小程序---setData

    data:{ obj:{ name:'hello' } } 对data中obj的name字段进行重新赋值. onLoad: function (option) { var value = 'obj.n ...

  2. Cobbler安装CentOS7系统时报错 curl:(7)Failed connect to 10.0.0.201:80;Connection refused

    问题原因: 其他涉及到http服务的端口全部都改成了81端口.只有 /etc/cobbler/settings 这里没有改. [root@mage-monitor- ~/]#grep -E " ...

  3. Codeforces Round #499 (Div. 2)

    Codeforces Round #499 (Div. 2) https://codeforces.com/contest/1011 A #include <bits/stdc++.h> ...

  4. 567. Permutation in String判断某字符串中是否存在另一个字符串的Permutation

    [抄题]: Given two strings s1 and s2, write a function to return true if s2 contains the permutation of ...

  5. node.js中express框架的基本使用

    express是一个基于node.js平台的,快速,开放,极简的web开发框架. 一.安装 express npm install express --save 二.简单使用 express //引入 ...

  6. JavaSE基础知识(4)—数组的应用

    一.数组的特点.好处及使用步骤 1.数组的好处 特点:相当于用于保存一组元素的容器好处: 1.提高代码的简洁性和扩展性,且同时开辟多个空间,提高了效率 2.分类存储,且空间是连续的,容易查找 2.数组 ...

  7. Linux驱动之一个简单的输入子系统程序编写

    的在Linux驱动之输入子系统简析已经分析过了输入子系统的构成,它是由设备层.核心层.事件层共同组成的.其中核心层提供一些设备层与事件层公用的函数,比如说注册函数.反注册函数.事件到来的处理函数等等: ...

  8. C++学习札记(1)

    指针 按别名传递 下面是个例子: #include <iostream> using namespace std; void swap(int &a,int &b) { i ...

  9. 201771010142 张燕《面向对象程序设计(java)》第一周学习总结

    201771010142 张燕<面向对象程序设计(java)>第一周学习总结 第一部分:课程准备部分 填写课程学习 平台注册账号, 平台名称 注册账号 博客园:www.cnblogs.co ...

  10. 移动端web兼容各种分辨率写法

    移动端web开发最好用rem单位,再设置以下js,以iphone6 750*1334为基准 <script> var size = document.documentElement.cli ...