Using the G711 standard

Marc Sweetgall,                          28 Jul 2006

Implementing the G711 µ-law and a-law codecs in C#.

Introduction

I have been working on a VoIP application and wanted to implement the G.711 specification, which I found out has two variants: A-law and µ-law. The latter is also referred to as "mu-law" and "u-law" in addition to µ-law (ALT-0181, by the way), which makes searching harder, and the documentation for both is quite poor. Without an ITU login, you can't see the actual standard, and few places online go into exactly what happens in these encodings. Wikipedia just throws a couple of equations around, but is not terribly helpful. Eventually, I did find various implementations, only one of which was reasonably commented. Of course, that one was the one with the error in it...

So here, in my first CodeProject article, I will explain thoroughly the implementation of G.711 in both of its forms. The code is in C#, but it is simple enough to be ported to, say, Java.

Note that in most contexts, I will use mu instead of µ. Only a few comments and variable names use µ; it's just too strange to think of as a normal character. And although I know the alt-code by heart, it still takes much longer to type.

The Code

The project is arranged into five C# files. One is Program.cs, which is not really important until later, and even then, it isn't so terribly important. The other four are static classes, MuLawEncoder, MuLawDecoder, ALawEncoder, and ALawDecoder, which all do exactly as their names imply.

The static constructors do all of the real work, and store the results in a table. When the Encode or Decode methods are called, they look in the table. At the cost of only a little memory (64 KB per encoder, 0.5KB per decoder), it is definitely worth it.

µ-Law Encoding

The MuLawEncoder class handles the µ-law encoding (surprise!) by looking up values in its own private byte array pcmToMuLawMap. If the index is the unsigned 16-bit PCM value, the value is the unsigned 8-bit µ-law byte.

public const int BIAS = 0x84; //132, or 1000 0100
public const int MAX = 32635; //32767 (max 15-bit integer) minus BIAS

These are the constants used by the encoder. Both will come up later.

static MuLawEncoder()
{
pcmToMuLawMap = new byte[65536];
for (int i = short.MinValue; i <= short.MaxValue; i++)
pcmToMuLawMap[(i & 0xffff)] = encode(i);
}

The static constructor fills the table. It allocates an array of 65536 bytes, and then loops i from -32768 to 32767 (short.MinValue to short.MaxValue). To keep the array index from being negative, i is ANDed with 0xffff, which maps the negative values onto the indices 32768 through 65535.
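To see what the mask does to a few sample values, here is a small sketch. The class and method names (IndexDemo, ToIndex) are made up for illustration and are not part of the download.

```csharp
using System;

static class IndexDemo
{
    // Maps a signed 16-bit sample onto a table index in [0, 65535].
    // (Hypothetical helper; the article does the AND inline.)
    public static int ToIndex(short pcm)
    {
        return pcm & 0xffff;
    }

    static void Main()
    {
        Console.WriteLine(ToIndex(0));              // 0
        Console.WriteLine(ToIndex(short.MaxValue)); // 32767
        Console.WriteLine(ToIndex(short.MinValue)); // 32768
        Console.WriteLine(ToIndex(-1));             // 65535
    }
}
```

Non-negative samples keep their value as the index, and negative samples land in the upper half of the table.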

The method encode(int pcm) does exactly what you would expect. However, it is private, because the only time it will ever be used is by the static constructor.

private static byte encode(int pcm) //16-bit
{
    //Get the sign bit. Shift it for later
    //use without further modification
    int sign = (pcm & 0x8000) >> 8;
    //If the number is negative, make it
    //positive (now it's a magnitude)
    if (sign != 0)
        pcm = -pcm;
    //The magnitude must be less than 32635 to avoid overflow
    if (pcm > MAX) pcm = MAX;
    //Add 132 to guarantee a 1 in
    //the eight bits after the sign bit
    pcm += BIAS;

    /* Finding the "exponent"
     * Bits:
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * S 7 6 5 4 3 2 1 0 . . . . . . .
     * We want to find where the first 1 after the sign bit is.
     * We take the corresponding value from
     * the second row as the exponent value.
     * (i.e. if first 1 at position 7 -> exponent = 2) */
    int exponent = 7;
    //Move to the right and decrement exponent until we hit the 1
    for (int expMask = 0x4000; (pcm & expMask) == 0;
         exponent--, expMask >>= 1) { }

    /* The last part - the "mantissa"
     * We need to take the four bits after the 1 we just found.
     * To get it, we shift 0x0f :
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * S 0 0 0 0 0 1 . . . . . . . . . (meaning exponent is 2)
     * . . . . . . . . . . . . 1 1 1 1
     * We shift it 5 times for an exponent of two, meaning
     * we will shift our four bits (exponent + 3) bits.
     * For convenience, we will actually just shift
     * the number, then AND with 0x0f. */
    int mantissa = (pcm >> (exponent + 3)) & 0x0f;

    //The mu-law byte bit arrangement
    //is SEEEMMMM (Sign, Exponent, and Mantissa.)
    byte mulaw = (byte)(sign | exponent << 4 | mantissa);

    //Last is to flip the bits
    return (byte)~mulaw;
}

The comments say everything that needs to be said.
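For readers who prefer a concrete number, here is a worked example that runs the same steps on the sample 1000. It is only a sketch: the class name MuLawTrace and the out parameters are additions for illustration, and the body is a condensed copy of encode() above.

```csharp
using System;

static class MuLawTrace
{
    const int BIAS = 0x84;
    const int MAX = 32635;

    // Same steps as encode() above, but the exponent and mantissa are
    // handed back so they can be inspected.
    public static byte Encode(int pcm, out int exponent, out int mantissa)
    {
        int sign = (pcm & 0x8000) >> 8;
        if (sign != 0) pcm = -pcm;
        if (pcm > MAX) pcm = MAX;
        pcm += BIAS;
        exponent = 7;
        for (int expMask = 0x4000; (pcm & expMask) == 0; exponent--, expMask >>= 1) { }
        mantissa = (pcm >> (exponent + 3)) & 0x0f;
        return (byte)~(sign | exponent << 4 | mantissa);
    }

    static void Main()
    {
        int e, m;
        byte b = Encode(1000, out e, out m);
        // 1000 + 132 = 1132 = 0000 0100 0110 1100
        // First 1 after the sign bit is at position 6 -> exponent 3.
        // The four bits after it are 0001 -> mantissa 1.
        // SEEEMMMM = 0011 0001 = 0x31, inverted = 0xCE.
        Console.WriteLine("exponent={0} mantissa={1} byte=0x{2:X2}", e, m, b);
        // prints: exponent=3 mantissa=1 byte=0xCE
    }
}
```

The same call with -1000 differs only in the sign bit and gives 0x4E, and a sample of 0 gives 0xFF.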

The public methods all access the table, rather than do reprocessing. They are all called MuLawEncode, with different arguments. It is redundant, but so be it. The Encode overloads:

public static byte MuLawEncode(int pcm) { /*...*/ }
public static byte MuLawEncode(short pcm) { /*...*/ }
public static byte[] MuLawEncode(int[] data) { /*...*/ }
public static byte[] MuLawEncode(short[] data) { /*...*/ }
public static void MuLawEncode(byte[] data, byte[] target)
{ /*Suggested by Nathan Allan*/ }
public static byte[] MuLawEncode(byte[] data)
{
    int size = data.Length / 2;
    byte[] encoded = new byte[size];
    for (int i = 0; i < size; i++)
        encoded[i] = MuLawEncode((data[2 * i + 1] << 8) | data[2 * i]);
    return encoded;
}

The last takes an array of bytes in Little-Endian order. Thus, it is special, and gets to be displayed.
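As a small illustration of the byte order, the sketch below combines one little-endian byte pair the same way the loop does. The names LittleEndianDemo and Combine are hypothetical.

```csharp
using System;

static class LittleEndianDemo
{
    // Combines two bytes (low byte first in the stream) into one
    // unsigned 16-bit value, as in the byte[] overload above.
    public static int Combine(byte low, byte high)
    {
        return (high << 8) | low;
    }

    static void Main()
    {
        byte[] data = { 0x18, 0xFC };              // one sample, low byte first
        int index = Combine(data[0], data[1]);     // 0xFC18
        short sample = unchecked((short)index);    // reinterpret as signed
        Console.WriteLine("{0} {1}", index, sample); // prints: 64536 -1000
    }
}
```

The combined value is already the unsigned table index, so no further masking is needed before the lookup.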

The last thing in the MuLawEncoder class is the ZeroTrap. Apparently, it is not so great of a thing to send an all-zero µ-law byte, so when the trap is enabled, an all-zero µ-law byte is replaced by 0x02 instead. By default, this trap is disabled.

Normally, the zero trap setting would be stored in a boolean field, but here, there is no need. The unsigned PCM value 33000 maps to 0x00 when the trap is off, so reading that one table entry tells us the state: if it is 0x00, the zero trap is off. See:

public static bool ZeroTrap
{
    get { return (pcmToMuLawMap[33000] != 0); }
    set
    {
        byte val = (byte)(value ? 2 : 0);
        for (int i = 32768; i <= 33924; i++)
            pcmToMuLawMap[i] = val;
    }
}

When the zero trap is assigned, the program will go through the table and assign either 0x00 or 0x02 to all of the places that map to 0x00 normally. These are the values in the range [32768, 33924].
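The range can be checked by brute force. The sketch below encodes every 16-bit sample and records the lowest and highest table index that produces 0x00. ZeroTrapRange and Find are names invented for this example, and Encode is a condensed copy of the encoder shown earlier.

```csharp
using System;

static class ZeroTrapRange
{
    const int BIAS = 0x84;
    const int MAX = 32635;

    // Condensed copy of the article's encode() so the sketch is self-contained.
    public static byte Encode(int pcm)
    {
        int sign = (pcm & 0x8000) >> 8;
        if (sign != 0) pcm = -pcm;
        if (pcm > MAX) pcm = MAX;
        pcm += BIAS;
        int exponent = 7;
        for (int expMask = 0x4000; (pcm & expMask) == 0; exponent--, expMask >>= 1) { }
        int mantissa = (pcm >> (exponent + 3)) & 0x0f;
        return (byte)~(sign | exponent << 4 | mantissa);
    }

    // Finds the lowest and highest table index whose encoded byte is 0x00.
    public static void Find(out int first, out int last)
    {
        first = -1;
        last = -1;
        for (int i = short.MinValue; i <= short.MaxValue; i++)
        {
            if (Encode(i) != 0) continue;
            int index = i & 0xffff;
            if (first < 0 || index < first) first = index;
            if (index > last) last = index;
        }
    }

    static void Main()
    {
        int first, last;
        Find(out first, out last);
        Console.WriteLine("{0}..{1}", first, last); // prints: 32768..33924
    }
}
```

These indices correspond to the samples -32768 through -31612, the largest negative magnitudes, which is where an inverted 0xFF (sign set, exponent 7, mantissa 15) comes from.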

µ-Law Decoding

In yet another major surprise, this is done in the MuLawDecoder class. This uses the same table lookup technique as the above, but has an array of type short, since the values are 16-bit signed PCM values.

static MuLawDecoder()
{
    muLawToPcmMap = new short[256];
    //Use an int counter so that entry 255 is filled as well
    //(a byte counter can never exceed byte.MaxValue)
    for (int i = 0; i <= byte.MaxValue; i++)
        muLawToPcmMap[i] = decode((byte)i);
}

private static short decode(byte mulaw)
{
    //Flip all the bits
    mulaw = (byte)~mulaw;

    //Pull out the value of the sign bit
    int sign = mulaw & 0x80;
    //Pull out and shift over the value of the exponent
    int exponent = (mulaw & 0x70) >> 4;
    //Pull out the four bits of data
    int data = mulaw & 0x0f;

    //Add on the implicit fifth bit (we know
    //the four data bits followed a one bit)
    data |= 0x10;
    /* Add a 1 to the end of the data by
     * shifting over and adding one. Why?
     * Mu-law is not a one-to-one function.
     * There is a range of values that all
     * map to the same mu-law byte.
     * Adding a one to the end essentially adds a
     * "half byte", which means that
     * the decoding will return the value in the
     * middle of that range. Otherwise, the mu-law
     * decoding would always be
     * less than the original data. */
    data <<= 1;
    data += 1;
    /* Shift the five bits to where they need
     * to be: left (exponent + 2) places
     * Why (exponent + 2) ?
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * . 7 6 5 4 3 2 1 0 . . . . . . . <-- starting bit (based on exponent)
     * . . . . . . . . . . 1 x x x x 1 <-- our data
     * We need to move the one under the value of the exponent,
     * which means it must move (exponent + 2) times
     */
    data <<= exponent + 2;
    //Remember, we added to the original,
    //so we need to subtract from the final
    data -= MuLawEncoder.BIAS;
    //If the sign bit is 0, the number
    //is positive. Otherwise, negative.
    return (short)(sign == 0 ? data : -data);
}

Again, the comments explain the magic.
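As a worked example, the byte 0xCE is what the µ-law encoder produces for the sample 1000. Decoding it with the same steps gives 988, the middle of the range of samples that share that byte. The class name MuLawDecodeDemo is made up for this sketch, and the bit-twiddling is condensed from decode() above.

```csharp
using System;

static class MuLawDecodeDemo
{
    const int BIAS = 0x84;

    // Condensed copy of decode() above.
    public static short Decode(byte mulaw)
    {
        int v = ~mulaw & 0xff;                 // flip all the bits
        int sign = v & 0x80;
        int exponent = (v & 0x70) >> 4;
        int data = (v & 0x0f) | 0x10;          // implicit leading 1
        data = (data << 1) | 1;                // trailing 1: middle of the range
        data <<= exponent + 2;
        data -= BIAS;
        return (short)(sign == 0 ? data : -data);
    }

    static void Main()
    {
        // ~0xCE = 0x31: sign 0, exponent 3, mantissa 1
        // 1 0001 1 = 35, 35 << 5 = 1120, 1120 - 132 = 988
        Console.WriteLine(Decode(0xCE)); // prints: 988
        Console.WriteLine(Decode(0x4E)); // prints: -988
        Console.WriteLine(Decode(0xFF)); // prints: 0
    }
}
```

Every sample from 956 through 1019 encodes to 0xCE, so 988 is the most that can be recovered for any of them.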

And again, the main function is overloaded:

public static short MuLawDecode(byte mulaw) { /*...*/ }
public static short[] MuLawDecode(byte[] data) { /*...*/ }
public static void MuLawDecode(byte[] data, out short[] decoded) { /*...*/ }
public static void MuLawDecode(byte[] data, out byte[] decoded)
{
int size = data.Length;
decoded = new byte[size * 2];
for (int i = 0; i < size; i++)
{
//First byte is the less significant byte
decoded[2 * i] = (byte)(muLawToPcmMap[data[i]] & 0xff);
//Second byte is the more significant byte
decoded[2 * i + 1] = (byte)(muLawToPcmMap[data[i]] >> 8);
}
}

The out parameters are used because C# cannot overload on return type alone, so otherwise there would be no way to distinguish the two MuLawDecode functions that both take a byte array. And again, the Little-Endian byte order is displayed.
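The two casts in that loop can be tried on a single sample. The sketch below splits -1000 into its low and high bytes; SplitDemo and ToLittleEndian are hypothetical names used only here.

```csharp
using System;

static class SplitDemo
{
    // Splits one signed 16-bit sample into two bytes, low byte first,
    // using the same expressions as the byte[] decoder overload.
    public static byte[] ToLittleEndian(short sample)
    {
        return new byte[] { (byte)(sample & 0xff), (byte)(sample >> 8) };
    }

    static void Main()
    {
        byte[] bytes = ToLittleEndian(-1000);   // -1000 is 0xFC18
        Console.WriteLine("{0:X2} {1:X2}", bytes[0], bytes[1]); // prints: 18 FC
    }
}
```

The right shift of a negative short is arithmetic, but the cast to byte keeps only the low eight bits, so the high byte comes out correctly for negative samples too.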

A-law Encoding

A-law is even worse documented than µ-law. The implementations are even worse in terms of commenting, so this took a bit longer to figure out. In the end, it is indeed similar to µ-law's implementation, despite how different the A-law C code looks from the µ-law C code.

There is no zero trap, and the ALawEncode overloads are identical to the MuLawEncode overloads, so the only differences are the encode(int pcm) method and the value of MAX, which is 0x7fff instead of (0x7fff-0x84) as in MuLawEncoder.

private static byte encode(int pcm)
{
    //Get the sign bit. Shift it for later use
    //without further modification
    int sign = (pcm & 0x8000) >> 8;
    //If the number is negative,
    //make it positive (now it's a magnitude)
    if (sign != 0)
        pcm = -pcm;
    //The magnitude must fit in 15 bits to avoid overflow
    if (pcm > MAX) pcm = MAX;

    /* Finding the "exponent"
     * Bits:
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * S 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0
     * We want to find where the first 1 after the sign bit is.
     * We take the corresponding value
     * from the second row as the exponent value.
     * (i.e. if first 1 at position 7 -> exponent = 2)
     * The exponent is 0 if the 1 is not found in bits 2 through 8.
     * This means the exponent is 0 even if the "first 1" doesn't exist.
     */
    int exponent = 7;
    //Move to the right and decrement exponent
    //until we hit the 1 or the exponent hits 0
    for (int expMask = 0x4000; (pcm & expMask) == 0
         && exponent > 0; exponent--, expMask >>= 1) { }

    /* The last part - the "mantissa"
     * We need to take the four bits after the 1 we just found.
     * To get it, we shift 0x0f :
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * S 0 0 0 0 0 1 . . . . . . . . . (say that exponent is 2)
     * . . . . . . . . . . . . 1 1 1 1
     * We shift it 5 times for an exponent of two, meaning
     * we will shift our four bits (exponent + 3) bits.
     * For convenience, we will actually just
     * shift the number, then AND with 0x0f.
     *
     * NOTE: If the exponent is 0:
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * S 0 0 0 0 0 0 0 Z Y X W V U T S (we know nothing about bit 9)
     * . . . . . . . . . . . . 1 1 1 1
     * We want to get ZYXW, which means a shift of 4 instead of 3
     */
    int mantissa = (pcm >> ((exponent == 0) ? 4 : (exponent + 3))) & 0x0f;

    //The a-law byte bit arrangement is SEEEMMMM
    //(Sign, Exponent, and Mantissa.)
    byte alaw = (byte)(sign | exponent << 4 | mantissa);

    //Last is to flip every other bit, and the sign bit (0xD5 = 1101 0101)
    return (byte)(alaw ^ 0xD5);
}

Even this has only subtle differences. The mask, lack of bias, and the zero exponent weirdness are the key differences.
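To make the zero-exponent case concrete, the sketch below runs the A-law steps on two samples: 1000, which has a leading 1 in bits 2 through 8, and 100, which does not. ALawTrace is an invented name, and the body is a condensed copy of the encoder above.

```csharp
using System;

static class ALawTrace
{
    const int MAX = 0x7fff;

    // Same steps as the A-law encode() above, returning the
    // exponent and mantissa for inspection.
    public static byte Encode(int pcm, out int exponent, out int mantissa)
    {
        int sign = (pcm & 0x8000) >> 8;
        if (sign != 0) pcm = -pcm;
        if (pcm > MAX) pcm = MAX;
        exponent = 7;
        for (int expMask = 0x4000; (pcm & expMask) == 0 && exponent > 0;
             exponent--, expMask >>= 1) { }
        mantissa = (pcm >> ((exponent == 0) ? 4 : (exponent + 3))) & 0x0f;
        return (byte)((sign | exponent << 4 | mantissa) ^ 0xD5);
    }

    static void Main()
    {
        int e, m;
        // 1000 = 0000 0011 1110 1000: first 1 at position 7 -> exponent 2,
        // next four bits 1111 -> mantissa 15, 0x2F ^ 0xD5 = 0xFA
        byte a = Encode(1000, out e, out m);
        Console.WriteLine("exponent={0} mantissa={1} byte=0x{2:X2}", e, m, a);
        // prints: exponent=2 mantissa=15 byte=0xFA

        // 100 = 0000 0000 0110 0100: no 1 in bits 2 through 8 -> exponent 0,
        // mantissa = (100 >> 4) & 0x0f = 6, 0x06 ^ 0xD5 = 0xD3
        byte b = Encode(100, out e, out m);
        Console.WriteLine("exponent={0} mantissa={1} byte=0x{2:X2}", e, m, b);
        // prints: exponent=0 mantissa=6 byte=0xD3
    }
}
```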

A-Law Decoding

The only difference between this and MuLawDecoder is the decode(byte alaw) method.

private static short decode(byte alaw)
{
    //Invert every other bit,
    //and the sign bit (0xD5 = 1101 0101)
    alaw ^= 0xD5;

    //Pull out the value of the sign bit
    int sign = alaw & 0x80;
    //Pull out and shift over the value of the exponent
    int exponent = (alaw & 0x70) >> 4;
    //Pull out the four bits of data
    int data = alaw & 0x0f;

    //Shift the data four bits to the left
    data <<= 4;
    //Add 8 to put the result in the middle
    //of the range (like adding a half)
    data += 8;

    //If the exponent is not 0, then we know the four bits followed a 1,
    //and can thus add this implicit 1 with 0x100.
    if (exponent != 0)
        data += 0x100;
    /* Shift the bits to where they need to be: left (exponent - 1) places
     * Why (exponent - 1) ?
     * 1 2 3 4 5 6 7 8 9 A B C D E F G
     * . 7 6 5 4 3 2 1 . . . . . . . . <-- starting bit (based on exponent)
     * . . . . . . . Z x x x x 1 0 0 0 <-- our data (Z is 0 only when
     *                                     exponent is 0)
     * We need to move the one under the value of the exponent,
     * which means it must move (exponent - 1) times
     * It also means shifting is unnecessary if exponent is 0 or 1.
     */
    if (exponent > 1)
        data <<= (exponent - 1);

    return (short)(sign == 0 ? data : -data);
}

That's it for the encoders and decoders.
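A quick worked example for the A-law decoder: 0xFA and 0xD3 are the bytes the A-law encoder produces for 1000 and 100, and decoding them gives the midpoints 1008 and 104. ALawDecodeDemo is a made-up name, and the method is a condensed copy of decode() above.

```csharp
using System;

static class ALawDecodeDemo
{
    // Condensed copy of the A-law decode() above.
    public static short Decode(byte alaw)
    {
        int v = alaw ^ 0xD5;                   // undo the bit inversion
        int sign = v & 0x80;
        int exponent = (v & 0x70) >> 4;
        int data = ((v & 0x0f) << 4) + 8;      // mantissa plus half a step
        if (exponent != 0) data += 0x100;      // implicit leading 1
        if (exponent > 1) data <<= (exponent - 1);
        return (short)(sign == 0 ? data : -data);
    }

    static void Main()
    {
        // 0xFA ^ 0xD5 = 0x2F: exponent 2, mantissa 15
        // (15 << 4) + 8 + 0x100 = 504, 504 << 1 = 1008
        Console.WriteLine(Decode(0xFA)); // prints: 1008
        // 0xD3 ^ 0xD5 = 0x06: exponent 0, mantissa 6 -> (6 << 4) + 8 = 104
        Console.WriteLine(Decode(0xD3)); // prints: 104
        Console.WriteLine(Decode(0x7A)); // prints: -1008
    }
}
```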

Program.cs

The program included in the source package runs the ALawEncoder and MuLawEncoder on a random series of data, and averages the percent errors. It then displays the average errors for each codec for the full range.

Some results:

On the range [1,32767]: µ-Law: 1.14%, A-Law: 1.26%
On the range [-32767,-1]: µ-Law: 1.14%, A-Law: 1.16%
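Program.cs itself is not reproduced in this article, so the following is only a guess at the kind of measurement it makes, shown for µ-law. It differs in one stated way: it walks every sample in a range instead of drawing random ones, so its output need not match the figures above exactly. ErrorSketch and AveragePercentError are invented names, and Encode and Decode are condensed copies of the methods shown earlier.

```csharp
using System;

static class ErrorSketch
{
    const int BIAS = 0x84;
    const int MAX = 32635;

    public static byte Encode(int pcm)
    {
        int sign = (pcm & 0x8000) >> 8;
        if (sign != 0) pcm = -pcm;
        if (pcm > MAX) pcm = MAX;
        pcm += BIAS;
        int exponent = 7;
        for (int expMask = 0x4000; (pcm & expMask) == 0; exponent--, expMask >>= 1) { }
        int mantissa = (pcm >> (exponent + 3)) & 0x0f;
        return (byte)~(sign | exponent << 4 | mantissa);
    }

    public static short Decode(byte mulaw)
    {
        int v = ~mulaw & 0xff;
        int sign = v & 0x80;
        int exponent = (v & 0x70) >> 4;
        int data = ((((v & 0x0f) | 0x10) << 1) | 1) << (exponent + 2);
        data -= BIAS;
        return (short)(sign == 0 ? data : -data);
    }

    // Mean of |decoded - original| / |original| over [from, to], in percent.
    // The range must not contain 0, or the division is undefined.
    public static double AveragePercentError(int from, int to)
    {
        double total = 0;
        for (int pcm = from; pcm <= to; pcm++)
            total += 100.0 * Math.Abs(Decode(Encode(pcm)) - pcm) / Math.Abs(pcm);
        return total / (to - from + 1);
    }

    static void Main()
    {
        Console.WriteLine("mu-law, [1,32767]: {0:F2}%", AveragePercentError(1, 32767));
        Console.WriteLine("mu-law, [-32767,-1]: {0:F2}%", AveragePercentError(-32767, -1));
    }
}
```

The same loop can be pointed at the A-law pair by swapping in its encode and decode methods.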

Conclusion

A-law and µ-law are not so complicated, especially when laid out in plain sight. I hope this is useful. After all, G.711 makes 16-bit samples take up only 8 bits, for 50% compression. At 8 kHz sampling, that turns a 128 kbps stream into a 64 kbps stream.

Edited July 28th, 2006 to fix error and add new overload suggested by Nathan Allan

 

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

