Reposted from: http://www.linuxforu.com/2012/01/joy-of-programming-understanding-bit-fields-c/

By S.G. Ganesh on January 30, 2012 in Coding, Columns

One important feature that distinguishes C as a systems programming language is its support for bit-fields. Let us explore this feature in this column.

In C, structure members can be declared with a size given in bits; this feature is known as bit-fields. Bit-fields are important for low-level (i.e., systems programming) tasks such as directly accessing system resources; processing, reading and writing data as streams of bits (as when processing packets in network programming); and cryptography (encoding or decoding data with complex bit manipulation).

Consider the example of reading the components of a floating-point number. A 4-byte floating-point number in the IEEE 754 standard consists of the following:

  • The first bit is reserved for the sign bit — it is 1 if the number is negative and 0 if it is positive.
  • The next 8 bits store the exponent in unsigned (biased) form. The stored value ranges from 0 to 255; subtracting the bias of 127 gives a signed exponent ranging from -127 to +128.
  • The remaining 23 bits are used to store the mantissa.

Here is a program that breaks a floating-point number into its constituents and prints them:

#include <stdio.h>

struct FP {
    // the order of the members depends on the
    // endian scheme of the underlying machine
    unsigned int mantissa : 23;
    unsigned int exponent : 8;
    unsigned int sign : 1;
} *fp;

int main(void) {
    float f = -1.0f;
    // view the bytes of the float through the bit-field struct
    // (assumes a 32-bit IEEE 754 float)
    fp = (struct FP *)&f;

    printf("sign = %s, biased exponent = %u, mantissa = %u\n",
           fp->sign ? "negative" : "positive",
           fp->exponent, fp->mantissa);
    return 0;
}

For the floating-point number -1.0, this program prints:

sign = negative, biased exponent = 127, mantissa = 0

Since the floating-point number is negative, the sign bit is 1. Since the actual exponent is 0, it is stored in biased (unsigned) form as 127, and hence that value is printed. The mantissa in this case is 0, and hence it is printed as it is.

To understand how floating-point arithmetic works, see this Wikipedia article.

An alternative to using bit-fields is to use integers directly, and manipulate them using bitwise operators (such as &, |, ~, etc.). We could also read the components of a floating-point number with bitwise operations. However, in many cases such manipulation is a roundabout way to achieve what we need; bit-fields provide a more direct solution, and hence are a useful feature.

There are numerous limitations in using bit-fields. For example, you cannot apply operators such as & (address-of) and sizeof to bit-fields. These operators work in terms of bytes, whereas bit-fields are sized in bits. In other words, an expression such as sizeof(fp->sign) will result in a compiler error.

Another reason is that the underlying machine supports addressing in terms of bytes, and not bits, and hence such operators are not feasible. Then how does it work when expressions such as fp->sign, or fp->exponent are used in this program?

Note that C allows only integral types as bit-fields, and hence expressions referring to the bit-fields are converted to integers. In this program, as you can observe, we used the %u format specifier, which is for an unsigned integer — the bit-field value was converted into an integer and that is why the program worked.

Those new to bit-fields face numerous surprises when they try using them, because a lot of low-level details come into the picture. In the programming example for bit-fields, you might have noticed the reversal in the order of the sign, exponent and mantissa, which is because of the underlying endian scheme followed. Endianness refers to the order in which the bytes of a multi-byte value are stored in memory (see this Wikipedia article for more details).

Can you explain the following simple program that makes use of a bit-field?

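#include <stdio.h>

struct bitfield {
    int bit : 1;   // a plain int bit-field, one bit wide
} BIT;

int main(void) {
    BIT.bit = 1;
    // sizeof yields a size_t, so print it with %zu
    printf(" sizeof BIT is = %zu\n", sizeof(BIT));
    printf(" value of bit is = %d\n", BIT.bit);
    return 0;
}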

It prints:

 sizeof BIT is = 4
 value of bit is = -1

Why? Note that it is not a compiler error to take sizeof(BIT), because BIT is a structure; had we attempted sizeof(BIT.bit), the program would not compile.

Now, coming to the output: if we used only one bit in the BIT structure, why is sizeof(BIT) 4 bytes? It is because of the addressing requirements of the underlying machine. The machine might require all structs to start at an address divisible by 4; or allocating a full WORD for the structure may be more efficient even if the machine only requires that structs start at an even address. Also, the compiler is free to add extra unused bits between struct members (including bit-field members) and at the end of the struct, which is known as “padding”.

Now let us come to the next output. We set BIT.bit = 1; and the printf statement printed -1! Why was that?

Note that we declared bit as int bit : 1; and the compiler treated it as a signed integer one bit wide (whether a plain int bit-field is signed is implementation-defined). Now, what is the range of a 1-bit signed integer?

It is from -1 to 0 (not 0 to 1, which is a common mistake). Remember the formula for the range of signed integers: $-2^{N-1}$ to $2^{N-1}-1$, where $N$ is the number of bits. For example, when $N$ is 8 (the number of bits in a byte), the range is $-2^{8-1}$ to $2^{8-1}-1$, which is -128 to +127. When $N$ is 1, the range is $-2^{1-1}$ to $2^{1-1}-1$, which is -1 to 0! The single bit set to 1 is therefore read back as -1.

No doubt, bit-fields are a powerful feature for low-level bit-manipulation. The cost of using bit-fields is the loss of portability. We already saw how padding and endian issues can affect portability in our simple program for reading the components of a floating-point number. Bit-fields should be used where space is very limited and the functionality demands it. Also, the gain in space could be lost in efficiency: bit-fields take more time to process, since the compiler takes care of (and hides) the underlying bit-manipulation needed to get or set the required data. Bugs associated with bit-fields can be notoriously hard to debug, since we need to understand the data in terms of bits. So, use bit-fields sparingly and with care.
