C++ Knowledge series 3
A programming language always evolves along with its compilers.
The Semantics of Data
- The size of an empty base class, or of an empty class derived from an empty base class, is not 0. Possible reasons include a virtual pointer to the virtual function table, a pointer to a virtual base class, the need for distinct objects to have distinct addresses (so that a check such as if ( &a == &b ) works), and platform-dependent alignment.
- Three main interplay factors:
- Language support overhead (vptr);
- Compiler optimization of recognized special case;
- Alignment constraints, machine dependent.
- The empty virtual base class has become a common idiom of OO design in C++: it provides a virtual interface without defining any data.
- This potential difference between compilers illustrates the evolutionary nature of the C++ Object Model. The model provides for the general case. As special cases are recognized over time, this or that heuristic is introduced to provide optimal handling. If successful, the heuristic is raised to common practice and becomes incorporated across implementations. It becomes thought of as standard, although it is not prescribed by the Standard, and over time it is likely to be thought of as part of the language.
- The virtual function table is a good example of this. Another is the named return value (NRV) optimization discussed.
- A virtual base class subobject occurs only once in the derived class regardless of the number of times it occurs within the class inheritance hierarchy.
- Non-static data members hold the values of individual class objects; static data members hold values of interest to the class as a whole.
- The C++ object model representation for non-static data members optimizes for space and access time (and to preserve compatibility with the C language layout of the C struct) by storing the members directly within each class object.
- This is also true for the inherited non-static data members of both virtual and non-virtual base classes, although the ordering of their layout is left undefined.
- Static data members are maintained within the global data segment of the program and do not affect the size of individual class objects. Member functions do not affect it either.
- The static data members of a template class behave slightly differently.
- Each class object, then, is exactly the size necessary to contain the non-static data members of its class. This size may at times surprise you as being larger than necessary. This girth comes about in two ways:
- Additional data members added by the compilation system to support some language functionality (primarily the virtuals);
- Alignment requirements on the data members and on the data structure as a whole.
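The size rules above can be checked directly with sizeof. The following is a minimal sketch, not the book's own example: every type name in it is hypothetical, and the checks inside check_typical_layout() assume a mainstream vptr-based implementation rather than anything the Standard prescribes.

```cpp
#include <cassert>

// All type names here are hypothetical and exist only for this sketch.
struct Empty {};                  // no data members
struct EmptyDerived : Empty {};   // inherits nothing but the type

struct Interface {                // no data, but one virtual function
    virtual ~Interface() = default;
};

struct Padded {                   // padding is usually inserted after `flag`
    char flag;
    double value;
};

struct WithStatic {
    static int counter;           // stored once, outside every object
    int value;
};
int WithStatic::counter = 0;

// Guaranteed by the language: a complete object is never zero-sized.
static_assert(sizeof(Empty) >= 1, "an empty class still occupies storage");
static_assert(sizeof(EmptyDerived) >= 1, "so does an empty derived class");

// Two distinct objects must be distinguishable by address,
// which is why an empty class cannot have size 0.
bool distinct_addresses() {
    Empty a, b;
    return &a != &b;
}

// Checks that hold on mainstream vptr-based implementations
// (an assumption of this sketch, not a requirement of the Standard).
void check_typical_layout() {
    assert(sizeof(Interface) >= sizeof(void*));              // room for a vptr
    assert(sizeof(Padded) >= sizeof(char) + sizeof(double)); // members plus padding
    assert(sizeof(WithStatic) == sizeof(int));               // static member not counted
}
```

Calling check_typical_layout() from main is enough to exercise the sketch; printing the individual sizeof values on your own compiler shows how much of each object is vptr and padding.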
The Binding of a Data Member
- The language rule back then was referred to as the “member rewriting rule” and stated generally that the body of an inline function is not evaluated until after the entire class declaration is seen.
- Standard C++ refined the rewriting rule with a tuple of member scope resolution rules. The effect is still to evaluate the body of an inline member function as if it had been defined immediately following the class declaration.
- Thus the binding of a data member within the body of an inline member function does not occur until after the entire class declaration is seen. This is not true of the argument list of the member function, however. Names within the argument list are still resolved in place at the point they are first encountered.
- int Class::GetLength( int length ) { return length; } // length binds to the parameter, even if the class has a member named length.
- Non-intuitive bindings between extern and nested type names, therefore, can still occur.
- When the subsequent declaration of the nested typedef of length is encountered, Standard C++ requires that the earlier bindings be flagged as illegal.
- This aspect of the language still requires the general defensive programming style of always placing nested type declarations at the beginning of the class. In our example, placing the nested typedef defining length above any of its uses within the class corrects the non-intuitive binding.
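The binding rules in this section can be sketched as follows. The names Track, Buffer, width_type and the namespace-scope variable length are hypothetical, chosen only to mirror the GetLength example; the sketch shows an inline body binding to a member declared later, a parameter hiding a member, and the defensive placement of a nested typedef at the top of the class.

```cpp
#include <cassert>
#include <type_traits>

int length = 100;                 // namespace-scope variable (hypothetical)

class Track {
public:
    explicit Track(int n) : length(n) {}
    // Inline body: `length` binds to the member declared further down,
    // not to ::length, because the body is analyzed after the whole class.
    int Length() const { return length; }
    // The parameter is resolved where it appears and hides the member.
    int GetLength(int length) const { return length; }
private:
    int length;
};

typedef int width_type;           // namespace-scope type name (hypothetical)

class Buffer {
    // Defensive style: the nested type comes first, so every later use
    // in the class means Buffer::width_type.
    typedef long long width_type;
public:
    void Set(width_type v) { value_ = v; }
    width_type Get() const { return value_; }
private:
    width_type value_ = 0;
};

static_assert(std::is_same<decltype(Buffer().Get()), long long>::value,
              "Get() uses the nested typedef, not ::width_type");

void check_binding() {
    Track t(7);
    assert(t.Length() == 7);        // member, not ::length
    assert(t.GetLength(42) == 42);  // parameter, not member
    assert(::length == 100);        // namespace-scope variable untouched
}
```

Moving the nested typedef below Set and Get would make those signatures bind to ::width_type first, which is the non-intuitive binding the notes warn about.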
Data Member Layout
- The non-static data members are set down in the order of their declaration within each class object.
- Any intervening static data members are ignored; they are stored in the program’s data segment, independent of individual class objects.
- The Standard requires within an access section (the private, public, or protected section of a class declaration) only that the members be set down such that "later members have higher addresses within a class object".
- That is, the members are not required to be set down contiguously.
- Alignment constraints on the type of a succeeding member may require padding.
- Additionally, the compiler may synthesize one or more additional internal data members in support of the Object Model. The vptr, for example, is one such synthesized data member that all current implementations insert within each object of a class containing one or more virtual functions.
- The Standard, by phrasing the layout requirement as it does, allows the compiler the freedom to insert these internally generated members anywhere, even between those explicitly declared by the programmer.
- The Standard also allows the compiler the freedom to order the data members within multiple access sections within a class in whatever order it sees fit.
- As a result, the relative order of members declared in different access sections is implementation dependent.
- No overhead is incurred by the access section specifier or the number of access levels.
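The ordering rule can be observed with offsetof. This is a small sketch under assumptions: the Point3d and Sectioned types are hypothetical, and the two size checks in check_layout() reflect what mainstream compilers do, not a guarantee of the Standard.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d {                  // hypothetical example type
    float x;
    static int instances;         // intervening static member: not in the object
    float y;
    float z;
};
int Point3d::instances = 0;

// Same members split over several access sections.
class Sectioned {
public:
    float x;
private:
    float y;
public:
    float z;
};

// Within one access section, later members have higher addresses.
// offsetof is well-defined here because Point3d is a standard-layout type.
static_assert(offsetof(Point3d, x) < offsetof(Point3d, y), "x precedes y");
static_assert(offsetof(Point3d, y) < offsetof(Point3d, z), "y precedes z");

// Typical results on mainstream compilers (an assumption of this sketch).
void check_layout() {
    assert(sizeof(Point3d) == 3 * sizeof(float));   // static member adds nothing
    assert(sizeof(Sectioned) == sizeof(Point3d));   // access specifiers cost nothing
}
```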
Access of a Data Member
- object.dataMember = 0; What is the cost of accessing the data member?
- The answer depends on how both the data member and the class are declared. The data member can be either static or non-static. The object’s class can be an independent class or be derived from a single base class. Less likely, but still possible, it can be multiply or virtually derived.
Access to Static Data Members
- Static data members are literally lifted out of their class and treated as if each were declared as a global variable (but with visibility limited to the scope of the class). Note: like any global variable, a static data member is typically initialized before main() begins; the compiler arranges for this.
- Each member’s access permission and class association is maintained without incurring any space or runtime overhead either in the individual class objects or in the static data member itself. Access is checked entirely by the compiler at compile time. In Java, by contrast, the check happens at link time.
- A single instance of each class static data member is stored within the data segment of the program. Each reference to the static member is internally translated to be a direct reference of that single extern instance.
- This is the only case in the language where the access of a member through a pointer and through an object are exactly equivalent in terms of the instructions actually executed. This is because the access of a static data member through the member selection operators is a syntactic convenience only.
- The member is not within the class object, and therefore the class object is not necessary for the access.
- What if static data member is an inherited member of a complex inheritance hierarchy, perhaps the member of a virtual base class of a virtual base class, or some other equally complex hierarchy? It doesn’t matter. There is still only a single instance of the member within the program, and its access is direct.
- What if the access of the static data member is through a function call or some other form of expression? In cfront, the expression was simply discarded. Standard C++ explicitly requires that the function be evaluated, although no use is made of its result.
- Taking the address of a static data member yields an ordinary pointer of its data type, not a pointer to class member, since the static member is not contained within a class object.
- const int * p = &Class::intStaticDataMember;
- The two important aspects of any name-mangling scheme are: 1. the algorithm yields unique names; 2. those unique names can be easily recast back to the original name in case the compilation system needs to communicate with the user.
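The points about static data members can be sketched as below. Counter, make and side_effects are hypothetical names introduced for this illustration; the sketch shows that every access path reaches the same single instance, that the address is an ordinary pointer, and that a function call used to reach the member is still evaluated.

```cpp
#include <cassert>
#include <type_traits>

struct Counter {                  // hypothetical example type
    static int total;             // one instance for the whole program
    int value = 0;
};
int Counter::total = 0;

// Address of a static member: an ordinary pointer.
static_assert(std::is_same<decltype(&Counter::total), int*>::value,
              "static member yields int*");
// Address of a non-static member: a pointer to class member.
static_assert(std::is_same<decltype(&Counter::value), int Counter::*>::value,
              "non-static member yields int Counter::*");

int side_effects = 0;             // counts calls to make()

Counter& make() {
    static Counter instance;
    ++side_effects;
    return instance;
}

// The object expression is not needed to locate `total`,
// but Standard C++ still requires make() to be called.
bool call_is_evaluated() {
    int before = side_effects;
    make().total = 4;
    return side_effects == before + 1 && Counter::total == 4;
}

void check_single_instance() {
    Counter a;
    Counter* p = &a;
    assert(&a.total == &Counter::total);   // through an object
    assert(&p->total == &Counter::total);  // through a pointer
}
```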
Access to Non-Static Data Member
- Non-static data members are stored directly within each class object and cannot be accessed except through an explicit or implicit class object. An implicit class object is present whenever the programmer directly accesses a non-static data member within a member function.
- The seemingly direct access of a non-static data member is actually carried out through an implicit class object represented by the this pointer.
- Access of a non-static data member requires the addition of the beginning address of the class object with the offset location of the data member. For example, the address of origin.y is conceptually &origin + ( &Point3d::y - 1 ), which is compiler-internal pseudocode rather than valid C++.
- Notice the peculiar "subtract by one" expression applied to the pointer-to-data-member offset value. In the implementation the book describes, offset values yielded by the pointer-to-data-member syntax are always bumped up by one. Doing this permits the compilation system to distinguish between a pointer to data member that is addressing the first member of a class and a pointer to data member that is addressing no member. Pointers to data members are discussed in more detail in the Pointer to Data Members section.
- The offset of each non-static data member is known at compile time, even if the member belongs to a base class subobject derived through a single or multiple inheritance chain. Access of a non-static data member, therefore, is equivalent in performance to that of a C struct member or the member of a non-derived class.
- Virtual inheritance introduces an additional level of indirection in the access of its members through a base class subobject.
- Given a Point3d object origin and a pointer pt that addresses it, is the access of a member such as x ever significantly different when made through the object origin or through the pointer pt? Yes: the access is significantly different when the Point3d class is a derived class containing a virtual base class within its inheritance hierarchy and the member being accessed, such as x, is an inherited member of that virtual base class. In this case, we cannot say with any certainty which class type pt addresses (and therefore we cannot know at compile time the actual offset location of the member), so the resolution of the access must be delayed until runtime through an additional indirection. This is not the case with the object origin. Its type is that of a Point3d class, and the offset locations of even inherited virtual base class members are fixed at compile time. An aggressive compiler can therefore resolve the access of x through origin statically.
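The "object address plus member offset" model can be imitated by hand. The sketch below is illustrative only: set_y_by_offset spells out in user code what a compiler does internally, and Base, Left and Most are hypothetical classes added to show why a member of a virtual base needs a runtime lookup when reached through a pointer.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d { float x, y, z; };   // standard-layout, so offsetof is valid

// Roughly what `p->y = v` amounts to: object address plus a fixed offset.
void set_y_by_offset(Point3d* p, float v) {
    char* base = reinterpret_cast<char*>(p);
    *reinterpret_cast<float*>(base + offsetof(Point3d, y)) = v;
}

// Hypothetical hierarchy with a virtual base.
struct Base { int x = 0; };
struct Left : virtual Base { int left = 0; };
struct Most : Left { int most = 0; };

// `p` may address a complete Left or the Left part of a Most, so the
// position of Base relative to *p is not known until runtime.
int read_x(const Left* p) { return p->x; }

// Byte distance from the Left subobject to its shared Base subobject.
// The two values usually differ for a Left and a Most (implementation detail).
std::ptrdiff_t base_offset(const Left* p) {
    const Base* b = p;
    return reinterpret_cast<const char*>(b) - reinterpret_cast<const char*>(p);
}

void check_access() {
    Point3d origin = {1.0f, 2.0f, 3.0f};
    set_y_by_offset(&origin, 9.0f);
    assert(origin.y == 9.0f && origin.x == 1.0f && origin.z == 3.0f);

    Left l;
    Most m;
    l.x = 4;
    m.x = 5;
    assert(read_x(&l) == 4 && read_x(&m) == 5);
}
```

Printing base_offset(&l) and base_offset(&m) on a given compiler shows whether the shared region moved between the two derivations.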
Inheritance and the Data Member
- Under the C++ inheritance model, a derived class object is represented as the concatenation of its members with those of its base classes. The actual ordering of the derived and base class parts is left unspecified by the Standard. In theory, a compiler is free to place either the base or the derived part first in the derived class object.
- In practice, the base class members always appear first, except in the case of a virtual base class.
- In general, the handling of a virtual base class is an exception to all generalities, even of course, this one.
- The layout of a class object's data members depends on which of the following applies:
- 1. single inheritance without virtual functions;
- 2. single inheritance with virtual functions;
- 3. multiple inheritance;
- 4. virtual inheritance.
- In the absence of virtual functions, class declarations are equivalent to C struct declarations.
Inheritance without Polymorphism:
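- Concrete inheritance adds no space or access-time overhead to the representation; its layout is akin to the C struct representation. One pitfall is alignment: each base class subobject keeps its own padding, so splitting one class into a hierarchy can make the object larger.
- The padding is kept because of memberwise copy and assignment. If derived class members were packed into the padding of the base class subobject, copying a base class object into that subobject would overwrite the values of the packed inherited members. It would be an enormous effort on the user's part to debug this, to say the least.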
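The padding pitfall can be sketched with a class that is first written flat and then split into a three-level hierarchy. The Concrete names follow the book's convention, but the code is a reconstruction for illustration, and the exact sizes depend on the ABI, so only a lower bound is checked.

```cpp
#include <cassert>

// Flat version: the three chars share the alignment slot after `val`.
struct Concrete {
    int val;
    char c1, c2, c3;
};

// The same data split across a hierarchy.
struct Concrete1 { int val; char bit1; };
struct Concrete2 : Concrete1 { char bit2; };
struct Concrete3 : Concrete2 { char bit3; };

// Assigning to the base subobject must leave the derived member intact,
// which is why the base part keeps its own padding in the layout the notes describe.
bool base_copy_preserves_derived_member() {
    Concrete2 target;
    target.val = 1;
    target.bit1 = 'a';
    target.bit2 = 'b';

    Concrete1 source;
    source.val = 2;
    source.bit1 = 'z';

    Concrete1& base_part = target;
    base_part = source;            // memberwise assignment of the base part only
    return target.val == 2 && target.bit1 == 'z' && target.bit2 == 'b';
}

void check_concrete() {
    assert(base_copy_preserves_derived_member());
    // The hierarchy is never smaller than the flat class; on many ABIs it is larger.
    assert(sizeof(Concrete3) >= sizeof(Concrete));
}
```

Comparing sizeof(Concrete) with sizeof(Concrete3) on your own compiler shows how much padding the split costs there.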
Inheritance with virtual functions for polymorphism:
- This flexibility, of course, is the heart of OO programming. Support for this flexibility, however, does introduce a number of space and access-time overheads.
- Introduction of a virtual table associated with the class to hold the address of each virtual function. The size of this table in general is the number of virtual functions declared plus an additional one or two slots to support RTTI.
- Introduction of the vptr within each class object. The vptr provides the runtime link for an object to efficiently find its associated virtual table.
- Augmentation of the constructor to initialize the object’s vptr to the virtual table of the class. Depending on the aggressiveness of the compiler’s optimization, this may mean resetting the vptr within the derived and each base class constructor.
- Augmentation of the destructor to reset the vptr to the associated virtual table of the class. (It is likely to have been set to address the virtual table of the derived class within the destructor of the derived class. Remember, the order of destructor calls is in reverse: derived class and then base class.) An aggressive optimizing compiler can suppress a great many of these assignments.
- In general, the destructor of a polymorphic base class should be declared virtual so that it is placed in the virtual function table. When the derived class destructor is invoked, the compiler automatically calls the base class destructor afterwards.
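The constructor and destructor augmentation described above has a visible effect: while a base class constructor or destructor runs, virtual calls resolve to the base class version. The following sketch records that behaviour; Base, Derived, trace and run_lifecycle are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> trace;   // hypothetical log of what each step observes

struct Base {
    Base() { trace.push_back(std::string("ctor sees ") + name()); }
    virtual ~Base() { trace.push_back(std::string("dtor sees ") + name()); }
    virtual const char* name() const { return "Base"; }
};

struct Derived : Base {
    Derived() { trace.push_back(std::string("ctor sees ") + name()); }
    ~Derived() override { trace.push_back(std::string("dtor sees ") + name()); }
    const char* name() const override { return "Derived"; }
};

// Creates and destroys a Derived through a Base pointer and returns the log.
std::vector<std::string> run_lifecycle() {
    trace.clear();
    Base* p = new Derived;
    trace.push_back(std::string("live object is ") + p->name());
    delete p;                      // virtual destructor: ~Derived, then ~Base
    return trace;
}

void check_lifecycle() {
    std::vector<std::string> t = run_lifecycle();
    assert(t.size() == 5);
    assert(t.front() == "ctor sees Base");   // object is still "a Base" here
    assert(t.back() == "dtor sees Base");    // and is "a Base" again here
}
```

The full log reads: ctor sees Base, ctor sees Derived, live object is Derived, dtor sees Derived, dtor sees Base. On a vptr-based implementation this is the result of each constructor and destructor setting the vptr to its own class's virtual table.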
Multiple Inheritance
- Single inheritance provides a form of “natural” polymorphism regarding the conversion between base and derived types within the inheritance hierarchy.
- Multiple inheritance is neither as well behaved nor as easily modeled as single inheritance. The complexity of multiple inheritance lies in the “unnatural” relationship of the derived class with its second and subsequent base class sub-objects.
- The problem of multiple inheritance primarily affects conversions between the derived and second or subsequent base class object.
- The assignment of the address of a multiply derived object to a pointer of its leftmost base class is the same as that for single inheritance, since both point to the same beginning address.
- The assignment of the address of a second or subsequent base class, however, requires that the address be modified by the addition (or subtraction in the case of a downcast) of the size of the intervening base class sub-objects.
- The Standard does not require a specific ordering of the Point3d and Vertex base classes of Vertex3d. The original cfront implementation always placed them in the order of declaration. A Vertex3d object under cfront, therefore, consisted of the Point3d subobject (which itself consisted of a Point2d subobject), followed by the Vertex subobject and finally by the Vertex3d part. In practice, this is still how all implementations lay out the multiple base classes (with the exception of virtual inheritance).
- An optimization under some compilers, however, such as the MetaWare compiler, switches the order of multiple base classes if the second (or subsequent) base class declares a virtual function and the first does not. This shuffling of the base class order saves the generation of an additional vptr within the derived class object. There is no universal agreement among implementations about the importance of this optimization, and use of this optimization is not (at least currently) widespread.
- What about access of a data member of a second or subsequent base class? Is there an additional cost? No. The member's location is fixed at compile time. Hence its access is a simple offset the same as under single inheritance regardless of whether it is a pointer, reference, or object through which the member is being accessed.
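The pointer adjustment for a second base class can be observed directly. This sketch reuses the Point3d, Vertex and Vertex3d names from the notes, but the member lists are simplified assumptions; the nonzero offset reflects the declaration-order layout used in practice, which the Standard does not require.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d { float x = 0, y = 0, z = 0; };
struct Vertex  { Vertex* next = nullptr; };
struct Vertex3d : Point3d, Vertex { float mumble = 0; };

// Byte distance from the start of the object to its Vertex subobject.
// The implicit conversion to Vertex* is where the compiler adds the offset.
std::ptrdiff_t vertex_offset(const Vertex3d& v) {
    const Vertex* pv = &v;
    return reinterpret_cast<const char*>(pv) -
           reinterpret_cast<const char*>(&v);
}

// A downcast subtracts the same offset again.
bool round_trip(Vertex3d& v) {
    Vertex* pv = &v;
    return static_cast<Vertex3d*>(pv) == &v;
}

// The conversion cannot blindly add the offset: null must stay null.
bool null_stays_null() {
    Vertex3d* pv3d = nullptr;
    Vertex* pv = pv3d;
    return pv == nullptr;
}

void check_conversions() {
    Vertex3d v;
    assert(round_trip(v));
    assert(null_stays_null());
    // The two base subobjects cannot share an address.
    assert(static_cast<void*>(static_cast<Point3d*>(&v)) !=
           static_cast<void*>(static_cast<Vertex*>(&v)));
}
```

Once the pointer has been adjusted, reading pv->next is the same fixed-offset access as under single inheritance.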
Virtual Inheritance
- A semantic side effect of multiple inheritance is the need to support a form of shared subobject inheritance.
- We need only a single base class sub-object. The language level solution is the introduction of virtual inheritance.
- As complicated as the semantics of virtual inheritance may seem, its support within the compiler has proven even more complicated. In our iostream example, the implementational challenge is to find a reasonably efficient method of collapsing the two instances of an ios subobject maintained by the istream and ostream classes into a single instance maintained by the iostream class, while still preserving the polymorphic assignment between pointers (and references) of base and derived class objects.
- The general implementation solution is as follows. A class containing one or more virtual base class subobjects, such as istream, is divided into two regions: an invariant region and a shared region. Data within the invariant region remains at a fixed offset from the start of the object regardless of subsequent derivations. So members within the invariant region can be accessed directly. The shared region represents the virtual base class subobjects. The location of data within the shared region fluctuates with each derivation. So members within the shared region need to be accessed indirectly. What has varied among implementations is the method of indirect access.
- The general layout strategy is to first lay down the invariant region of the derived class and then build up the shared region. However, one problem remains: How is the implementation to gain access to the shared region of the class? In the original cfront implementation, a pointer to each virtual base class is inserted within each derived class object. Access of the inherited virtual base class members is achieved indirectly through the associated pointer.
- There are two general solutions to the first problem, namely that each object carries one additional pointer per virtual base class. Microsoft's compiler introduced the virtual base class table. Each class object with one or more virtual base classes has a pointer to the virtual base class table inserted within it. The actual virtual base class pointers, of course, are placed within the table. Although this solution has been around for many years, I am not aware of any other compiler implementation that employs it. (It may be that Microsoft's patenting of their virtual function implementation effectively prohibits its use.)
- The second solution, and the one preferred by Bjarne (at least while I was working on the Foundation project with him), is to place not the address but the offset of the virtual base class within the virtual function table.
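The effect of virtual inheritance on the iostream-style diamond can be sketched with stand-in classes. Ios, IStream, OStream and IOStream below are hypothetical simplifications, not the real library types; the Dup variants show what happens without the virtual keyword.

```cpp
#include <cassert>

// Stand-ins for the iostream hierarchy (hypothetical, simplified).
struct Ios { int state = 0; };
struct IStream : virtual Ios { int in_pos = 0; };
struct OStream : virtual Ios { int out_pos = 0; };
struct IOStream : IStream, OStream {};

// The same shape without virtual inheritance: two Ios subobjects.
struct IStreamDup : Ios {};
struct OStreamDup : Ios {};
struct IOStreamDup : IStreamDup, OStreamDup {};

// Both inheritance paths must reach one shared Ios subobject.
bool shares_single_ios(IOStream& s) {
    Ios* via_in  = static_cast<IStream*>(&s);
    Ios* via_out = static_cast<OStream*>(&s);
    return via_in == via_out;
}

// Without `virtual`, each path reaches its own copy.
bool has_two_ios(IOStreamDup& s) {
    Ios* via_in  = static_cast<IStreamDup*>(&s);
    Ios* via_out = static_cast<OStreamDup*>(&s);
    return via_in != via_out;
}

void check_sharing() {
    IOStream s;
    assert(shares_single_ios(s));
    s.state = 5;                                  // unambiguous: only one Ios
    assert(static_cast<OStream&>(s).state == 5);  // seen through the other path

    IOStreamDup d;
    assert(has_two_ios(d));
}
```

How the compiler finds the shared Ios at runtime (a per-object pointer, a virtual base class table, or an offset stored in the virtual table) is the implementation choice discussed above and is not visible in this code.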
Pointer to Data Members
- Pointers to data members are a somewhat arcane but useful feature of the language, particularly if you need to probe at the underlying member layout of a class. One example of such a probing might be to determine if the vptr is placed at the beginning or end of the class. A second use, presented in Section, might be to determine the ordering of access sections within the class. As I said, it's an arcane, although potentially useful, language feature.
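A pointer to data member can be sketched as follows, together with a layout probe in the spirit of the vptr question. Point3d here is a hypothetical class with a virtual destructor; the probe measures the offset of x by address arithmetic, and reading a nonzero offset as "vptr at the front" is an assumption about the implementation, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct Point3d {                      // hypothetical example type
    virtual ~Point3d() = default;     // forces a vptr into each object
    float x = 0, y = 0, z = 0;
};

// A pointer to data member names a member; an object is needed to get a value.
float read(const Point3d& p, float Point3d::*member) {
    return p.*member;
}

// Byte offset of x within the object.
std::ptrdiff_t offset_of_x(const Point3d& p) {
    return reinterpret_cast<const char*>(&p.x) -
           reinterpret_cast<const char*>(&p);
}

// If something sits before x, it is most likely the vptr.
bool vptr_probably_at_front(const Point3d& p) {
    return offset_of_x(p) != 0;
}

void check_member_pointers() {
    Point3d p;
    p.y = 2.5f;
    assert(read(p, &Point3d::y) == 2.5f);

    // "No member" must differ from "the first member".
    float Point3d::*none = nullptr;
    float Point3d::*first = &Point3d::x;
    assert(none != first);
}
```

On compilers that place the vptr first, vptr_probably_at_front returns true; an implementation that placed it at the end of the object would return false.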
From: <<Inside the C++ Object Model>>