Phases of translation

The C++ source file is processed by the compiler as if the following phases take place, in this exact order:

Phase 1 -- mapping to the 96-character basic source character set

  1. The individual bytes of the source code file are mapped (in an implementation-defined manner) to the characters of the basic source character set. In particular, OS-dependent end-of-line indicators are replaced by newline characters. The basic source character set consists of 96 characters:
    a) 5 whitespace characters (space, horizontal tab, vertical tab, form feed, new-line)
    b) 10 digit characters from 0 to 9
    c) 52 letters from a to z and from A to Z
    d) 29 punctuation characters: _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
  2. Any source file character that cannot be mapped to a character in the basic source character set is replaced by its universal character name (escaped with \u or \U) or by some internal form that is handled equivalently.
  3. Trigraph sequences are replaced by the corresponding single-character representations. (until C++17)
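
The character counts in the list above can be checked mechanically. The following is a minimal sketch, not part of the original text; the names kWhitespace, kDigits, kLetters, kPunctuation and length are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: number of characters in a string literal,
// excluding the terminating NUL.
template <std::size_t N>
constexpr std::size_t length(const char (&)[N]) {
    return N - 1;
}

// The four groups of the basic source character set, as listed above.
constexpr char kWhitespace[] = " \t\v\f\n";
constexpr char kDigits[] = "0123456789";
constexpr char kLetters[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
constexpr char kPunctuation[] = "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";

// 5 + 10 + 52 + 29 = 96
static_assert(length(kWhitespace) + length(kDigits) + length(kLetters) +
                  length(kPunctuation) == 96,
              "the basic source character set has 96 characters");
```

The check runs entirely at compile time, so the file only needs to compile for the counts to be confirmed.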

Phase 2 -- backslash line splicing

  1. Whenever a backslash appears at the end of a line (immediately followed by the newline character), both the backslash and the newline are deleted, combining two physical source lines into one logical source line.
    This is a single-pass operation; a line ending in two backslashes followed by an empty line does not combine three lines into one. If a universal character name (\uXXXX) is formed in this phase, the behavior is undefined.
  2. If a non-empty source file does not end with a newline character after this step (whether it had no newline originally, or it ended with a backslash), the behavior is undefined (until C++11); a terminating newline character is added (since C++11).
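
Line splicing can be seen in a small sketch such as the one below. The identifiers spliced_value, ADD_THREE and kSum are hypothetical. The continuation line must start in the first column, otherwise the leading whitespace would end up inside the identifier's place and break it into two tokens.

```cpp
// The backslash-newline pair is deleted in phase 2, so the next two
// physical lines form one logical line declaring `spliced_value`.
constexpr int spliced_\
value = 42;

// The same mechanism lets a macro definition span several physical lines.
#define ADD_THREE(a, b, c) \
    ((a) + (b) + (c))

constexpr int kSum = ADD_THREE(1, 2, 3);
```

Splitting an identifier this way is legal but rarely a good idea in real code; the multi-line macro is the common use.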

Phase 3 -- comments, whitespace, and preprocessing tokens

  1. The source file is decomposed into comments, sequences of whitespace characters (space, horizontal tab, new-line, vertical tab, and form-feed), and preprocessing tokens, which are the following:
  • header names such as <iostream> or "myfile.h" (only recognized after #include)
  • identifiers
  • numbers
  • character and string literals
  • operators and punctuators (including alternative tokens), such as +, <<=, new, <%, ##, or and
  • individual non-whitespace characters that do not fit in any other category
  2. Any transformations performed during phases 1 and 2 between the initial and the final double quote of any raw string literal are reverted. (since C++11)
  3. Each comment is replaced by one space character.

Newlines are kept, and it is unspecified whether non-newline whitespace sequences may be collapsed into single space characters.
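
A few of these rules can be observed directly. The sketch below assumes C++14 or later (for the constexpr function body); kRaw, kAfterComment and munch are illustrative names. The last part relies on the rule that each preprocessing token is the longest sequence of characters that can form one, which the text above does not spell out.

```cpp
// Inside a raw string literal the phase 2 line splice is reverted,
// so both the backslash and the newline remain part of the string.
constexpr char kRaw[] = R"(a\
b)";
static_assert(sizeof(kRaw) == 5, "a, backslash, newline, b, terminating NUL");
static_assert(kRaw[1] == '\\' && kRaw[2] == '\n', "splice was reverted");

// A comment is replaced by one space, so it separates tokens like whitespace.
constexpr int/* acts as a space */kAfterComment = 7;

// Longest-match tokenization: x+++y is read as (x++) + y, not x + (++y).
constexpr int munch() {
    int x = 1;
    int y = 2;
    return x+++y;  // old value of x (1) plus y (2)
}
static_assert(munch() == 3, "x+++y parses as (x++) + y");
```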

Phase 4 -- preprocessor

  1. The preprocessor is executed.
  2. Each file introduced with the #include directive goes through phases 1 through 4, recursively.
  3. At the end of this phase, all preprocessor directives are removed from the source.
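
As a rough illustration of what "the preprocessor is executed" covers, the sketch below uses macro expansion, stringizing, token pasting, and conditional inclusion. All macro and variable names (SQUARE, STRINGIZE, CONCAT, ENABLE_BONUS, answer_value, kText) are assumptions made for this example.

```cpp
#define SQUARE(x) ((x) * (x))
#define STRINGIZE(x) #x
#define CONCAT(a, b) a##b
#define ENABLE_BONUS 1

// Conditional inclusion: only one branch survives phase 4.
#if ENABLE_BONUS
#define BONUS 6
#else
#define BONUS 0
#endif

// After phase 4 this line reads: constexpr int answer_value = ((6) * (6)) + 6;
constexpr int CONCAT(answer_, value) = SQUARE(6) + BONUS;

// After phase 4 this line reads: constexpr char kText[] = "hello world";
constexpr char kText[] = STRINGIZE(hello world);
```

With GCC or Clang, compiling with the -E option prints the result of phases 1 through 4, which is a convenient way to inspect what the later phases actually see.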

Phase 5 -- character and string literals

  1. All characters in character literals and string literals are converted from the source character set to the execution character set (which may be a multibyte character set such as UTF-8, as long as the 96 characters of the basic source character set listed in phase 1 have single-byte representations).
  2. Escape sequences and universal character names in character literals and non-raw string literals are expanded and converted to the execution character set. If the character specified by a universal character name is not a member of the execution character set, the result is implementation-defined, but is guaranteed not to be a null (wide) character.
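
The effect of this phase can be sketched with a few compile-time checks. The first two assume an ASCII-compatible execution character set, which is common but not guaranteed; the constant names are illustrative.

```cpp
#include <cstddef>

// Hex and octal escapes name the same character
// (assumption: ASCII-compatible execution character set, where 'A' is 0x41).
constexpr bool kHexIsA = ('\x41' == 'A');
constexpr bool kOctalIsA = ('\101' == 'A');

// "\t" is one character after expansion: a, tab, b, terminating NUL.
constexpr std::size_t kTabbedSize = sizeof("a\tb");

// In a UTF-8 literal, U+00E9 becomes two code units, plus the terminator.
constexpr std::size_t kUtf8Size = sizeof(u8"\u00e9");

// Escape sequences are not expanded in raw string literals:
// backslash, n, terminating NUL.
constexpr std::size_t kRawSize = sizeof(R"(\n)");
```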

Phase 6 -- string literal concatenation

Adjacent string literals are concatenated.
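
Because escape sequences are expanded in phase 5, before concatenation happens, an escape cannot be completed by the literal that follows it. A minimal sketch, with illustrative names:

```cpp
// Three adjacent literals become the single literal "Hello, world!".
constexpr char kGreeting[] = "Hello, " "world" "!";
static_assert(sizeof(kGreeting) == 14, "13 characters plus NUL");

// "\x4" was already converted in phase 5, so this is the two characters
// '\x4' and '1', not the single character '\x41'.
constexpr char kTwoChars[] = "\x4" "1";
static_assert(sizeof(kTwoChars) == 3, "two characters plus NUL");
static_assert(kTwoChars[0] == '\x4' && kTwoChars[1] == '1',
              "the escape was not extended by the next literal");
```

Splitting a literal this way is the usual trick for ending a hex escape before a character that would otherwise be read as another hex digit.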

Phase 7 -- translation as a translation unit

Compilation takes place: the tokens are syntactically and semantically analyzed and translated as a translation unit.

Phase 8 -- instantiation unit

Each translation unit is examined to produce a list of required template instantiations, including the ones requested by explicit instantiations. The definitions of the templates are located, and the required instantiations are performed to produce instantiation units.
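
The two kinds of instantiation request mentioned above might look like this in source form. This is only a sketch; twice and kDoubled are hypothetical names, and how a given compiler schedules the instantiation work is not visible from the code.

```cpp
// A function template; its definition must be locatable when instantiating.
template <typename T>
constexpr T twice(T value) {
    return value + value;
}

// Explicit instantiation definition: requests twice<int> outright.
template int twice<int>(int);

// Implicit instantiation: using twice with a double requires twice<double>.
constexpr double kDoubled = twice(1.5);
```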

Phase 9

Translation units, instantiation units, and library components needed to satisfy external references are collected into a program image which contains information needed for execution in its execution environment.

Some compilers do not implement instantiation units (also known as template repositories or template registries) and simply compile each template instantiation at Phase 7, storing the code in the object file where it is implicitly or explicitly requested; the linker then collapses these compiled instantiations into one at Phase 9.
