The code that we write in a programming language like C#, ASP+ or any other .NET-compatible
language is ultimately compiled to Intermediate Language (IL).
Thus, code written in the COBOL programming language can be modified in C# and
subsequently used in ASP+. Therefore, the best way to deepen our understanding of
the .NET technologies is to understand IL.
Once you are conversant with IL, you will have no difficulty in understanding the .NET
technologies, since all .NET languages finally compile to it. IL was invented first and it is
programming-language neutral. It was then followed by languages and technologies such as C#,
Visual Basic.NET and ASP.NET.
We shall raise the curtain on IL with a very small program. We also start from the
assumption that you are familiar with at least one .NET programming language.

We have written a very small non-working IL program in the il subdirectory and named it a.il.
How do we assemble it into an executable program? There is no need to fret over this
problem. Microsoft provides a program called ilasm whose sole task is to create an
executable file from an IL file.

Before you run this command, make sure that your PATH variable includes the bin subdirectory
of the framework SDK. If not, give the command as
  set path=c:\progra~1\microsoft.net\frameworksdk\bin;%PATH%
Now we use the command as follows:
  c:\il>ilasm /nologo /quiet a.il

On doing so, the following error is generated:
Source file is ANSI
Error: No entry point declared for executable
***** FAILURE *****

In future, we shall not display the first and the last lines of the output generated by ilasm. We
shall also remove the blank lines between non-blank lines.
In IL, a line may begin with or without a dot '.'. Anything that begins with a dot is a
directive to the assembler, asking it to perform some task, such as creating a function or a
class. Anything that does not start with a '.' is an actual assembler instruction.
The directive .method creates a function or method, here called vijay, which returns void,
i.e. it does not return any value. The name vijay was chosen arbitrarily, for want of a
better one.
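
The listing for a.il is not reproduced in this text. A minimal sketch consistent with the description, assuming the file holds nothing but an empty method named vijay, might look like this:

```il
// a.il (sketch): one method and nothing else, so no entry point is declared.
.method void vijay()
{
}
```
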
The assembler was obviously not impressed with this program and thus brandished the
message 'No entry point declared'. This error is generated because an IL file can contain
numerous functions, and the assembler has no way of knowing which of them is to
be executed first.
In IL, the first function to be executed is called the entrypoint function. In C#, this function is
Main. The syntax for a function is its name followed by the familiar pair of round brackets ().
The start and the end of the function's code are marked by the curly braces {}.
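
Following that description, a sketch of the same file with an entry point declared could look as follows. The method name vijay comes from the text; placing the directive on its own line inside the braces is an assumption about the layout of the missing listing.

```il
.method void vijay()
{
    .entrypoint   // execution is to begin at this method
}
```
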

Now no error is generated. The directive .entrypoint signifies that program execution has to
begin from this function. We have to use this directive even though the program has only
one function. On giving the dir command at the DOS prompt, we
see three files created. a.exe is an executable file which can now be run to see the output
of the program.

Our luck seems to run out when we try to execute the above program, because a runtime error
is generated. One probable reason for this could be the poor formation of the
function. Every function should have an 'end of function' instruction in it. We
obviously overlooked this fact in our haste.

The 'end of function' instruction is called ret. All well formed functions have to end with this
instruction.
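
As a sketch, assuming the same single method as before, the function with its closing instruction in place would read:

```il
.method void vijay()
{
    .entrypoint
    ret           // the 'end of function' instruction
}
```
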

On executing the function, we get the same error again. Where could we have faltered this time?

The blunder was that we forgot the mandatory directive called .assembly, followed by a
name. We have now incorporated it in the code, using the name mukhi followed by a
pair of empty curly braces {}. The .assembly directive is used to give a name to the program.
An assembly is also called a deployment unit.
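
A sketch of the complete file as described, with the .assembly directive named mukhi placed ahead of the method (the ordering is an assumption; the original listing is not reproduced here):

```il
.assembly mukhi {}    // names the assembly; the braces stay empty

.method void vijay()
{
    .entrypoint
    ret
}
```

It can be assembled with the same ilasm command shown earlier, and the resulting a.exe run from the prompt.
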
The code above is the smallest program that can be assembled without any errors, though it
does not perform anything useful when executed. It does not have any function called Main. It
only has a function called vijay with the entrypoint directive. The program now assembles and
runs with no errors at all.
The concept of assembly is extremely crucial in the .NET world and should be thoroughly
understood. We will explore this directive in the latter half of the chapter.

The cause of the above failure message is that the program has two functions, vijay and
vijay1, each containing the .entrypoint directive. As mentioned earlier, this directive
specifies which function is to be executed first.
Thus, in functionality, it is akin to the Main function in C#. When C# code is converted into
IL code, the code contained in the function Main becomes a function in IL that
contains the directive .entrypoint. Similarly, if the first function to be executed in a COBOL
program is called abc, the generated IL code places the .entrypoint directive in that function.
In conventional programming languages, the function to be executed first has to have a specific
name, e.g. Main, but in IL only the .entrypoint directive is required. Therefore, since a program
can have only one starting point, only one function in the IL code is allowed to contain the
.entrypoint directive.
It is pertinent to note that no error message number or explanation is generated, which makes
this error difficult to debug.
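
For reference, a sketch of a file of the kind described, assuming two global methods named vijay and vijay1 as in the text. Removing either .entrypoint line should leave a file with a single starting point.

```il
.assembly mukhi {}

.method void vijay()
{
    .entrypoint
    ret
}

.method void vijay1()
{
    .entrypoint   // a second entry point, which the text reports as a failure
    ret
}
```
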

The .entrypoint directive need not be the first or last directive in the function. It
merely has to be present in the body of the function to mark it as the first function to
be executed. Directives are not assembly instructions and can even be placed after the ret
instruction. To remind you, ret signifies the end of the function code.
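
A small sketch of that placement, under the same assumptions as the earlier files (one method named vijay, assembly named mukhi):

```il
.assembly mukhi {}

.method void vijay()
{
    ret
    .entrypoint   // a directive, not an instruction, so it may follow ret
}
```
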

We may have a function written in C#, ASP+ or COBOL, but the mechanism for executing this
function in IL is the same: we use the assembler instruction call. The call instruction is
followed by these details, in the given sequence:

• return type of the function (void).
• the namespace (System).
• the class (Console).
• the function name (WriteLine()).
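
Put together, the four parts listed above give a call like the one in this sketch. The [mscorlib] prefix and the .assembly extern line are assumptions added here so that the reference to System.Console does not depend on assembler defaults; the names mukhi and vijay are reused from the earlier text.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

.method void vijay()
{
    .entrypoint
    // return type, then namespace.class, then ::name and the parameter list
    call void [mscorlib]System.Console::WriteLine()
    ret
}
```
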
The function gets called but does not produce any visible output. So, we pass a parameter to the
WriteLine function.

The above code has a glaring omission. When a function is called in IL, in addition to its return
type, the data types of the parameters being passed to the function also have to be
specified. We have stated that the WriteLine function expects a parameter of the class
named System.String, but since no string is actually passed to the function, it generates a
runtime error.
Thus, there is a significant difference between IL and other programming languages when it
comes to calling a function. In IL, when we call a function, we have to specify everything we
know about the function, including its return type and the data types of its parameters. This
allows the call to be matched against the actual function, with the appropriate checks
conducted at run time.
We shall now see how to pass parameters to a function.

The assembler instruction ldstr places a string on the stack. The name ldstr is an abbreviation
of "load a string on the stack". A stack is an area of memory that facilitates the
passing of parameters to a function. All functions receive their parameters from the stack.
Thus, instructions like ldstr are indispensable.
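
The mechanism can be sketched as follows. The string "hi" is borrowed from the disassembly later in this document; the [mscorlib] prefix and the .assembly extern line are assumptions added so the sketch stands on its own.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

.method void vijay()
{
    .entrypoint
    ldstr "hi"    // push a reference to the string onto the stack
    call void [mscorlib]System.Console::WriteLine(string)   // takes its argument from the stack
    ret
}
```

The parameter type in the call (string) has to match what was placed on the stack by ldstr.
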

We have added some attributes to the method vijay. We shall explain them one by one below.
public: This is called an accessibility attribute, as it decides who can access a method.
Public means that this method is accessible to every other part of the program.
hidebysig: A class can be derived from another class. The attribute hidebysig states that
this function hides any function in the base class that has the same name and
signature. In this example, it means that if a function vijay with the same signature is present
in the base class, that function is not visible through the derived class.
static: Methods can either be static or non-static. A static method belongs to a class and not to
an instance. Thus, as we have only a single class, we cannot have more than one copy of a
static function. There are no restrictions on where a static method can be created. The function
with the entrypoint directive must be static. Static functions must have a body or source code
associated with them and they are referenced using the type name and not the instance name.
il managed: Due to its complex nature, we shall postpone the explanation of this attribute.
When the time is appropriate, its functionality will be clearly explained.
The abovementioned attributes do not modify the output of the function. In a short while, it will
become apparent to you why we have provided the explanation of these attributes.
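
A sketch of the method with these attributes applied. It uses the spelling cil managed, as it appears in the ildasm output later in this document, where the text above says il managed; the rest of the file is assumed to be unchanged from the previous step.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

.method public hidebysig static void vijay() cil managed
{
    .entrypoint
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```
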
Whenever we write a program in the C# programming language, we first specify the keyword
class, followed by the name of the class, and then enclose the source code within a pair of
curly braces {}. This is demonstrated in a.cs.

Let us now introduce the IL directive called class.
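
As a sketch, the method from the previous step can be wrapped in a class. The class name zzz is taken from later in this document; everything else is assumed to carry over unchanged.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

.class zzz
{
    // The method now lives inside the class instead of at file level.
    .method public hidebysig static void vijay() cil managed
    {
        .entrypoint
        ldstr "hi"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```
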

Notice the change in assembler output : Class 1 Methods: 1;

The directive .class is followed by the name of the class. It is optional in IL. Let us enhance the
functionality of the class by adding a few class attributes.

We have added three attributes to our class directive:
• private: This signifies that the class is not visible outside the assembly in which it is
defined.
• auto: This means that the layout of the class in memory will be decided by the runtime,
and not by our program.
• ansi: The source code is generally divided into two main categories:
- Managed Code
- Unmanaged Code
Code written in languages like C is called unmanaged code or untrustworthy code. We need an
attribute that handles interoperability between unmanaged code and managed code. For
example, this attribute is put to use when we want to transfer strings between managed
and unmanaged code.

If we cross the bounds of managed code and vault into the realm of unmanaged code, a string,
which is an array of 2-byte Unicode characters, is converted into an ANSI string, which is
an array of 1-byte ANSI characters, and vice versa. The modifier ansi is used for a smooth
transition between managed and unmanaged code.

The class zzz has been derived from the class System.Object. In the .NET world, in order to
maintain type consistency, all types are ultimately derived from System.Object. Thus, all
objects have a common base class of Object. In IL, classes are derived from other classes in
much the same manner as in programming languages like C++, C# and Java.
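
A sketch of a class header that carries the three class attributes and names its base class explicitly. The extends clause is written as it appears in the ildasm output later in this document; the method body is assumed from the earlier steps.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

// private auto ansi are the class attributes; extends names the base class.
.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public hidebysig static void vijay() cil managed
    {
        .entrypoint
        ldstr "hi"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```
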

You are bound to wonder as to why we have written such an ungainly program. You need to
exercise a little patience before the mist clears and it all starts to make sense. We shall explain
the newly introduced functions and attributes one by one:
.ctor: We have introduced a new function called .ctor which calls the WriteLine function to
display hell1, but it does not get called. .ctor refers to the constructor.
rtspecialname: This attribute signifies to the runtime that the name of the function is special
and it is to be treated in a special manner.
specialname: This attribute alerts the compilers and tools that the function is special. The
runtime may choose to ignore this attribute.
instance: A normal function is called an instance function. Such a function is associated with
an object, unlike a static method, which is associated with a class.
The reason for choosing the specified name for the function will become apparent in due course.
ldarg.0: This is an assembler instruction which loads argument number zero onto the
evaluation stack. In an instance function, argument zero is the this pointer; in a static
function, it is the first parameter. We shall explain ldarg.0 in detail subsequently.
mscorlib: In the program above, the function .ctor is being called from the base class
System.Object. The name of the function is normally prefixed with the name of the library that
contains the code. This library name is placed within square brackets. In this case, it is
optional because mscorlib.dll is the default library and it contains most of the classes that
.NET requires.
.maxstack: This directive specifies the maximum number of elements that can be present on
the evaluation stack when a method is being executed.
.module: All IL files must be part and parcel of a logical entity called a module. The file is added
to a module using the .module directive. The name of the module may be stated as aa.exe, but
the name of the executable file remains the same as before, i.e. a.exe.
.subsystem: This directive specifies the environment in which the executable will
run. It is another way of stating the kind of executable the assembly represents.
Some of the numeric values and their meanings are as follows:
2 - A Windows GUI Subsystem.
3 - A Windows Character (console) Subsystem.
5 - The character subsystem of an older operating system called OS/2.
.corflags: This directive sets flags in the runtime header of the file. A value of
1 indicates that the executable contains only IL, and a value of 4 signifies a library.
.assembly: We very briefly touched upon a directive called .assembly a couple of pages earlier.
Let's delve a little deeper now.
Whatever we create is part of an entity called a manifest. The .assembly directive marks the
beginning of a manifest. In the hierarchy, the module is the next smaller entity to a manifest.
The .assembly directive specifies the assembly to which this module belongs. A module can only
contain a single .assembly directive.
The presence of this directive is mandatory for exe files but is optional for modules in a .dll.
This is because this directive is needed to create an assembly for us. It is a basic requirement of
the .NET world. An assembly directive contains other directives.
.hash: Hashing is a common technique in the computer world and there are a large
number of hashing algorithms in use. This directive records which hashing algorithm is used;
for example, the value 0x00008004 identifies SHA-1.
.ver: The .ver directive consists of 4 numbers separated by colons. They represent the
following information in the order given below:
• major version number
• minor version number
• build
• revision number
extern: If there is a requirement to refer to other assemblies, the extern directive is used. The
code of the core .NET classes is in mscorlib.dll. Besides this dll, when our program needs to
refer to code from a large number of other dlls, the extern directive comes into play.
originator: This is the last directive that we shall explore before we move on to explain the
essence and significance of the above example. This directive discloses the identity of the
creator of the dll. It contains eight bytes of the public key of the owner of the dll. It is obviously
a hash value.
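
Several of the directives described above can be pulled together in one small file. This is a sketch only: the version number 1:0:0:0 and the method body are illustrative assumptions, while the names mukhi, aa.exe and zzz come from the text.

```il
// Reference to the library that holds System.Console.
.assembly extern mscorlib {}

// Manifest of this assembly, with a four-part version number.
.assembly mukhi
{
    .ver 1:0:0:0
}

.module aa.exe
.subsystem 0x0003      // character-mode (console) executable
.corflags 0x00000001   // IL only

.class private auto ansi zzz extends [mscorlib]System.Object
{
    .method public hidebysig static void vijay() cil managed
    {
        .entrypoint
        .maxstack 1    // at most one value is on the evaluation stack at a time
        ldstr "hi"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```
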
Let us review what we have done so far, step by step, via a different approach:
(a) We started with the smallest C# program that we could write. This program was called a.cs
and contained the following code:

(b) Then we ran the C# compiler using the following command:

Therefore, the exe file called a.exe got created.
(c) On the executable, we ran a program called ildasm, provided by Microsoft:

This created a text file a.txt with the following contents:

When you read this file, you will realize that all of it has been explained earlier. We
started out with a simple C# program and compiled it into an executable file. Under
normal circumstances, it would have been converted into the machine language or assembler of
the microprocessor that the program runs on. Once the executable is created,
we disassemble it using ildasm. The disassembled output is saved in a new file a.txt. This file
could have been named a.il, and we could then have reversed gear by running ilasm on it to
create the executable again.

 //  Microsoft (R) .NET Framework IL Disassembler.  Version 4.6.1055.0

 // Metadata version: v4.0.30319
 .assembly extern mscorlib
 {
      E0  )                         // .z\V.4..
   :::
 }
 .assembly a
 {
           )
              4E 6F 6E     // ....T..WrapNonEx
              6F 6E    6F    )       // ceptionThrows.

   // --- The following custom attribute is added automatically, do not uncomment -------
   //  .custom instance void [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(valuetype [mscorlib]System.Diagnostics.DebuggableAttribute/DebuggingModes) = ( 01 00 07 01 00 00 00 00 ) 

   .hash algorithm 0x00008004
   :::
 }
 .module a.exe
 // MVID: {D65B3A6D-7D07-4C89-AB25-0B869EAF338C}
 .imagebase 0x00400000
 .file alignment 0x00000200
 .stackreserve 0x00100000
 .subsystem 0x0003       // WINDOWS_CUI
 .corflags 0x00000001    //  ILONLY
 // Image base: 0x02780000

 // =============== CLASS MEMBERS DECLARATION ===================

 .class private auto ansi beforefieldinit zzz
        extends [mscorlib]System.Object
 {
   .method public hidebysig static void  Main() cil managed
   {
     .entrypoint
     // Code size       13 (0xd)

     IL_0000:  nop
     IL_0001:  ldstr      "hi"
     IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
     IL_000b:  nop
     IL_000c:  ret
   } // end of method zzz::Main

   .method public hidebysig specialname rtspecialname
           instance void  .ctor() cil managed
   {
     // Code size       8 (0x8)

     IL_0000:  ldarg.0
     IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
     IL_0006:  nop
     IL_0007:  ret
   } // end of method zzz::.ctor

 } // end of class zzz

 // =============================================================

 // *********** DISASSEMBLY COMPLETE ***********************
 // WARNING: Created Win32 resource file a.res

Let us take a look at the smallest VB.NET program. We have named it as one.vb and its source
code is as follows:

After writing the above code, we run the Visual Basic .NET compiler, vbc, as:

This produces the file one.exe.
Next we execute ildasm as follows:

This produces the following file a.txt:

You would be amazed to see that the outputs produced by two different compilers are almost
identical. We have shown you this example to demonstrate that, irrespective of the language
you use, ultimately, the source code will get converted to IL code. Whether we use VB.NET or
C#, the same WriteLine function gets called.
Thus, the differences between programming languages have now become a superficial issue. The
endless debate over which language is superior has finally been put to rest. IL has
created a situation where programmers are free to use the programming language of their
choice.
Let us now demystify the code given above.
Every VB.NET program needs to be included in a module. We’ve called it modmain. All
modules in Visual Basic have to end with the keyword End, hence we see End Module. This is
where the syntax of VB differs from that of C#, which does not understand modules.
In VB.NET, functions are known as sub-routines. We need a sub-routine to mark the starting
point of program execution. This sub-routine is called Main.
The VB.NET code not only refers to mscorlib.dll, but also uses the file
Microsoft.VisualBasic.
A class called _vbProject is created in IL, as the class name is not mandatory in VB.
The function called _main is the starting sub-routine to be called, as it has the entrypoint
directive. Its name is preceded by a leading underscore. These names are chosen by the VB
compiler that generates the IL code.
This function is passed an array of strings as a parameter. It has a custom directive that deals
with the concept of metadata.
Next, we have the full prototype of the function, ending with an optional series of bytes. These
bytes are part of the metadata specifications.
The module modmain gets converted into a class having the same name. This class also has the
same directive .custom as before and a function called Main. The function uses a directive
called .locals to create a variable on the stack that can only be used within the method. This
variable exists only for the duration of the execution of the method and dies when the method
stops running.

Fields are also stored in memory, but it takes longer to allocate memory for them. The
word init signifies that on creation, these variables should be initialized to their default values.
The default values depend upon the type of the variable. Numbers are always initialized to the
value zero. The word init is followed by the data type of the variable and finally by its name.
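
A minimal sketch of a local variable declared this way. The variable name i and the surrounding file are illustrative assumptions, not taken from the VB.NET disassembly, which is not reproduced here.

```il
.assembly extern mscorlib {}
.assembly mukhi {}

.method static void vijay()
{
    .entrypoint
    .locals init (int32 i)   // one local of type int32, named i, set to its default value
    ldloc.0                  // push the value of local number 0 (i) onto the stack
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

Because of init, the value displayed would be expected to be 0, in line with the statement that numbers are initialized to zero.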
