C# module initializer injection
Creating a module initializer in .NET
This article will cover the process, techniques and code required to automatically resolve an embedded library in C# and inject a module initializer into the compiled assembly using IL weaving.
A technical blog post indeed, but the terms and concepts will be explained in depth so by the end of this read, you'll be weaving assemblies like a spider.
What's the officer, problem?
Say you have a utilities library that contains custom helper classes, extensions, native methods and general utility classes to support your application. That library might require functionality from other, third-party libraries to perform its tricks. For example, in your library you may have a custom compression wrapper that references a third-party library for its compression algorithm (like SharpZipLib or Protobuf-Net). Another example is a class for manipulating firewall rules that requires a .dll that may not always be present on the Windows versions you need to deploy to (yes, I'm looking at you, NetFwTypeLib.dll!).
One way to tackle this is to deploy these third-party dependencies along with your application. However, this is not always an option, and it would require changing the deployment process for every change in your library's dependencies. Moreover, you may end up dragging along a Christmas tree of third-party libraries. An alternative would be to embed the third-party library file in the compiled utilities assembly and unpack this embedded resource when it's needed at runtime. This sounds like a great idea!
So how would this unpack mechanism work? Would each utilities class with a third-party dependency require an 'Initialize' method to unpack it? That would work, but it is unfriendly to the caller, who would need an extra method call to achieve what they want. Adding a static constructor to each class to unpack dependencies before any calls are made might work as well, but it would be better and more generic to let the .NET runtime figure it all out with a little help.
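To make the static-constructor alternative concrete, here is a minimal sketch of what it could look like. All names (DependencyBootstrapper, CompressionWrapper, FirewallRules) are hypothetical and only illustrate the idea; the actual unpacking is left as a comment.

```csharp
using System.Threading;

// Hypothetical shared bootstrapper; the names are illustrative, not from the original project.
public static class DependencyBootstrapper
{
    private static int _initialized;

    // Number of times the unpack logic actually ran (should never exceed 1).
    public static int RunCount;

    public static void EnsureInitialized()
    {
        // Only the first caller gets 0 back and performs the work.
        if (Interlocked.Exchange(ref _initialized, 1) != 0)
            return;

        RunCount++;
        // Unpacking the embedded dependencies would go here.
    }
}

// Every class with a third-party dependency has to repeat the same static constructor.
public class CompressionWrapper
{
    static CompressionWrapper() { DependencyBootstrapper.EnsureInitialized(); }
    public string Name { get { return "compression"; } }
}

public class FirewallRules
{
    static FirewallRules() { DependencyBootstrapper.EnsureInitialized(); }
    public string Name { get { return "firewall"; } }
}
```

The sketch works, but it shows the drawback too: each new class has to remember to add the static constructor, which is exactly the kind of repetition a single module-level hook avoids.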
From C# to native machine code
In order to fully understand the basic components involved in what we're trying to achieve, let's go through the process of converting C# code into native machine code.
The image below depicts, on a high-level, how the contents of a C#
project are compiled and handled by the .NET Framework at runtime.
The compiler takes all C# source files, resources and references and compiles them into Microsoft Intermediate Language (MSIL or IL). MSIL is code that is independent of the hardware and the operating system. This compilation process results in a managed assembly (.exe or .dll file).
An assembly is called 'managed' because it is managed by the Common Language Runtime (CLR or .NET runtime) when it is executed. The CLR uses the .NET Framework Class Libraries (FCL) and the Just-in-Time (JIT) compiler to convert it to machine code that is specific, or native, to the operating system it's running on. Of course, much more is involved, but this is the process in a nutshell just big enough to contain this nut.
The following chapters go into the details for each step required to create a module initializer to resolve embedded assemblies:
- 1. Embedding a resource into an assembly
- 2. Resolving an assembly
- 3. Resolving an embedded assembly
- 4. Adding a module initializer
Note that all code and the full proof of concept can be downloaded from my GitHub project.
1. Embedding a resource into an assembly
The preparation for embedding a resource is done in the Visual C# project, while the actual embedding is performed by the compiler. After embedding a resource into the managed assembly, the .dll or .exe will be a little bigger in size. In this step, we want to embed an assembly in order to be able to resolve it at runtime later.
To add a .dll to your project as an embedded resource, follow the steps below:
- In Visual Studio, right-click the project name, click 'Add', and then click 'Add Existing Item'.
- Browse to the location of the .dll to add to your project, select it and click 'Add'.
- Right-click the newly added file and select 'Properties'.
- In the Properties dialog box, locate the 'Build Action' property. By default, this property is set to 'Content'. Click the property and change it to 'Embedded Resource'.
For my 'Coen.Utilities' project, I added the 'SharpZipLib.dll' and placed it inside a folder called 'Embedded':
Compile the project and notice the output assembly file's size has
increased as it now contains the included file as well. You can also use
ildasm.exe (MSIL Disassembler, shipped with Visual Studio) to check the Manifest of your assembly.
My 'Coen.Utilities' assembly grew by 200,704 bytes after embedding SharpZipLib.dll. The manifest below shows it is indeed part of my assembly and has a length of 0x00031000, which is 200,704 in decimal.
Note that the path of the project folder in which this file is stored is appended to the assembly's default namespace. The result is the namespace under which the embedded resource can be found at runtime. In my 'Coen.Utilities' assembly, which has that same default namespace, an embedded resource in a project folder called 'Embedded' can be found in the namespace 'Coen.Utilities.Embedded'.
The screenshot above of the manifest proves this as well. It's easy
enough, but important when we build the embedded resource resolver later
on.
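If ildasm is not at hand, the resource names can also be checked from code. The following is a minimal sketch that lists every manifest resource of an assembly together with its length in bytes; the class name EmbeddedResourceLister is my own invention and not part of the proof of concept.

```csharp
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper for inspecting the embedded resources of an assembly.
public static class EmbeddedResourceLister
{
    // Returns resource name -> length in bytes (-1 if the stream cannot be opened).
    public static IDictionary<string, long> Describe(Assembly assembly)
    {
        var result = new Dictionary<string, long>();

        foreach (var name in assembly.GetManifestResourceNames())
        {
            using (var stream = assembly.GetManifestResourceStream(name))
            {
                result[name] = stream != null ? stream.Length : -1;
            }
        }

        return result;
    }
}
```

Calling `EmbeddedResourceLister.Describe(Assembly.GetExecutingAssembly())` from inside the utilities assembly should, under the setup described above, show an entry whose name starts with 'Coen.Utilities.Embedded.'.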
2. Resolving an assembly
Assembly resolution is done by the CLR at runtime and is nothing more than finding the required assemblies when they're needed. Whenever the CLR cannot find an assembly in its usual probing paths, the AssemblyResolve event is raised. By handling this event, the application can help the CLR locate and load an assembly, or a specific version of it, from exotic places like network locations or ... from an embedded resource.
Before we get into resolving embedded assemblies, let's see how a straight-forward runtime assembly resolving is done by .NET via the System.AppDomain.AssemblyResolve event.
Consider the following code, which is simplified for the purpose of explaining the mechanism:
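    // Requires: using System; using System.IO; using System.Reflection;
    static void Main()
    {
        System.AppDomain.CurrentDomain.AssemblyResolve += CurrentDomainAssemblyResolve;

        var myClass = new MyClass();
        myClass.DoSomething();
    }

    private static Assembly CurrentDomainAssemblyResolve(object sender, ResolveEventArgs args)
    {
        // args.Name is the full display name; AssemblyName.Name is the simple name
        // without a file extension, so the extension is added here.
        var fileName = new AssemblyName(args.Name).Name + ".dll";
        var filePath = Path.Combine(AssemblyVault, fileName);

        if (File.Exists(filePath))
            return Assembly.LoadFile(filePath);

        return null;
    }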
- Main subscribes to the AssemblyResolve event with 'CurrentDomainAssemblyResolve' as the handler. This handler is called whenever the AssemblyResolve event is raised.
- The two myClass lines just suggest that myClass does something to trigger the resolve event.
- The first statement of the handler derives the file name from the assembly name inside the ResolveEventArgs parameter.
- The next statement combines the file name with a custom assembly location (the AssemblyVault constant), which makes up the full file path of the assembly being resolved.
- The handler then checks whether this assembly file actually exists and, if so, loads and returns it.
- If the handler was not able to locate the assembly, it returns null.
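To see the event fire without any real dependency involved, the following sketch asks the runtime for an assembly that does not exist and records what the handler is asked to resolve. The class and method names are hypothetical, and the assembly name passed in is assumed not to exist on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

// Hypothetical demo: observes which assembly names reach the AssemblyResolve handler.
public static class ResolveEventDemo
{
    public static List<string> ProbeMissingAssembly(string assemblyName)
    {
        var requested = new List<string>();

        ResolveEventHandler handler = (sender, args) =>
        {
            // Record the simple name the runtime could not find on its own.
            requested.Add(new AssemblyName(args.Name).Name);
            return null; // We cannot help either, so the load will fail.
        };

        AppDomain.CurrentDomain.AssemblyResolve += handler;
        try
        {
            Assembly.Load(assemblyName);
        }
        catch (FileNotFoundException)
        {
            // Expected here: neither the probing paths nor the handler supplied the assembly.
        }
        finally
        {
            AppDomain.CurrentDomain.AssemblyResolve -= handler;
        }

        return requested;
    }
}
```

Returning null from the handler tells the runtime that this handler has nothing to offer, after which the original load fails as it would have without the subscription.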
So far so good, now the embedded part.
3. Resolving an embedded assembly
As described in the previous chapter, in order to get the resolving of embedded assemblies working, we need to subscribe to the AssemblyResolve event that is raised by the appdomain of the current thread (System.AppDomain.CurrentDomain).
The mechanism for resolving an embedded resource differs only in the way the Assembly is loaded. Consider the extended version of the CurrentDomainAssemblyResolve method from the previous chapter:
    // Requires: using System; using System.Linq; using System.Reflection;
    private static Assembly CurrentDomainAssemblyResolve(object sender, ResolveEventArgs args)
    {
        // Return the assembly if it has already been loaded into this appdomain
        var assembly = System.AppDomain.CurrentDomain.GetAssemblies().FirstOrDefault(a => a.FullName == args.Name);
        if (assembly != null)
            return assembly;

        using (var stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(GetResourceName(args.Name)))
        {
            if (stream != null)
            {
                // Read the raw assembly from the resource
                var buffer = new byte[stream.Length];
                stream.Read(buffer, 0, buffer.Length);

                // Load the assembly
                return Assembly.Load(buffer);
            }
        }

        return null;
    }

    private static string GetResourceName(string assemblyName)
    {
        var name = new AssemblyName(assemblyName).Name;
        return $"{EmbeddedResourceNamespace}.{name}.dll";
    }
- The lookup in AppDomain.CurrentDomain.GetAssemblies() prevents the assembly from being loaded twice.
- GetManifestResourceStream opens the stream of the resource whose name is calculated by the GetResourceName method. In other words, this is the stream containing our embedded resource.
- The buffer allocation and the stream.Read call read the entire resource stream into a buffer.
- Assembly.Load(buffer) loads the assembly from the buffer and returns it.
- GetResourceName constructs the name of the resource, which consists of the 'root' namespace for embedded resources (like 'Coen.Utilities.Embedded'), the name of the resource (like 'ICSharpCode.SharpZipLib') and the .dll extension.
Note again that this code snippet is simplified for explanation purposes. If you make your own AssemblyResolve handler, make sure to do proper exception handling.
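As one example of hardening the handler, the sketch below isolates two pieces that are easy to test on their own: the mapping from an assembly's display name to a resource name, and reading a stream to the end without relying on a single Read call returning everything. The class name and the parameterized namespace are my own assumptions, not the proof of concept's API.

```csharp
using System.IO;
using System.Reflection;

// Hypothetical helpers; not part of the original proof of concept.
public static class EmbeddedAssemblyHelpers
{
    // Maps "Some.Library, Version=..., Culture=..., PublicKeyToken=..."
    // to "<resourceNamespace>.Some.Library.dll".
    public static string GetResourceName(string resourceNamespace, string assemblyFullName)
    {
        var simpleName = new AssemblyName(assemblyFullName).Name;
        return resourceNamespace + "." + simpleName + ".dll";
    }

    // Copies the whole stream into memory; Stream.Read alone is allowed
    // to return fewer bytes than requested.
    public static byte[] ReadFully(Stream stream)
    {
        using (var memory = new MemoryStream())
        {
            stream.CopyTo(memory);
            return memory.ToArray();
        }
    }
}
```

With these in place, the handler could call `Assembly.Load(EmbeddedAssemblyHelpers.ReadFully(stream))` inside a try/catch and return null when loading fails, so a broken resource does not take down the whole resolve chain.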
The best time to subscribe to this event is as soon as possible, before any other calls are made. Doing this manually, by forcing users to call an 'Initialize' method before they can use the assembly or a specific class, is not very user-friendly. It is better to subscribe to the AssemblyResolve event as soon as the assembly containing the embedded resource is loaded, using the module initializer described in the next chapter.
4. Adding a module initializer
A module initializer can be seen as a constructor for an assembly (technically it is a constructor for a module; each .NET assembly is comprised of one or more modules, typically just one). It is run when a module is loaded for the first time, and is guaranteed to run before any other code in the module runs, before any type initializers, static constructors or any other initialization code.
Why can't I just use C# to do this?
Module initializers are a feature of the CLR that was not available in C# when this article was written (C# 9.0 later added the ModuleInitializer attribute for this purpose). C# could not do this because it puts everything in a class/struct and module initializers need to be globally defined. This is why we will be injecting the module initializer into the MSIL after the C# code is compiled, a technique called IL weaving.
Which library to use?
So, there are a few ways to create a module initializer. One way is to use Fody, the IL weaving library by Simon Cropp. This is a great library and definitely worth checking out.
However, for fun and to learn more about this technique, we're going to go a little deeper and do it ourselves. For this we use the Mono.Cecil library, which Fody itself uses as well.
Mono.Cecil
Cecil is an awesome project written by Jb Evain
to generate and inspect programs and libraries in the ECMA CIL format.
With Cecil, it is possible to load existing managed assemblies, browse
all the contained types, modify them on the fly and save the modified
assembly back to the disk ... Which is exactly what we need.
The trick
For the assembly containing the embedded resource we are going to create an internal class called ModuleInitializer, which will contain a public method called Initialize(). This Initialize method subscribes to the AssemblyResolve event:
    internal class ModuleInitializer
    {
        public static void Initialize()
        {
            System.AppDomain.CurrentDomain.AssemblyResolve += CurrentDomainAssemblyResolve;
        }

        private static Assembly CurrentDomainAssemblyResolve(object sender, ResolveEventArgs args)
        {
            // Embedded resolve magic here (see the handler from the previous chapter)
            return null;
        }
    }
Nothing fancy here. Note that in this code snippet, the ModuleInitializer class is internal, so it cannot be called from outside the assembly. We don't want any other calls made to this class other than our module initializer. Another important thing to note is that the public Initialize() method is static and has no parameters. This is a requirement for using this technique and will be explained further on.
The trick consists of a few steps:
- Read the compiled assembly.
- Find the Initialize() method in the compiled assembly.
- Create an assembly constructor using Mono.Cecil and make it call the Initialize() method.
- Inject the constructor into the assembly.
- Save the assembly and rewrite the program database (.pdb) to match the new assembly structure.
Notes:
We need to take into account that we may want the assembly to be strong named, which is why the final save-step will also take into account the key to sign the assembly.
Since injecting the module initializer into the assembly must be done
in the MSIL, obviously this process needs to be a post-build step. I
created a console application for this, so I can easily add it as a
post-build event for any assembly I want to use this technique for.
The implementation
Let's check out the code that makes this all happen. It consists of a public main Inject() method calling other private methods that will be described further on.
When explaining the code, I will leave out any boiler-plate code
required for this console application to work. If you want to check it
out in its full glory, check out my GitHub project.
Inject()
Consider the following code of the method that will be the main entry
for the console application after command line arguments have been
parsed:
    private AssemblyDefinition InjectionTargetAssembly { get; set; }

    public void Inject(string injectionTargetAssemblyPath, string keyfile = null)
    {
        // Validate the preconditions
        // - Does the injectionTargetAssemblyPath exist?
        // - If the keyfile is provided, does the file exist?
        try
        {
            // Read the injectionTarget
            ReadInjectionTargetAssembly(injectionTargetAssemblyPath);

            // Get a reference to the initializerMethod
            var initializerMethod = GetModuleInitializerMethod();

            // Inject the Initializermethod into the assembly as a constructormethod
            InjectInitializer(initializerMethod);

            // Rewrite the assembly
            WriteAssembly(injectionTargetAssemblyPath, keyfile);
        }
        catch (Exception ex)
        {
            throw new InjectionException(ex.Message, ex);
        }
    }
- The InjectionTargetAssembly property stores the AssemblyDefinition after the assembly has been read from file. AssemblyDefinition is a type defined in the Cecil library.
- ReadInjectionTargetAssembly() reads the AssemblyDefinition from disk and stores it in the InjectionTargetAssembly property.
- GetModuleInitializerMethod() locates the Initialize() method in the assembly and returns it.
- InjectInitializer() creates the constructor and makes it call the Initialize() method.
- WriteAssembly() rewrites the assembly.
This is the main structure of the Inject method. In the following paragraphs, each of the important calls will be further explained:
ReadInjectionTargetAssembly()
The ReadInjectionTargetAssembly-method reads the assembly in which the constructor should be injected. It also reads the Program Database (.pdb) file, if it could be located, in order to restructure its contents with respect to the changes made to the assembly after injecting the constructor.
    private void ReadInjectionTargetAssembly(string assemblyFile)
    {
        var readParams = new ReaderParameters(ReadingMode.Immediate);

        if (GetPdbFilePath(assemblyFile) != null)
        {
            readParams.ReadSymbols = true;
            readParams.SymbolReaderProvider = new PdbReaderProvider();
        }

        InjectionTargetAssembly = AssemblyDefinition.ReadAssembly(assemblyFile, readParams);
    }
- The ReaderParameters instance defines the parameters for the read action. We don't want to defer reading to a later time, so the ReadingMode is set to Immediate.
- The GetPdbFilePath check determines whether a .pdb file is present.
- If so, the two assignments inside the if block configure the reader parameters to enable reading the symbols.
- The final statement reads the assembly using the configured reader parameters and stores the AssemblyDefinition in the private property 'InjectionTargetAssembly' so it can be accessed in later stages.
GetModuleInitializerMethod()
After the target assembly was read, the GetModuleInitializerMethod is called, which locates the 'Initialize' method that should be called by the injected constructor. After it has been located, some validation is done to ensure the call to this method can actually be made.
Note that in the following snippet the className and methodName are provided as parameters. In my proof of concept, I retrieved these via command line parameters of the injector program. They correspond with the classname/methodname of the target assembly's module initializer class and method as defined in the previous subchapter The Trick.
    private MethodReference GetModuleInitializerMethod(string className, string methodName)
    {
        if (InjectionTargetAssembly == null)
        {
            throw new InjectionException("Unable to determine ModuleInitializer: InjectionTargetAssembly is null");
        }

        // Retrieve the ModuleInitializer Class
        var moduleInitializerClass = InjectionTargetAssembly.MainModule.Types.FirstOrDefault(t => t.Name == className);
        if (moduleInitializerClass == null)
        {
            throw new InjectionException($"No type found named '{className}'");
        }

        // Retrieve the ModuleInitializer method
        var resultMethod = moduleInitializerClass.Methods.FirstOrDefault(m => m.Name == methodName);
        if (resultMethod == null)
        {
            throw new InjectionException($"No method named '{methodName}' exists in the type '{moduleInitializerClass.FullName}'");
        }

        // Validate the found method
        if (resultMethod.Parameters.Count > 0)
        {
            throw new InjectionException("Module initializer method must not have any parameters");
        }

        // Initialize method cannot be private or protected
        if (resultMethod.IsPrivate || resultMethod.IsFamily)
        {
            throw new InjectionException("Module initializer method may not be private or protected, use public or internal instead");
        }

        // Return type must be void
        if (!resultMethod.ReturnType.FullName.Equals(typeof(void).FullName))
        {
            throw new InjectionException("Module initializer method must have 'void' as return type");
        }

        // Method must be static
        if (!resultMethod.IsStatic)
        {
            throw new InjectionException("Module initializer method must be static");
        }

        return resultMethod;
    }
- The method signature shows that it returns a MethodReference, which references the details of the module Initialize method inside the assembly.
- The first lookup attempts to find the initializer class in the assembly's main module.
- The second lookup attempts to find the initialize method in the initializer class that was just found.
- The remaining checks make sure the initialize method has the required features:
  - It should be parameterless
  - It must be public or internal (preferably internal)
  - Its return type should be void
  - It must be static
- A special note on the return type check: the void-type comparison is made using the full names because we don't want to compare the types themselves. The void type in the current CLR may differ from the one for which the target assembly was created, which would result in a false negative.
Now that the assembly has been read and the method to be called by the constructor has been located, we're ready to modify the assembly.
InjectInitializer()
This method is where the magic happens.
    private void InjectInitializer(MethodReference initializer)
    {
        if (initializer == null)
        {
            throw new ArgumentNullException(nameof(initializer));
        }

        const MethodAttributes Attributes = MethodAttributes.Static | MethodAttributes.SpecialName | MethodAttributes.RTSpecialName;

        // Note: in Mono.Cecil 0.10 and later, Import was renamed to ImportReference
        var initializerReturnType = InjectionTargetAssembly.MainModule.Import(initializer.ReturnType);

        // Create a new method .cctor (a static constructor) inside the Assembly
        var cctor = new MethodDefinition(".cctor", Attributes, initializerReturnType);
        var il = cctor.Body.GetILProcessor();
        il.Append(il.Create(OpCodes.Call, initializer));
        il.Append(il.Create(OpCodes.Ret));

        var moduleClass = InjectionTargetAssembly.MainModule.Types.FirstOrDefault(t => t.Name == "<Module>");
        if (moduleClass == null)
        {
            throw new InjectionException("No module class found");
        }

        moduleClass.Methods.Add(cctor);
    }
- The Attributes constant defines the attributes of the static constructor method that we inject into the assembly. Module initializers should have the following attributes:
  - Static - Indicates that the method is defined on the type; otherwise, it is defined per instance, which is something we don't want in this case.
  - SpecialName - Indicates that the method is special. The name describes how this method is special.
  - RTSpecialName - Indicates that the common language runtime checks the name encoding.
- The initializerReturnType variable determines the return type of the static constructor by hijacking it from the initializer method's return type. This will always be void (we validated it before). The reason we don't use typeof(void) here is the same as why we didn't compare void types before (the return type check in the GetModuleInitializerMethod() method): the void type of the target assembly may be different from the one in the current CLR.
- The MethodDefinition constructor call creates the method definition for the static constructor (or .cctor) with the given attributes and proper return type.
- GetILProcessor() returns an ILProcessor object which can be used to further modify the method body and inject whatever IL code we want to add.
- The first Append adds a call to the Initialize method to the method body.
- The second Append adds a 'ret' instruction, which exits the static constructor. In general, ret pushes a return value (if present) from the callee's evaluation stack onto the caller's evaluation stack; here there is none, because the constructor returns void.
- The lookup for "<Module>" gets the special module type in the main module. This is the type to which we want to add our new static constructor method.
- The final statement adds the new static constructor to that module type.
All of the above is still done in memory; the assembly file on disk has not been changed yet. This is what the last method in this sequence is for.
WriteAssembly()
The WriteAssembly method saves the changes to the assembly and modifies the .pdb file with respect to these changes. It is quite similar to the ReadInjectionTargetAssembly method we defined before.
    private void WriteAssembly(string assemblyFile, string keyfile)
    {
        if (InjectionTargetAssembly == null)
        {
            throw new InjectionException("Unable to write the Injection TargetAssembly: InjectionTargetAssembly is null");
        }

        var writeParams = new WriterParameters();

        if (GetPdbFilePath(assemblyFile) != null)
        {
            writeParams.WriteSymbols = true;
            writeParams.SymbolWriterProvider = new PdbWriterProvider();
        }

        if (keyfile != null)
        {
            writeParams.StrongNameKeyPair = new StrongNameKeyPair(File.ReadAllBytes(keyfile));
        }

        InjectionTargetAssembly.Write(assemblyFile, writeParams);
    }
- The WriterParameters instance is the class we use to configure the write process.
- The GetPdbFilePath check determines whether a .pdb file is present. If so, the two assignments inside the if block configure the writer to output to the .pdb file as well.
- If a keyfile was provided, its contents are read and used to generate a strong-named assembly.
- The final statement writes the changes to the assembly using the configured parameters.
Source code
All of the code in this article can be found in my GitHub project. It contains 3 projects in 2 solutions:
- [Solution] Coen.Utilities
  - Coen.Utilities - This is the project in which a third-party assembly (SharpZipLib) is embedded and resolved. The library itself contains a class that makes a simple call to SharpZipLib, forcing the Coen.Utilities library to resolve it at runtime.
  - TestApplication - This project calls the Coen.Utilities library to test whether the embedded assembly could be resolved.
- [Solution] Injector
  - Injector - This is the code for the tool that modifies the IL to create a module initializer for an assembly.
Closing thoughts
This may not be a solution to an everyday problem. However, I found it very useful and, once in place, a very elegant way to handle this problem. Credits to Einar Egilsson, on whose work this code was built. Moreover, IL weaving is an interesting technique and Mono.Cecil is a powerful library that can be used to do much more than just create module initializers.