/BASE (Base Address)

https://msdn.microsoft.com/en-us/library/f7f5138s.aspx

Need for Rebasing a DLL(good)

https://www.codeproject.com/Articles/9426/Need-for-Rebasing-a-DLL

Introduction

This article explains the need for Rebasing a DLL to improve performance during application startup using more than one DLL. It covers how to change compiler settings to rebase a DLL and also the use of /FIXED switch in rebasing.

Every executable and DLL module has a preferred base address, which identifies the ideal memory address where the module should get mapped into a process' address space. When you build an executable module, the linker sets the module's preferred base address to 0x00400000. For a DLL module, the linker sets a preferred base address of 0x10000000. Using Visual Studio's DumpBin utility (with the /headers switch), you can see an image's preferred base address.

Go to command line and use the command Dumpbin /headers exename.exe.

Or use Visual Studio's Depends (Dependency Walker) utility and click an EXE, you will get the information of all DLLs and base addresses where they are loaded.

When this executable module is invoked, the operating system loader creates a virtual address for the new process. Then the loader maps the executable module at memory address 0x00400000 and the DLL module at 0x10000000. Why is this preferred base address so important? Let's look at this code:

Using the code

Let's look at a simple piece of code. I am initializing an integer i in a function.

int i;

void Func();

{

    int i = 5; // This is the important line.

}

When the compiler processes the Func function, the compiler and linker produce machine code that looks something like this:

MOV   [0x10014540], 5

In other words, the compiler and linker have created machine code that is actually hard-coded in the address of the "i" variable, i.e., 0x10014540. This memory address is absolutely correct as long as the DLL does in fact load at its preferred base address.

OK, now let's say that you're designing an application that requires two DLLs. By default, the linker sets the .exemodule's preferred base address to 0x00400000 and the linker sets the preferred base address for both DLLs to 0x10000000. If you attempt to run the .exe, the loader creates the virtual address space and maps the .exemodule at the 0x00400000 memory address. Then the loader maps the first DLL to the 0x10000000 memory address. But now, when the loader attempts to map the second DLL into the process' address space, it can't possibly map it at the module's preferred base address. It must relocate the DLL module, placing it somewhere else.

Below are the dependencies for a test EXE using DLL1 and DLL2 without rebasing. As seen, both DLLs have the same base address and only one will be loaded at that address and the other needs to be reallocated.

Relocating an executable (or DLL) module is an absolutely horrible process, and you should take measures to avoid it. Let's see why. Suppose that the loader relocates the second DLL to address 0x20000000. In that case, the code that changes the "i" variable to 5 should be:

MOV   [0x20014540], 5

But the code in the file's image looks like this:

MOV   [0x10014540], 5

If the code from the file's image without changing the address is allowed to execute, some 4-byte value in the first DLL module will be overwritten with the value 5. This can't possibly be allowed. The loader must somehow fix this code. When the linker builds your module, it embeds a relocation section in the resulting file. If the loader can map a module at its preferred base address, the module's relocation section is never accessed by the system. This is certainly what we want—you never want the relocation section to be used because of the below reasons.

If the module cannot be mapped at its preferred base address, the loader opens the module's relocation section and iterates though all the entries. For each entry found, the loader goes to the page of storage that contains the machine code instruction to be modified. It then grabs the memory address that the machine instruction is currently using and adds to the address the difference between the module's preferred base address and the address where the module actually got mapped.

So, in the example above, the second DLL was mapped at 0x20000000, but its preferred base address is 0x10000000. This yields a difference of 0x10000000, which is then added to the address in the machine code instruction, giving us this:

MOV   [0x20014540], 5

To avoid this, instead change the settings while compilation so as to give different base addresses during compilation itself.

Figure below shows how to achieve it:

Below are the dependencies for a test EXE using DLL1 and DLL2 with rebasing. As seen, both DLLs have different base addresses (DLL1: 0x10000000 and DLL2: 0x20000000) and will be loaded properly. If for some reason, it cannot be loaded at the other address specified, then it has to reallocate the DLL and the above process is carried.

Now this code in the second DLL will reference its "i" variable correctly.

There are two major drawbacks when a module cannot load at its preferred base address:

The loader has to iterate through the relocation section and modify a lot of the module's code. This produces a major performance hit and can really hurt an application's initialization time.
As the loader writes to the module's code pages, the system's copy-on-write mechanism forces these pages to be backed by the system's paging file.

The second point above is truly bad. It means that the module's code pages can no longer be discarded and reloaded from the module's file image on disk. Instead, the pages are swapped to and from the system's paging file as necessary. This hurts performance too. But wait, it gets worse. Since the paging file backs all of the module's code pages, the system has less storage available for all processes running in the system.

By the way, you can create an executable or DLL module that doesn't have a relocation section in it. You do this by passing the /FIXED switch to the linker when you build the module. Using this switch makes the module smaller in bytes but it means that the module cannot be relocated. If the module cannot load at its preferred base address, it cannot load at all. If the loader must relocate a module but no relocation section exists for the module, the loader kills the entire process and displays an "Abnormal Process Termination" message to the user.

In this case, I have made base address of both DLLs same, i.e., 2000000, and have a fixed switch so only one DLL will be loaded and the other cannot be at that location, and you get an error as shown:

Points of Interest

In this particular example, I have tried to cover all aspects, i.e., without rebasing, how to rebase, and how to rebase using /FIXED switch, what are the needs of rebasing, and drawbacks of rebasing using /FIXED. Suggestions for improvement are most welcome.

This was my first article, hope all of you liked it.

Acknowledgement and References

I would like to acknowledge author Mr. Jeffery Richter and his book on Windows OS, which is one of the best books to know about the Windows operating system internals. Parts of this article is taken from the book and examples were added to simplify things.

Rebasing Win32 DLLs

http://www.drdobbs.com/rebasing-win32-dlls/184416272?pgno=1

In this article, I discuss why “basing” a DLL is desirable and what it involves. Then I present a post-link utility, called Libase (for “library base”) to automate the procedure. Libase differs from the Platform SDK utility Rebase in that it chooses the new base address for the DLL based on a hash of the filename, instead of asking you to provide a base address explicitly.

Base Addresses and Rebasing

　　Every Win32 application loads in a private memory address space. The operating system makes it appear that each process has a linear address range that starts from zero. When one process reads from memory address 0x12345, it reads from an entirely different physical memory address than another process that reads from the same address. The operating system keeps the various logical memory spaces apart by implicitly using the segment registers (CS,DS, etc.) as a selector into a table that maps logical memory addresses to physical memory addresses. This is the way that “protected mode” works on the Intel processors.

　　In a process, the application (.exe) and all loadable components (mostly DLLs, but also “in-process” ActiveX servers — which in fact are DLLs) share the logical address space. If one DLL reads from address 0x12345, it reads from the same memory location as another DLL in the same process.

　　Applications and DLLs have functions and variables. Code in an application calls a function by “jumping” to the address at which the function starts. The starting address of the function was determined by the linker when it built the executable. The linker cannot just choose any address; it has to take into account areas of memory that the operating system has reserved. A function may not start at address 0x00000000, for example, and in Windows 9x it may also not exceed address 0x80000000.

　　This raises a problem for DLLs. When you link the DLL, the linker cannot know what linear address the DLL will have to actually load at for any given application (or even application invocation). That’s the “Dynamic Link” part of the DLL acronym. It is entirely possible, and even very likely these days, that an application loads two DLLs that both are “based” by the linker to load at the same starting address. (The base address set by the linker is the “preferred load address.”)

　　If your application needs to load a DLL whose preferred load address conflicts with memory that’s already in use (such as by a previously-loaded DLL that had the same preferred load address), the operating system “rebases” the conflicting DLL by loading it at a different address that does not overlap and then by adjusting all addresses. The physical format of a.dll file includes relocation information that points to, for example, the target addresses ofCALL and JMP instructions, and addresses that reference global/static variables (such as literal strings). All these addresses have to get revised if the operating system cannot load the DLL at its preferred load address.

　　This procedure, done at load time, is time consuming, of course, but it also increases the memory footprint that the DLL takes. For every loaded module, Windows creates a “section object,” a memory mapped file for the DLL. Whenever your application accesses memory that was swapped out, Windows reloads it from the section object. When the executable module was loaded at a different base address than its preferred base address, the image of the module in memory no longer matches the image of the module on disk, and, therefore, those portions of the module that contain relocations are swapped out to the system pagefile. In summary, if a module loads at its preferred base address, it is not copied to the pagefile; if a module is rebased, nearly all of the code section and some of the data section of the DLL is copied to the pagefile at load time.

　　Basing a DLL means to explicitly select a preferred load address when you link it — hopefully selecting one that will cause it to avoid memory locations used by the application or other DLLs. Two reasons why this is desirable are, as discussed above, to make the DLLs load faster and to reduce pagefile usage. A third reason, brought forward in John Robbins’ bookDebugging Applications is to be able to determine the module and the source code line (with the help of a .map file) when given a crash address. (If the DLL had to be rebased when it was loaded, the DLL’s .map file addresses will no longer reflect reality for that invocation of the DLL.)

　　By the way, on the “faster load time” issue, I should mention that when an executable module is unloaded, Windows puts its pages on a “standby” list, a kind of cache from where the module’s pages can be retrieved very efficiently when it is loaded again. So if you load a DLL for a second time, and its pages are still in the standby list, it will load a lot quicker than the first time.

　　Slow load times are most irritating for applications that you start frequently, such as compilers, and Windows’ standby list already deals adequately with this category. Still, some utilities should always start in a snap, even when run only occasionally. For example, when a screen saver launched while a colleague and I were intensively watching a simulation developing, we were both disturbed. But had it not taken three to five seconds to load (in the heat of the moment, we forgot to time it), I would not have disabled it immediately. If you produce screen savers, it is worth taking care of such trivial matters. There are many other examples of applications that you will want to launch quickly from the first time on, from right-click context menu extensions for Windows Explorer to optional macro/script engines in applications.

　　Base address conflicts are not the only cause of slow DLL load times, or even the most important one. Ruediger Asche (see the bibliography) gives a detailed report of load times for a set of DLLs before and after base address conflicts were resolved. However, resolving base address conflicts is so easy that there is hardly any reason not to do it.

To end this overview, I checked the base addresses of executable modules built by the compilers that I have:

Applications (.exe files) start at 0x00400000 for all compilers that I tested. These executable images are loaded first in a process, and they will never need to be relocated. (In fact, they sometimes do not even contain a .RELOC section — the part of a .exe or .dllthat contains detailed relocation information for rebasing.)
Microsoft Visual C/C++ places DLLs at address 0x10000000; this is the address that you will encounter most.
Microsoft Visual Basic places DLLs at address 0x11000000.
Borland C++, Watcom C/C++, and LCC-Win32 place DLLs at address 0x00400000, thereby guaranteeing a conflicting base address with the application.

Manual Rebasing（手动方式（其实也挺简单的））

The address range for an application that is not reserved by any version of Windows is from0x00400000 to 0x80000000. The system DLLs for Windows are currently based in memory from 0x70000000 to 0x78000000 on the Intel processors and from 0x68000000 to0x78000000 on the MIPS processors. Other standard DLLs (for OLE support) are apparently in the range 0x50000000 to 0x5f000000. When selecting base addresses for DLLs, Microsoft suggests that you select them from the top of the allowed address range downwards, in order to avoid conflicts with memory allocated dynamically by the application (which is allocated from the bottom up).

In conclusion, the most suitable address range for DLLs is from 0x60000000 through0x6f000000. Microsoft, seeking portability where it cannot be achieved, proposes to reduce the range further to 0x60000000 through 0x68000000 in order to accommodate both Intel and MIPS processors. (Also note that Microsoft’s upper limit overlaps the reserved range of the MIPS processor.) Microsoft’s proposal continues with a “first letter” scheme for the selection of the base address, which I have summarized in Table 1. In other words, you select a base address for your DLL based on the first letter of the DLL’s name and the addresses in Table 1.

After selecting a load address for a DLL, you have to tell it to the linker. Note again that applications (.exe files) do not need a base adjustment; they are the first executable module that the loader will load, and, therefore, they always load at the address that the linker has fixed them at. The linker options are:

With Watcom C/C++, add the “OP OFFSET=address” to the linker line (WLINK), where you replace “address” with the desired base address. You can use decimal or hexadecimal notation for this address. (Hexadecimal is in the same format as C/C++ literals, for example, “OP OFFSET=0x62000000”.)
With Borland C++, use the “-B:address” option (TLINK32); the value is in hexadecimal.
With Microsoft C/C++, use the option “-base:address”; the value is in hexadecimal.
Alternatively, you can use a post-link utility. For the Rebase utility, which comes with the Platform SDK, use the “-b address” option; the value is in hexadecimal.

Automatic Rebasing: Libase（自动方式）

The drawbacks of the manual rebasing scheme are that the table is difficult to memorize, and that choosing a base address only on the first letter is too simplistic. When I tried it on several somewhat larger projects that I take part in, conflicts arose so quickly that a “rolling the dice” scheme produced better results than the “first letter” proposal. Initially, I extended the scheme to take the first two letters into account (with the added rule that, if many filenames start with the same prefix, the “second letter” to select is the first letter in the filename behind that prefix). This worked in the sense that it resolved nearly all of the conflicts, but the procedure became even harder to know by heart, now requiring two tables instead of one. This called for an automatic solution. And while I was at it, why stop at considering only two letters of the filename?

Libase is a little post-link utility that I wrote that chooses a base address of a DLL (considering all letters in the filename) and rebases the DLL to that address. You do not need to add linker flags to use it; instead, Libase must run after the linker has finished. Libase is configurable via a .ini file; by default it uses an address range of 0x60000000 to0x6ff00000 (larger than the one proposed by Microsoft) with a step size of 0x00100000. The chosen range and step size allow for 256 different base addresses (instead of just nine with Microsoft’s proposal). The hash is adapted from the well-known hash function published inCompilers: Principles, Techniques and Tools by Aho, Sethi, and Ullman (page 435) as P. J. Weinberger’s algorithm for computing hash values. The source code for Libase is in libase.c(Listing 1), and libase.ini (Listing 2) contains a sample .ini file to control it.

In its default configuration, Libase disregards the case of characters in a filename. That is, the files mylib.dll and MYLIB.DLL are rebased to the same address. By setting “IgnoreCase” to “0” in the .ini file, Libase uses the case of the filename as stored on disk.

To use Libase, simply run it with the path to a DLL on the command line. Libase can rebase multiple DLLs in one invocation, but unlike the Platform SDK utility Rebase, it does not choose consecutive, non-overlapping addresses; Libase chooses the base address for each DLL from a hash of its filename. One added feature of Libase is that it keeps the addresses to which it has rebased all modules that it has seen in its .ini file. This allows you to check whether a collision has occurred and to which DLLs that collision applies.

The workhorse function of Libase is ReBaseImage(), which is exported by Microsoft’simagehlp.dll. The implementation of Libase is trivial for the remainder (except for the frustration of the SDK documentation for the ReBaseImage() function mismatching the prototype in imagehlp.h and conflicting with a comment in that header file).

Libase does not guarantee that a DLL gets a unique “preferred load” address; a base address collision may still occur; it is just less likely. The default “step size” assumes that no DLL is bigger than 1MB. You will get a warning for a DLL whose size exceeds the step size, because the report in the .ini file is then no longer accurate.

In closing, I would like to mention that to have a DLL load quickly, the first step is to make sure that Windows can locate it quickly. My advice is to keep implicitly loaded DLLs in the same directory as the application that uses them and to use a full path for DLLs that the application loads explicitly.

Bibliography

Ruediger R. Asche. “Rebasing Win32 DLLs: The Whole Story,” MSDN library, September 1995. This article does exhaustive tests on the load-time degradation of DLLs that must be rebased by the operating system at load time.

John Robbins. Debugging Applications (Microsoft Press, 2000), ISBN 0-7356-0886-5.

Modify the Base Addresses for a DLL Files Series

https://www.codeproject.com/articles/35829/modify-the-base-addresses-for-a-dll-files-series

　　The address range for an application that is not reserved by any version of Windows is from 0x00400000 to 0x80000000. The system DLLs for Windows are currently based in memory from 0x70000000 to 0x78000000 on the Intel processors and from 0x68000000 to 0x78000000 on the MIPS processors. Other standard DLLs (for OLE support) are apparently in the range 0x50000000 to 0x5f000000. When selecting base addresses for DLLs, Microsoft suggests that you select them from the top of the allowed address range downwards, in order to avoid conflicts with memory allocated dynamically by the application (which is allocated from the bottom up).

　　In conclusion, the most suitable address range for DLLs is from 0x60000000 through 0x6f000000. Microsoft, proposes to reduce the range further to 0x60000000 through 0x68000000 in order to accommodate with MIPS processors too,

Dll base address ?

http://stackoverflow.com/questions/7395447/dll-base-address

At runtime I use VMMap(是一个tool，能动态查看。也可以用walk dependency查看（不过只能静态查看）) to monitor the virtual address space and it reveals that the ngen'd Dlls are sitting within a consistant range of virtual memory

rebase的更多相关文章

Git 进阶指南（git ssh keys / reset / rebase / alias / tag / submodule ）
在掌握了基础的 Git 使用之后,可能会遇到一些常见的问题.以下是猫哥筛选总结的部分常见问题,分享给各位朋友,掌握了这些问题的中的要点之后,git 进阶也就完成了,它包含以下部分: 如何修改 ori ...
利用rebase来压缩多次提交
我们可以用Git merge –squash来将分支中多次提交合并到master后,只保留一次提交历史.但是有些提交到github远程仓库中的commit信息如何合并呢? 历史记录首先我们查看一下m ...
git rebase
git rebase -i HEAD~[number_of_commits] git rebase -i HEAD~2
git 开发merge rebase 记录
git status git lg git add src/ git commit -m "restful api and portal" //先commit到自己的本地branc ...
git commit之后未submit，rebase之后找不到自己代码的处理方法
今天使用sourceTree提交代码的时候,commit之后未submit,直接rebase主分支代码,完了发现自己本地做的修改都没了,且远程没有本地分支.google之后发现有一个简单方法可以恢复到 ...
git rebase与 git合并（error: failed to push some refs to）解决方法
1.遇到的问题本地有一个git仓库,在github上新建了一个空的仓库,但是更新了REWADME.md的信息,即在github上多了一个提交. 关联远程仓库,操作顺序如下: git remote a ...
聊下git pull --rebase
有一种场景是经常发生的. 大家都基于develop拉出分支进行并行开发,这里的分支可能是多到数十个.然后彼此在进行自己的逻辑编写,时间可能需要几天或者几周.在这期间你可能需要时不时的需要pull下远程 ...
聊下 git rebase -i
在使用git作为源代码管理工具的时候,开发的时经常会面临一个常见的问题,多个commit 需要合并为一个完整的commit提交. 在一个基本的迭代周期里,你会有很多次commit,有跟配置文件相关的, ...
[译]merge vs rebase
git rebase和git merge设计的初衷是解决相同的一件事, 即把一个分支合并到另外一个分支--只是他们两个处理的方式非常不一样. 当你在一个特定的分支开发新功能, 团队的其它成员在mast ...
[git]rebase和merge
转自:http://blog.csdn.net/wh_19910525/article/details/7554489 Git merge是用来合并两个分支的. git merge b # 将b分支合 ...

随机推荐

开发win8 metro monogame，显示pubcenter广告时会使游戏卡住的问题的解决方法。
开发win8 metro游戏,使用monogame,并且还加上pubcenter广告,但是在x64的机子上遇到了问题,广告一出,游戏卡死.经过一系列的判断,发现monogame一直在update和dr ...
JavaEE：XML解析
XML解析技术概述1.XML 技术主要企业应用1)存储和传输数据 2)作为框架的配置文件2.使用xml 存储和传输数据涉及到以下两点1)通过程序生成xml2)读取xml 中数据 ---- xml 解析 ...
自制AutoMapper实现DTO到持久层Entity的转换
自制AutoMapper实现DTO到持久层Entity的转换项目中经常涉及到页面DTO更新,保存到数据库的操作,这就必然牵扯到DTO和持久层对象的转换,常见的第三方库有: java:dozer .n ...
输入参数能动态调决定调用哪个实现类 spring的一个特性
今天做公司的以前项目的时候发现项目中有个特别好的东西,记录下来,分享一下发现spring有个这样的功能,我也不知道这个是东西应该怎么称呼,就是通过输入参数,动态决定调用接口的实现类.简单理解就是在s ...
嵌入式Linux学习(二)
嵌入式系统和通用计算机的主要区别嵌入式系统是指以应用为中心,以计算机技术为基础,软件硬件可裁剪,适应应用系统对功能.可靠性.成本.体积.功耗严格要求的专用计算机系统. 嵌入式系统主要由嵌入式微处理器 ...
MFC控件(7):Split Button
VS2008中可以看到MFC有一个叫Split Button的控件,要想看它的效果,瞧下QQ那聊天窗口的"发送", "消息记录"这两个按钮就知道了.实际上就是还 ...
hdu1043-素数回文
http://acm.hdu.edu.cn/showproblem.php?pid=1431 整体思想可以理解为打表,可以通过如下办法打表(但是相对比较麻烦),还可以直接使用数组,将所有数据直接存储进 ...
Swift 3.0 字符串、数组、字典的使用
1.字符串 string func stringTest() -> Void { // 字符串 let str1 = "yiyi" let str2 = "2222 ...
python基础语言以及if/while语句结构
接下来学会了变量:用简单的变量来代替复杂的字符串变量首字母不能是数字或者特殊符号~!@#￥等. 字符集的发展: ASCII 255个 1个占1bytes------>1980年 GB2312 ...
Oracle wm_concat（列转行函数）实际使用
接触到了一个开发需求.其中是要把NC单据表体行的字段拼成一个字符串.例如: id name work age 1 王一搬运工 20 2 李二清洁工 21 3 张三洗脚工 22 出现结果字符串为: ...

rebase