[Repost] Understanding !PTE, Part 2: Flags and Large Pages
Hello, it's Ryan Mangipano with part two of my PTE series. Today I'll discuss PDE/PTE flags, the TLB, and show you a manual conversion of x86 PAE Large Page Virtual Addresses to Physical. If you haven’t read the first part of this series please find it here. It's a good primer before proceeding.
PDE and PTE flags
I'll start with a discussion of the PDE/PTE flags. If you recall from part one, not all of the bits of the Page Directory Entry (PDE) are used to form the pointer to the base of the next level. This is true of the table entries at all levels. For example, on a PAE x86 system the low 12 bits of a PTE (page table entry) are not part of the page frame number. During our previous conversion, we used only some of the bits to locate the next table. The rest of the data we simply dropped and replaced with zeros as needed. So what are the other bits used for? They are used for a series of flags. You will observe the state of these flags in the !PTE output in the following manner: (-G-DA--KWEV).
These flags are documented in the Intel manuals. Intel and AMD reserved some of the flags for use by the operating system. All of these are also documented in chapter 9 (Memory Management) of “Windows Internals, 5th edition”. Let’s dump the PDE for the virtual address we dissected last time. This will allow you to see some of the flags that are present in the other bits.
Obtaining the Virtual Address of the PDE
1: kd> !pte 0xf9a10054
VA f9a10054
PDE at 00000000C0603E68 PTE at 00000000C07CD080
contains 000000000102D963 contains 0000000002010121
pfn 102d -G-DA--KWEV pfn 2010 -G--A--KREV
Here is the data-type of our PDE
1: kd> dt nt!_MMPTE u.Hard
+0x000 u :
+0x000 Hard : _MMPTE_HARDWARE
Dumping the PDE and flags
1: kd> dt _MMPTE_HARDWARE 00000000C0603E68
nt!_MMPTE_HARDWARE
+0x000 Valid : 0y1
+0x000 Writable : 0y1
+0x000 Owner : 0y0
+0x000 WriteThrough : 0y0
+0x000 CacheDisable : 0y0
+0x000 Accessed : 0y1
+0x000 Dirty : 0y1
+0x000 LargePage : 0y0
+0x000 Global : 0y1
+0x000 CopyOnWrite : 0y0
+0x000 Prototype : 0y0
+0x000 Write : 0y1
+0x000 PageFrameNumber : 0y00000000000001000000101101 (0x102d)
+0x000 reserved1 : 0y00000000000000000000000000 (0)
Take note of the letters in the PDE and PTE sections of the !pte output, such as -G-DA--KWEV. These letters represent various flags. The presence or absence of a letter in the !PTE output tells you the state of the flag. These flags can also be seen in the _MMPTE_HARDWARE output above.
Valid (V) - Indicates that the data is located in physical memory. If this flag is not set, then the software can use ALL of the rest of the bits for whatever it wants (like storing the pagefile number and the offset where the page is stored).
Write (W/R) - Indicates whether the data is writeable or read-only. The hardware bit (bit 1) is documented in the processor manuals. The use of reserved bit 11 (multiprocessor systems, or Vista or later) is documented in Windows Internals, Chap. 9.
Owner (K/U) - Indicates whether the page is owned by kernel mode or user mode. Kernel if cleared. User if set.
WriteThrough (T) - When set, indicates a write-through caching policy. When not set, indicates a write-back caching policy.
CacheDisable (N) - If set, the page translation table or physical page it points to cannot be cached.
Accessed (A) - Set when the page itself, or the table referencing it, has been read from or written to.
Dirty (D) - Indicates whether any data on this page has been updated.
LargePage (L) - This field is only used on PDEs, not PTEs. It indicates whether the PDE is the last table level (meaning that this entry references an actual page in memory) or whether it references a Page Table instead. So basically, this is the page size bit. If this bit is cleared, the final destination page is 4 KB and can be found in the page table that this PDE points to. If this bit is set, the PDE points directly to a large page (2 MB when PAE is in use, 4 MB when it is not), which is located using the value of this particular PDE since it becomes the last level. Keep in mind that a larger offset is needed to reference all the positions in a large page. To use this feature, the PSE bit (bit 4, which is the 5th bit over) must be set in CR4. The setting in CR4 is a global setting that enables the use of large pages on the system. The flag in the PDE applies only to the individual PDE.
Global (G) - If not set, translation cache (TLB) flushes affect this entry. If set, other processes also use this translation, so it is not flushed from the Translation Lookaside Buffer upon process context switches.
CopyOnWrite (C) - Intel states this is a software field. Windows uses this for processes to share the same copy of a page. The system will give the process a private copy of this page (by copying it) if the process attempts to write to the page. On a No-Execute system, any attempt to execute code in this page will cause an access violation.
Prototype (P) - Intel states this is a software field. Windows uses this to indicate that this is a prototype PTE.
Reserved0 - These bits are reserved.
E (E) - Executable page. E is always displayed on platforms that do not support hardware No-Execute.
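As a rough illustration of how the letters map onto bits, here is a small Python sketch that rebuilds a !pte-style flag string from a raw entry value. The bit positions follow the _MMPTE_HARDWARE dump above. The function name pte_flags, the nx_supported parameter, the letter ordering (inferred from the sample output), and the use of hardware bit 1 for W/R are assumptions of this sketch, not the debugger's actual implementation.

```python
def pte_flags(entry: int, nx_supported: bool = False) -> str:
    """Build a !pte-style flag string from a valid PAE PDE/PTE value."""

    def pick(bit: int, set_char: str, clear_char: str = "-") -> str:
        # Return one letter depending on whether the given bit is set.
        return set_char if entry & (1 << bit) else clear_char

    # E is always shown when hardware No-Execute is unavailable; otherwise
    # this sketch treats bit 63 of the PAE entry as the No-Execute bit.
    executable = "E" if not nx_supported or not entry & (1 << 63) else "-"

    return "".join([
        pick(9, "C"),        # CopyOnWrite
        pick(8, "G"),        # Global
        pick(7, "L"),        # LargePage
        pick(6, "D"),        # Dirty
        pick(5, "A"),        # Accessed
        pick(4, "N"),        # CacheDisable
        pick(3, "T"),        # WriteThrough
        pick(2, "U", "K"),   # Owner: user if set, kernel if cleared
        pick(1, "W", "R"),   # Writable (hardware bit): W if set, R if cleared
        executable,
        pick(0, "V"),        # Valid
    ])


print(pte_flags(0x102D963))  # PDE from the output above: -G-DA--KWEV
print(pte_flags(0x2010121))  # PTE from the output above: -G--A--KREV
```

The sketch only makes sense for entries whose Valid bit is set. For an invalid entry the remaining bits belong to the operating system and the letters would be meaningless.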
Inspecting the state of the flags is important when attempting to manually convert addresses from Virtual Addresses to Physical. For example, since the valid bit is not set in the following invalid PTE, all of the fields are available for Windows to use. This means the information in the processor manuals doesn’t apply. Instead it is an nt!_MMPTE_SOFTWARE which references data located in the page file.
3: kd> !pte b8ae900c
VA b8ae900c
PDE at 00000000C0602E28 PTE at 00000000C05C5748
contains 000000000B880863 contains 000B8AF500000000
pfn b880 ---DA--KWEV not valid
PageFile: 0
Offset: b8af5
Protect: 0
For more information on the different types of invalid PTEs, refer to page 775 of “Windows Internals, 5th edition”.
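To make the invalid PTE above a little more concrete, the following Python sketch pulls the PageFile, Protect, and Offset values out of the raw 64-bit entry. The field positions used here (page file number in bits 1-4, protection in bits 5-9, page file offset in the upper 32 bits) are an assumption that happens to match the single sample above. Check them against dt nt!_MMPTE_SOFTWARE on your own target before relying on them. The function name decode_pagefile_pte is hypothetical.

```python
def decode_pagefile_pte(pte: int) -> dict:
    """Split an invalid PAE PTE that references the page file into fields.

    Field positions are assumed, not taken from documentation.
    """
    if pte & 1:
        raise ValueError("Valid bit is set; this is a hardware PTE")
    return {
        "page_file": (pte >> 1) & 0xF,   # assumed: page file number, bits 1-4
        "protect": (pte >> 5) & 0x1F,    # assumed: protection, bits 5-9
        "offset": pte >> 32,             # assumed: page file offset, upper 32 bits
    }


fields = decode_pagefile_pte(0x000B8AF500000000)
print({name: hex(value) for name, value in fields.items()})
```

This sketch ignores the other kinds of invalid PTEs (prototype, transition, and so on), which use the bits differently.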
Manually Converting x86 PAE Large Page Virtual Address to Physical
In part one of this blog, we manually translated a PAE 4-KByte Page Virtual Address (VA). Now we are going to manually translate a VA that represents a Large Page from our PAE system. As discussed in the previous section on PTE flags, a large page allocation means that the page size is larger and the PDE points directly to the page itself. The PDE will not point to the base of a page table. This means that there will be one less level of tables used in the translation. This also means that more bits will be needed to represent the offsets in the large page. I found the following address on my system that references a Large Page, 8054099e. Once again, all the required information was obtained from the processor manuals, debugger help file, and Windows Internals Book.
1: kd> !pte 8054099e
VA 8054099e
PDE at 00000000C0602010 PTE at 00000000C0402A00
contains 00000000004009E3 contains 0000000000000000
pfn 400 -GLDA--KWEV LARGE PAGE pfn 540
Below is the Virtual Address in binary.
1: kd> .formats 8054099e
Binary: 10000000 01010100 00001001 10011110
I have split this VA into its three parts.
10 Page Directory Pointer Table Offset
000000 010 Page Directory Table Offset
10100 00001001 10011110 This is the Offset into the large page
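The same split can be sketched with shifts and masks in Python. The 2/9/21 bit widths come from the breakdown above; the function name split_pae_large_va is just a label chosen for this example.

```python
def split_pae_large_va(va: int) -> tuple[int, int, int]:
    """Split a 32-bit PAE virtual address that maps a 2-MB large page."""
    pdpt_index = (va >> 30) & 0x3      # top 2 bits: Page Directory Pointer Table index
    pd_index = (va >> 21) & 0x1FF      # next 9 bits: Page Directory Table index
    offset = va & 0x1FFFFF             # low 21 bits: offset into the large page
    return pdpt_index, pd_index, offset


pdpt_index, pd_index, offset = split_pae_large_va(0x8054099E)
print(bin(pdpt_index), bin(pd_index), hex(offset))  # 0b10 0b10 0x14099e
```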
Let’s get the base of the Page Directory Pointer Table and identify which of the four entries we will need to follow.
1: kd> !dq (@cr3 & 0xffffffe0) + ( 0y10 * 8) L1
# 23406f0 00000000`06c46801
Now take the entry we just read, replace its low 12 bits with zeros, and we have the base of the Page Directory Table. Then add the offset from our Virtual Address and we'll dump out the PDE.
1: kd> !dq (6c46801 & 0xFFFFFF000) + ( 0y000000010 * 8) L1
# 6c46010 00000000`004009e3
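The arithmetic inside the two !dq expressions can be written out as a short Python sketch. The masks are the ones used in the debugger commands above and each entry is 8 bytes wide. The function names are hypothetical, and 0x23406E0 is not a value shown by the debugger: it is the masked CR3 value implied by the address 23406f0 in the first output.

```python
ENTRY_SIZE = 8  # PAE table entries are 64 bits wide


def pdpt_entry_address(cr3: int, pdpt_index: int) -> int:
    """Physical address of a Page Directory Pointer Table entry."""
    return (cr3 & 0xFFFFFFE0) + pdpt_index * ENTRY_SIZE


def pd_entry_address(pdpte: int, pd_index: int) -> int:
    """Physical address of a Page Directory entry, given the PDPT entry."""
    return (pdpte & 0xFFFFFF000) + pd_index * ENTRY_SIZE


# Masked CR3 inferred from the !dq output above (assumption of this sketch).
print(hex(pdpt_entry_address(0x23406E0, 0b10)))        # 0x23406f0
print(hex(pd_entry_address(0x06C46801, 0b000000010)))  # 0x6c46010
```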
Let’s convert the PDE to binary format to analyze the lower 12 bits. This will allow us to analyze the flags. The last twelve bits (0-11) are not used for the PFN. They are used for the flags that we discussed earlier.
1: kd> .formats (00000000`004009e3 & 0x0000000FFF)
Binary: 00000000 00000000 00001001 11100011
Let’s analyze the flags from this VA using the information we learned earlier....
· Bit Zero is set indicating that the page is Valid and located in physical memory, so all other bits can be read as the hardware flags
· Bit One is set indicating that this page is Writeable (Hardware Field)
· Bit Two is cleared indicating that this is a Kernel Mode Page
· Bit Three is cleared indicating a Write-Back Caching policy (caching of writes to the page is enabled)
· Bit Four is cleared indicating that caching is not disabled for the page.
· Bit Five is set indicating this page has been Accessed
· Bit Six is set indicating that this page is Dirty
· Bit Seven is set indicating that this is a Large Page. This PDE points directly to a page, not a Page Table.
· Bit Eight is set indicating that other processes share this Global PDE. It is not removed from the TLB cache when it is flushed on process context switches.
· Bit Nine is cleared indicating this page is not Copy-On-Write
· Bit Ten is cleared indicating this is NOT a Prototype PTE
· Bit Eleven is set also indicating this page is Writeable (Reserved Field, See Windows Internals, Chap. 9.)
...and compare our findings to the flags output from !PTE, -GLDA--KWEV. My system doesn’t support No-Execute, so the E is also displayed. For more information, run .hh !pte in WinDbg.
We know this is a Large Page and is Valid, so we can obtain the base address of our 2-MB Large Page (on this PAE system) from this PDE. The Intel manual states that in our PDE the low 21 bits (0-20) aren’t part of the address base.
1: kd> .formats (004009e3 & 0y11111111111000000000000000000000)
Binary: 00000000 01000000 00000000 00000000
So let’s combine the data from the PDE (the first 11 bits below) with the offset from the VA (Virtual Address).
00000000 010 10100 00001001 10011110
Now I'll remove the spaces, precede this binary value with 0y, and send it to .formats.
1: kd> .formats 0y00000000010101000000100110011110
Hex: 0054099e
We could have obtained the same data in this manner:
1: kd> ? (004009e3 & 0y11111111111000000000000000000000) + (8054099e & 0y00000000000111111111111111111111)
Evaluate expression: 5507486 = 0054099e
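For reference, the final step can also be sketched in Python, with checks for the two flags that make the shortcut legal (Valid and LargePage). The function name large_page_physical is hypothetical, and the 36-bit physical address width is an assumption taken from the 0xFFFFFF000 mask used earlier in this walkthrough.

```python
VALID = 1 << 0
LARGE_PAGE = 1 << 7
LARGE_PAGE_OFFSET_MASK = (1 << 21) - 1   # low 21 bits: offset within a 2-MB page
PHYS_ADDR_MASK = (1 << 36) - 1           # assumption: 36-bit physical addresses


def large_page_physical(pde: int, va: int) -> int:
    """Combine a PAE large-page PDE with a virtual address."""
    if not pde & VALID:
        raise ValueError("PDE is not valid; its bits are defined by software")
    if not pde & LARGE_PAGE:
        raise ValueError("PDE points to a page table, not a large page")
    base = pde & PHYS_ADDR_MASK & ~LARGE_PAGE_OFFSET_MASK
    return base + (va & LARGE_PAGE_OFFSET_MASK)


print(hex(large_page_physical(0x4009E3, 0x8054099E)))  # 0x54099e
```

Raising an error for a PDE without the LargePage bit is a design choice for this sketch. A fuller version would continue the walk into the page table as shown in part one.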
Now let’s dump the data in memory at this physical address
1: kd> !db 0054099e
# 54099e 33 db 8b 75 18 8b 7d 1c-0f 23 fb 0f 23 c6 8b 5d 3..u..}..#..#..]
# 5409ae 20 0f 23 cf 0f 23 d3 8b-75 24 8b 7d 28 8b 5d 2c .#..#..u$.}(.],
# 5409be 0f 23 de 0f 23 f7 0f 23-fb e9 43 ff ff ff 8b 44 .#..#..#..C....D
Now let’s dump the same data using the virtual address
1: kd> db 8054099e
8054099e 33 db 8b 75 18 8b 7d 1c-0f 23 fb 0f 23 c6 8b 5d 3..u..}..#..#..]
805409ae 20 0f 23 cf 0f 23 d3 8b-75 24 8b 7d 28 8b 5d 2c .#..#..u$.}(.],
805409be 0f 23 de 0f 23 f7 0f 23-fb e9 43 ff ff ff 8b 44 .#..#..#..C....D
So now you can see how I used the debugger to translate virtual addresses to physical addresses. This concludes part two of this blog. In part three we will cover x86 non-PAE virtual address translation, x64 address translation, and the TLB.