[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)
原文链接: https://blogs.msdn.microsoft.com/abhinaba/2009/03/02/back-to-basics-generational-garbage-collection/
This post is Part 8 in the series of posts on Garbage Collection (GC). Please see the index here.
One of the primary disadvantage discussed in the post on mark-sweep garbage collection is that it introduces very large system pauses when the entire heap is marked and swept. One of the primary optimization employed to solve this issue is employing generational garbage collection. This optimization is based on the following observations
- Most objects die young
- Over 90% garbage collected in a GC is newly created post the previous GC cycle
- If an object survives a GC cycle the chances of it becoming garbage in the short term is low and hence the GC wastes time marking it again and again in each cycle
The optimization based on the above observations is to segregate objects by age into multiple generations and collect each with different frequencies.
This scheme has proven to work rather well and is widely used in many modern systems (including .NET).
Detailed algorithm
The objects can be segregated into age based generations in different ways, e.g. by time of creation. However one common way is to consider a newly created object to be in Generation 0 (Gen0) and then if it is not collected by a cycle of garbage collection then it is promoted to the next higher generation, Gen1. Similarly if an object in Gen1 survives a GC then that gets promoted to Gen2.
Lower generations are collected more often. This ensures lower system pauses. The higher generation collection is triggered fewer times.
How many generations are employed, varies from system to system. In .NET 3 generations are used. Here for simplicity we will consider a 2 generation system but the concepts are easily extended to more than 2.
Let us consider that the memory is divided into two contiguous blocks, one for Gen1 and the other for Gen0. At start memory is allocated only from Gen0 area as follows
![]()
So we have 4 objects in Gen0. Now one of the references is released
![]()
Now if GC is fired it will use mark and sweep on Gen0 objects and cleanup the two objects that are not reachable. So the final state after cleaning up is
![]()
The two surviving objects are then promoted to Gen1. Promotion includes copying the two objects to Gen1 area and then updating the references to them
![]()
Now assume a whole bunch of allocation/de-allocation has happened. Since new allocations are in Gen0 the memory layout looks like
![]()
The whole purpose of segregating into generations is to reduce the number of objects to inspect for marking. So the first root is used for marking as it points to a Gen0 object. While using the second root the moment the marker sees that the reference is into a Gen1 object it does not follow the reference, speeding up marking process.
Now if we only consider the Gen0 objects for marking then we only mark the objects indicated by ✓. The marking algorithm will fail to locate the Gen1 to Gen0 references (shown in red) and some object marking will be left out leading to dangling pointers.
One of the way to handle this is to somehow record all references from Gen1 to Gen0 (way to do that is in the next section) and then use these objects as new roots for the marking phase. If we use this method then we get a new set of marked objects as follows
![]()
This now gives the full set of marked objects. Post another GC and promotion of surviving objects to higher generation we get
![]()
At this point the next cycle as above resumes…
Tracking higher to lower generation references
In general applications there are very few (some studies show < 1% of all references) of these type of references. However, they all need to be recorded. There are two general approached of doing this
Write barrier + card-table
First a table called a card table is created. This is essentially an array of bits. Each bit indicates if a given range of memory is dirty (contains a write to a lower generation object). E.g. we can use a single bit to mark a 4KB block.
![]()
Whenever an reference assignment is made in user code, instead of directly doing the assignment it is redirected to a small thunk (incase .NET the JITter does this). The thunk compares the assignees address to that of the Gen1 memory range. If the range falls within, then the thunk updates the corresponding bit in the card table to indicate that the range which the bit covers is now dirty (shown as red).
First marking uses only Gen0 objects. Once this is over it inspects the card table to locate dirty blocks. Then it considers every object in that dirty block to be new roots and marks objects using it.
As you can see that the 4KB block is just an optimization to reduce the size of the card table. If we increase the granularity to be per object then we can save marking time by having to consider only one object (in contrast to all in 4KB range) but our card table size will also significantly increase.
One of the flip sides is that the thunk makes reference assignment slower.
HW support
Hardware support also uses card table but instead of using thunk it simply uses special features exposed by the HW+OS for notification of dirty writes. E.g. it can use the Win32 api GetWriteWatch to get the list of pages where write happened and use that information to get the card table entries.
However, these kind of support is not available on all platforms (or older version of platforms) and hence is less utilized.
[转] 分代垃圾回收的 新旧代引用问题(原标题:Back To Basics: Generational Garbage Collection)的更多相关文章
- Java分代垃圾回收机制:年轻代/年老代/持久代(转)
虚拟机中的共划分为三个代:年轻代(Young Generation).年老点(Old Generation)和持久代(Permanent Generation).其中持久代主要存放的是Java类的类信 ...
- Java中的分代垃圾回收策略
一.分代GC的理论基础 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大 ...
- JVM分代垃圾回收策略的基础概念
由于不同对象的生命周期不一样,因此在JVM的垃圾回收策略中有分代这一策略.本文介绍了分代策略的目标,如何分代,以及垃圾回收的触发因素. 文章总结了JVM垃圾回收策略为什么要分代,如何分代,以及垃圾回收 ...
- JVM调优总结(五)-分代垃圾回收详述1
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM调优总结:分代垃圾回收详述
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- java虚拟机学习-JVM调优总结-分代垃圾回收详述(9)
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- JVM堆内存控制/分代垃圾回收
JVM的堆的内存, 是通过下面面两个参数控制的 -Xms 最小堆的大小, 也就是当你的虚拟机启动后, 就会分配这么大的堆内存给你 -Xmx 是最大堆的大小 当最小堆占满后,会尝试进行GC,如果GC之后 ...
- JVM调优总结(4):分代垃圾回收
为什么要分代 分代的垃圾回收策略,是基于这样一个事实:不同的对象的生命周期是不一样的.因此,不同生命周期的对象可以采取不同的收集方式,以便提高回收效率. 在Java程序运行的过程中,会产生大量的对象, ...
- Java 垃圾回收机制 (分代垃圾回收ZGC)
什么是自动垃圾回收? 自动垃圾回收是一种在堆内存中找出哪些对象在被使用,还有哪些对象没被使用,并且将后者删掉的机制.所谓使用中的对象(已引用对象),指的是程序中有指针指向的对象:而未使用中的对象(未引 ...
随机推荐
- java 编程英语单词,语句
记录一下java 编程工作学习中常用的英语汇总 in other words: 换句话说 dangle :悬挂 separated:分开的 distinct:明显的,独特的 actual :实际的 i ...
- redis缓存雪崩,缓存穿透,缓存击穿的解决方法
一.缓存雪崩 缓存雪崩表示在某一时间段,缓存集中失效,导致请求全部走数据库,有可能搞垮数据库,使整个服务瘫痪. 使缓存集中失效的原因: 1.redis服务器挂掉了. 2.对缓存数据设置了相同的过期时间 ...
- Cannot attach medium 'D:\program\VirtualBox\VBoxGuestAdditions.iso' {}: medium is already associated with the current state of machine uuid {}返回 代码: VBOX_E_OBJECT_IN_USE (0x80BB000C)
详细的错误信息如下: Cannot attach medium 'D:\program\VirtualBox\VBoxGuestAdditions.iso' {83b35b10-8fa2-4b81-8 ...
- 跨域的处理方式 JSONP和CORS和反向代理
什么是跨域? 首先了解同源策略,三个相同,协议,域名端口号相同就是同源,那么三者有任意不同就会造成跨域.跨域不常见,跨域基本上就是访问别人的资源. 如何解决跨域问题? 常见的有三种 一:jsonp处理 ...
- node.js 调试 eggs launch.json配置信息
{ // 使用 IntelliSense 了解相关属性. // 悬停以查看现有属性的描述. // 欲了解更多信息,请访问: https://go.microsoft.com/fwlink/?linki ...
- Json列表数据查找更新
/* 从Json数组按某个字段中查找记录 IN array 数据列表 fieldName 字段名称 fieldValue 字段值 OUT 查找到的数据列表 */ var SearchRecordsFr ...
- Hello The Merciless World!
这里是一名FJ蒟蒻OIer的Blog,ID在上面自己不会看嘛QAQQQ是GldHkkowo(很随性的名字w 联系方式:QQ:735900335 加 Q Q 看 蒟 蒻 WA 题 爱好? 死宅的爱好是什 ...
- 归并排序之python
想更好的了解归并排序, 需先了解, 将两个有序列表, 组成一个有序列表 有两个列表 l1 = [1, 3, 5, 7] l2 = [2, 4, 6] 需要将 l1 和 l2 组成一个 有序大列表 ...
- pymysql-python爬虫数据存储准备
mongodb 和mysql 在使用哪个数据库 来存储数据上 小哥还是纠结了一下下. 很多爬虫教程都推荐mongodb 优势是速度快 因为我已经本机安装了一下 php开发环境,mysql是现成的, s ...
- Java第4次实训作业
编写"电费管理类"及其测试类. 第一步 编写"电费管理"类 私有属性:上月电表读数.本月电表读数 构造方法:无参.2个参数 成员方法:getXXX()方法.se ...