Java 8 VM GC Tuning Guide Chapters 3-4
Chapter 3 Generations
One strength of the Java SE platform is that it shields the developer from the complexity of memory allocation and garbage collection. However, when garbage collection is the principal bottleneck, it is useful to understand some aspects of this hidden implementation. Garbage collectors make assumptions about the way applications use objects, and these are reflected in tunable parameters that can be adjusted for improved performance without sacrificing the power of the abstraction.
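In short: one advantage of the Java platform is that it hides the complexity of memory allocation and GC from developers. When GC becomes the main bottleneck, however, it pays to understand part of that hidden implementation. Collectors make assumptions about how applications use objects; those assumptions surface as tuning parameters that can improve performance without giving up the abstraction.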
An object is considered garbage when it can no longer be reached from any pointer in the running program. The most straightforward garbage collection algorithms iterate over every reachable object. Any objects left over are considered garbage. The time this approach takes is proportional to the number of live objects, which is prohibitive for large applications maintaining lots of live data.
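Put differently: an object is garbage once no pointer in the running program can reach it. The simplest GC algorithm walks every reachable object and treats whatever is left over as garbage. Its cost grows with the number of live objects, which is too expensive for very large programs that hold huge amounts of live data.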
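The reachability rule above can be sketched as a plain graph traversal. This is a toy model, not how HotSpot is implemented: the `Node` class, the `mark` method, and the explicit root list are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy object graph: each Node holds references to other nodes.
class ReachabilitySketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the roots (the live set).
    // Anything not in the returned set would be garbage.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {            // true only on the first visit
                pending.addAll(n.refs);   // follow outgoing references
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(a);   // a cycle between two live objects
        c.refs.add(d);   // c references d, but no root reaches either of them
        Set<Node> live = mark(Arrays.asList(a));
        System.out.println("live objects: " + live.size());  // prints "live objects: 2"
    }
}
```

The work done by `mark` is proportional to the number of live objects and references, which is the cost the paragraph above describes.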
The virtual machine incorporates a number of different garbage collection algorithms that are combined using generational collection. While naive garbage collection examines every live object in the heap, generational collection exploits several empirically observed properties of most applications to minimize the work required to reclaim unused (garbage) objects. The most important of these observed properties is the weak generational hypothesis, which states that most objects survive for only a short period of time.
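In other words: the VM combines several collection algorithms through generational collection. Whereas a naive collector examines every live object in the heap, generational collection relies on empirically observed properties of most applications to cut the work needed to reclaim garbage. The most important of these properties is the weak generational hypothesis: most objects live only briefly.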
The blue area in Figure 3-1, "Typical Distribution for Lifetimes of Objects" is a typical distribution for the lifetimes of objects. The x-axis is object lifetimes measured in bytes allocated. The byte count on the y-axis is the total bytes in objects with the corresponding lifetime. The sharp peak at the left represents objects that can be reclaimed (in other words, have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.
Figure 3-1 Typical Distribution for Lifetimes of Objects
Some objects do live longer, and so the distribution stretches out to the right. For instance, there are typically some objects allocated at initialization that live until the process exits. Between these two extremes are objects that live for the duration of some intermediate computation, seen here as the lump to the right of the initial peak. Some applications have very different looking distributions, but a surprisingly large number possess this general shape. Efficient collection is made possible by focusing on the fact that a majority of objects "die young."
Efficient collection algorithms should therefore focus on one fact: most objects die shortly after they are created.
To optimize for this scenario, memory is managed in generations (memory pools holding objects of different ages). Garbage collection occurs in each generation when the generation fills up. The vast majority of objects are allocated in a pool dedicated to young objects (the young generation), and most objects die there. When the young generation fills up, it causes a minor collection in which only the young generation is collected; garbage in other generations is not reclaimed. Minor collections can be optimized, assuming that the weak generational hypothesis holds and most objects in the young generation are garbage and can be reclaimed. The costs of such collections are, to the first order, proportional to the number of live objects being collected; a young generation full of dead objects is collected very quickly. Typically, some fraction of the surviving objects from the young generation are moved to the tenured generation during each minor collection. Eventually, the tenured generation will fill up and must be collected, resulting in a major collection, in which the entire heap is collected. Major collections usually last much longer than minor collections because a significantly larger number of objects are involved.
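To restate: to optimize for the scenario in the figure above, memory is managed in generations (memory pools holding objects of different ages). A collection runs in a generation when that generation fills up. Most objects are allocated in the pool reserved for young objects (the young generation), and most of them die there. When the young generation fills up it triggers a minor GC, which reclaims only the young generation and leaves the other generations alone. Minor GC can be optimized precisely because, under the weak generational hypothesis, most young objects are garbage. Its cost is proportional to the number of surviving objects, so a young generation full of dead objects is collected very quickly. Typically, some of the survivors are moved to the tenured generation during each minor GC. Eventually the tenured generation fills up too and must be collected, which results in a major GC that collects the entire heap. A major GC lasts much longer than a minor GC because it involves far more objects.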
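One way to watch minor and major collections from inside a program is the standard `java.lang.management` API. The sketch below is only an illustration: the class name `CollectorCounts` is made up, the collector names it prints depend on which collector the JVM selected, and the allocation loop is merely likely, not guaranteed, to trigger young collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorCounts {
    // One line per collector: name, number of collections so far, accumulated time in ms.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate many short-lived arrays; most of them should die young.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] tmp = new byte[64];
            checksum += tmp.length;
        }
        System.out.println("allocated bytes: " + checksum);
        describeCollectors().forEach(System.out::println);
    }
}
```

On a typical run the young-generation collector reports a much higher count than the old-generation collector, which matches the behavior described above.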
As noted in the section Ergonomics, ergonomics selects the garbage collector dynamically to provide good performance on a variety of applications. The serial garbage collector is designed for applications with small data sets, and its default parameters were chosen to be effective for most small applications. The parallel or throughput garbage collector is meant to be used with applications that have medium to large data sets. The heap size parameters selected by ergonomics plus the features of the adaptive size policy are meant to provide good performance for server applications. These choices work well in most, but not all, cases, which leads to the central tenet of this document:
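As noted at the start of Chapter 2 (Ergonomics), ergonomics selects a collector dynamically to give good performance across different applications. The serial collector suits small data sets, and its default parameters suit most small programs. The parallel (throughput) collector suits medium to large data sets. By choosing suitable heap size parameters and applying the adaptive size policy, ergonomics aims at good performance for server applications. This works well in most cases but not all, which leads to the central point of this document: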
If garbage collection becomes a bottleneck, you will most likely have to customize the total heap size as well as the sizes of the individual generations. Check the verbose garbage collector output and then explore the sensitivity of your individual performance metric to the garbage collector parameters.
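That is: if GC becomes the bottleneck, you will most likely have to set the total heap size and the size of each generation yourself. Check the verbose GC output, then explore how sensitive your own performance metric is to the GC parameters.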
Figure 3-2, "Default Arrangement of Generations, Except for Parallel Collector and G1" shows the default arrangement of generations (for all collectors with the exception of the parallel collector and G1):
Figure 3-2 Default Arrangement of Generations, Except for Parallel Collector and G1
At initialization, a maximum address space is virtually reserved but not allocated to physical memory unless it is needed. The complete address space reserved for object memory can be divided into the young and tenured generations.
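At initialization the maximum address space is reserved, but physical memory is allocated only when it is needed. The whole address space reserved for objects is divided into the young and tenured generations.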
The young generation consists of eden and two survivor spaces. Most objects are initially allocated in eden. One survivor space is empty at any time, and serves as the destination of any live objects in eden; the other survivor space is the destination during the next copying collection. Objects are copied between survivor spaces in this way until they are old enough to be tenured (copied to the tenured generation).
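The young generation consists of one eden space and two survivor spaces. Most objects are first allocated in eden. One survivor space is always empty; at the next collection it receives the live objects copied from eden and from the other survivor space. (Copying into an empty space keeps the live objects compact and avoids deleting objects one by one: the collector moves the survivors into the other space and then marks the whole source space as empty, which is efficient.)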
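The generations can be inspected at run time through the standard `MemoryPoolMXBean` API. The following is a minimal sketch; the class name `HeapPools` is made up, and the pool names that appear (for example an eden, a survivor, and an old-generation pool) vary with the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    // Returns one summary line per heap memory pool reported by the running JVM.
    static List<String> heapPoolSummaries() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the pool has no defined maximum.
            lines.add(pool.getName()
                    + ": used=" + usage.getUsed() / 1024 + "K"
                    + ", committed=" + usage.getCommitted() / 1024 + "K"
                    + ", max=" + (usage.getMax() < 0 ? "undefined" : usage.getMax() / 1024 + "K"));
        }
        return lines;
    }

    public static void main(String[] args) {
        heapPoolSummaries().forEach(System.out::println);
    }
}
```

The gap between `committed` and `max` corresponds to address space that is reserved but not yet backed by memory the VM is using.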
1. Performance Considerations
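There are two primary performance measures. Throughput is the percentage of total time not spent in garbage collection, considered over long periods. Pause time is how long the application is stopped while a collection runs.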
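As a worked example of the two measures, throughput is the share of total run time not spent in GC, and pause time is judged by the individual stops, here the longest one. The class name and the numbers below are hypothetical, chosen only to illustrate the arithmetic.

```java
class ThroughputSketch {
    // Throughput: fraction of total time NOT spent in garbage collection.
    static double throughput(double gcSeconds, double totalSeconds) {
        if (totalSeconds <= 0 || gcSeconds < 0 || gcSeconds > totalSeconds) {
            throw new IllegalArgumentException("inconsistent timings");
        }
        return 1.0 - gcSeconds / totalSeconds;
    }

    // Longest single pause, in seconds; 0 if there were no pauses.
    static double maxPause(double[] pauseSeconds) {
        double max = 0.0;
        for (double p : pauseSeconds) {
            max = Math.max(max, p);
        }
        return max;
    }

    public static void main(String[] args) {
        // Hypothetical run: 60 s in total with three GC pauses.
        double[] pauses = {0.5, 2.0, 0.5};
        double gcTotal = 0.0;
        for (double p : pauses) {
            gcTotal += p;
        }
        System.out.println("throughput fraction: " + throughput(gcTotal, 60.0));
        System.out.println("max pause seconds:   " + maxPause(pauses));
    }
}
```

Two runs can have the same throughput and very different maximum pauses, which is why the two measures are tuned separately.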
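Different users have different requirements of GC. For a web server, throughput is usually the metric that matters, because occasional long pauses can be tolerated or are hidden by network latency. In a highly interactive application, even a short pause hurts the user experience.
Some users have other concerns. Footprint is the working set of a process, measured in pages and cache lines. On systems with limited physical memory or many processes, footprint needs careful consideration.
Another metric is promptness: the time between an object becoming unreachable (dead) and its memory being reclaimed. It matters a great deal for distributed systems, such as those using Java RMI.
Overall, choosing the size of each generation is a trade-off among these metrics. For example, a very large young generation can raise throughput noticeably, but at the cost of footprint, promptness, and pause times. A small young generation keeps young-generation pauses short at the expense of throughput. The sizing of one generation does not affect the collection frequency or pause times of another generation.
There is no single right way to size the generations; configure each one according to what the application actually needs.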
2. Measurement
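Run the VM with -verbose:gc to print information at every collection. Each output line reports the combined size of live objects before and after the collection, the total available space in parentheses, and the time the collection took.
A minor GC is a collection of the young generation; a major GC is a full collection of the entire heap.
The total counts only one of the survivor spaces, because the other one is always empty.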
The VM option -XX:+PrintGCDetails prints a more detailed collection log:
[GC [PSYoungGen: 76256K->10745K(141824K)] 95363K->41563K(315392K), 0.0073975 secs] [Times: user=0.09 sys=0.02, real=0.01 secs]
Adding -XX:+PrintGCTimeStamps prefixes each entry with a time stamp, which shows how frequently collections occur. For example:
1.617: [GC [PSYoungGen: 76268K->10743K(76288K)] 78162K->19282K(249856K), 0.0110779 secs] [Times: user=0.11 sys=0.00, real=0.01 secs]
Here 1.617 is the time in seconds since the program started.
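Log lines like the one above can be picked apart with a regular expression. The sketch below handles only this exact Java 8 layout (time stamp, one young-generation entry, whole-heap entry, pause); other collectors and later JDK versions print different formats. The class `GcLogLine` and its method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Matches: "<uptime>: [GC [<gen>: <a>K-><b>K(<c>K)] <d>K-><e>K(<f>K), <pause> secs]"
    private static final Pattern MINOR_GC = Pattern.compile(
            "^(\\d+\\.\\d+): \\[GC \\[\\w+: (\\d+)K->(\\d+)K\\(\\d+K\\)\\] "
            + "(\\d+)K->(\\d+)K\\(\\d+K\\), (\\d+\\.\\d+) secs\\]");

    final double uptimeSeconds;
    final long youngBeforeK, youngAfterK, heapBeforeK, heapAfterK;
    final double pauseSeconds;

    private GcLogLine(Matcher m) {
        uptimeSeconds = Double.parseDouble(m.group(1));
        youngBeforeK = Long.parseLong(m.group(2));
        youngAfterK = Long.parseLong(m.group(3));
        heapBeforeK = Long.parseLong(m.group(4));
        heapAfterK = Long.parseLong(m.group(5));
        pauseSeconds = Double.parseDouble(m.group(6));
    }

    static GcLogLine parse(String line) {
        Matcher m = MINOR_GC.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized GC log line: " + line);
        }
        return new GcLogLine(m);
    }

    long youngFreedK() { return youngBeforeK - youngAfterK; }
    long heapFreedK()  { return heapBeforeK - heapAfterK; }
    // Space that left the young generation but is still in the heap,
    // i.e. roughly what was promoted to the tenured generation.
    long promotedK()   { return youngFreedK() - heapFreedK(); }

    public static void main(String[] args) {
        GcLogLine g = parse("1.617: [GC [PSYoungGen: 76268K->10743K(76288K)] "
                + "78162K->19282K(249856K), 0.0110779 secs] "
                + "[Times: user=0.11 sys=0.00, real=0.01 secs]");
        // prints "young freed: 65525K, heap freed: 58880K, promoted (approx.): 6645K"
        System.out.println("young freed: " + g.youngFreedK() + "K, heap freed: "
                + g.heapFreedK() + "K, promoted (approx.): " + g.promotedK() + "K");
    }
}
```

Comparing the young-generation drop with the whole-heap drop is a quick way to estimate how much data each minor collection promotes.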
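Chapter 4 Sizing the Generations
In the figure above, committed space is memory the VM is actually using, and virtual space is address space that is reserved but not yet committed. Both have already been requested from the operating system by the VM, so in principle the VM already owns them.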
By default, the virtual machine grows or shrinks the heap at each collection to try to keep the proportion of free space to live objects at each collection within a specific range. This target range is set as a percentage by the parameters -XX:MinHeapFreeRatio=<minimum> and -XX:MaxHeapFreeRatio=<maximum>, and the total size is bounded below by -Xms<min> and above by -Xmx<max>.
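By default the VM grows or shrinks the heap at each collection to keep the proportion of free space to live objects within a given range. The following parameters control this behavior.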
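-XX:MinHeapFreeRatio=<minimum>
Minimum percentage of free heap space; the default on 64-bit Solaris is 40. If the free percentage drops below this value, the generation is expanded to restore it.
-XX:MaxHeapFreeRatio=<maximum>
Like the option above, but sets the maximum percentage of free heap space; the default is 70. If the free percentage rises above it, the generation is shrunk so that not too much space sits idle.
-Xms<min>
Minimum heap size; the value can carry a unit suffix such as K or M.
-Xmx<max>
Maximum heap size; the value can carry a unit suffix such as K or M.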
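The free-ratio rule can be sketched as a simple decision function. This is a deliberate simplification under stated assumptions: the real VM also respects the -Xms and -Xmx bounds and applies its own rounding, and the names `HeapResizeSketch`, `Action`, and `decide` are invented here.

```java
class HeapResizeSketch {
    enum Action { GROW, SHRINK, KEEP }

    // Decides what to do after a collection, given the generation's capacity,
    // the space still occupied by live objects, and the two free-ratio bounds
    // (percentages, as in -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio).
    static Action decide(long capacityBytes, long liveBytes, int minFreePct, int maxFreePct) {
        if (capacityBytes <= 0 || liveBytes < 0 || liveBytes > capacityBytes) {
            throw new IllegalArgumentException("inconsistent sizes");
        }
        double freePct = 100.0 * (capacityBytes - liveBytes) / capacityBytes;
        if (freePct < minFreePct) {
            return Action.GROW;    // too little free space: expand
        }
        if (freePct > maxFreePct) {
            return Action.SHRINK;  // too much free space: contract
        }
        return Action.KEEP;
    }

    public static void main(String[] args) {
        // With the defaults quoted above (40 and 70):
        System.out.println(decide(1000, 700, 40, 70)); // 30% free, prints GROW
        System.out.println(decide(1000, 500, 40, 70)); // 50% free, prints KEEP
        System.out.println(decide(1000, 200, 40, 70)); // 80% free, prints SHRINK
    }
}
```

Setting -Xms and -Xmx to the same value removes this resizing decision entirely, at the price of the VM no longer compensating for a poor size choice.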
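For example, when Tomcat is launched from Eclipse with a maximum heap size that is too small, the startup log shows the VM collecting garbage frantically to stay within the limit, and the run still ends with an out-of-memory error.
The Young Generation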
The bigger the young generation, the less often minor collections occur. However, for a bounded heap size, a larger young generation implies a smaller tenured generation, which will increase the frequency of major collections. The optimal choice depends on the lifetime distribution of the objects allocated by the application.
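In short: the larger the young generation, the less often minor GCs occur. Because the heap is bounded, however, a larger young generation means a smaller tenured generation, which in turn increases the frequency of major GCs. The best choice depends on the lifetime distribution of the objects the program allocates.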
By default, the young generation size is controlled by the parameter NewRatio. For example, setting -XX:NewRatio=3 means that the ratio between the young and tenured generation is 1:3. In other words, the combined size of the eden and survivor spaces will be one-fourth of the total heap size.
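By default the size of the young generation is controlled by -XX:NewRatio. For example, -XX:NewRatio=3 means the young generation takes one part and the tenured generation three parts, a ratio of 1:3. In other words, eden plus the survivor spaces make up one quarter of the whole heap.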
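The NewRatio arithmetic can be written out directly. This is a back-of-the-envelope sketch that ignores the rounding and alignment a real VM applies, as well as the NewSize and MaxNewSize bounds; the class and method names are made up.

```java
class NewRatioSketch {
    // young : tenured = 1 : newRatio, so young = heap / (newRatio + 1).
    static long youngBytes(long heapBytes, int newRatio) {
        if (heapBytes < 0 || newRatio < 1) {
            throw new IllegalArgumentException("heap must be >= 0 and ratio >= 1");
        }
        return heapBytes / (newRatio + 1);
    }

    static long tenuredBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 1024L * 1024 * 1024; // hypothetical 1 GiB heap
        int mib = 1024 * 1024;
        // With -XX:NewRatio=3: prints "young = 256 MiB, tenured = 768 MiB"
        System.out.println("young = " + youngBytes(heap, 3) / mib
                + " MiB, tenured = " + tenuredBytes(heap, 3) / mib + " MiB");
    }
}
```

Because NewRatio is an integer, it only allows sizes of 1/2, 1/3, 1/4 and so on of the heap, which is a coarse set of choices.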
The parameters NewSize and MaxNewSize bound the young generation size from below and above. Setting these to the same value fixes the young generation, just as setting -Xms and -Xmx to the same value fixes the total heap size. This is useful for tuning the young generation at a finer granularity than the integral multiples allowed by NewRatio.
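The parameters NewSize and MaxNewSize set the lower and upper bounds of the young generation size. Setting both to the same value fixes the young generation at that size.
Default values:
-XX:NewRatio=2
-XX:NewSize=1310k
-XX:MaxNewSize: not fixed; depends on system capacity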
Survivor Space Sizing
You can use the parameter SurvivorRatio to tune the size of the survivor spaces, but this is often not important for performance. For example, -XX:SurvivorRatio=6 sets the ratio between each survivor space and eden to 1:6. In other words, each survivor space will be one-sixth the size of eden, and thus one-eighth the size of the young generation (not one-seventh, because there are two survivor spaces).
That is: SurvivorRatio controls the ratio between eden and a survivor space, although it usually has little effect on performance. With a value of 6, eden is six times the size of one survivor space, a ratio of 6:1 (note that in all of these ratio settings, the value you set is the denominator). Because there are two survivor spaces of equal size, each survivor space takes up 1/8 of the young generation.
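The one-eighth figure follows from splitting the young generation into SurvivorRatio + 2 equal parts: eden gets SurvivorRatio of them and each survivor space gets one. A minimal sketch of that arithmetic, ignoring VM rounding and using made-up names:

```java
class SurvivorRatioSketch {
    // young = eden + 2 * survivor and eden = ratio * survivor,
    // so survivor = young / (ratio + 2).
    static long survivorBytes(long youngBytes, int survivorRatio) {
        if (youngBytes < 0 || survivorRatio < 1) {
            throw new IllegalArgumentException("young must be >= 0 and ratio >= 1");
        }
        return youngBytes / (survivorRatio + 2);
    }

    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 800; // hypothetical young generation of 800 units
        // With -XX:SurvivorRatio=6: prints "eden = 600, each survivor = 100"
        System.out.println("eden = " + edenBytes(young, 6)
                + ", each survivor = " + survivorBytes(young, 6));
    }
}
```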
If survivor spaces are too small, copying collection overflows directly into the tenured generation. If survivor spaces are too large, they will be uselessly empty. At each garbage collection, the virtual machine chooses a threshold number, which is the number of times an object can be copied before it is tenured. This threshold is chosen to keep the survivors half full. The command line option -XX:+PrintTenuringDistribution (not available on all garbage collectors) can be used to show this threshold and the ages of objects in the new generation. It is also useful for observing the lifetime distribution of an application.
In other words: if the survivor spaces are too small, the VM copies objects straight into the tenured generation; if they are too large, the space is wasted. At each collection the VM chooses a threshold, which is the number of times an object may be copied (back and forth between the two survivor spaces) before it is tenured. The threshold is chosen so that the survivor space stays half full. The command-line option -XX:+PrintTenuringDistribution (not available on every collector) shows this threshold and the ages of the objects in the young generation.
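A threshold that keeps the survivors about half full can be sketched as follows: add up the surviving bytes age by age and stop at the first age where the running total exceeds the target occupancy. This is a simplified model, not the VM's actual code. The parameters `targetSurvivorPct` and `maxThreshold` are assumptions that mirror the HotSpot options -XX:TargetSurvivorRatio and -XX:MaxTenuringThreshold, which the text above does not mention.

```java
class TenuringSketch {
    // bytesByAge[i] holds the bytes of surviving objects whose age is i + 1.
    static int threshold(long[] bytesByAge, long survivorCapacity,
                         int targetSurvivorPct, int maxThreshold) {
        long desired = survivorCapacity * targetSurvivorPct / 100;
        long total = 0;
        int age = 1;
        while (age <= bytesByAge.length) {
            total += bytesByAge[age - 1];
            if (total > desired) {
                break;           // objects of this age no longer fit the target
            }
            age++;
        }
        return Math.min(age, maxThreshold);
    }

    public static void main(String[] args) {
        // Hypothetical age table: 300 bytes at age 1, 200 at age 2, 100 at age 3.
        long[] ages = {300, 200, 100};
        // Survivor capacity 1000 and a 50% target give a desired occupancy of 500.
        // Running totals are 300, 500, 600; the total first exceeds 500 at age 3.
        System.out.println("threshold = " + threshold(ages, 1000, 50, 15)); // prints "threshold = 3"
    }
}
```

When many objects survive their first collection, the running total exceeds the target early and the threshold drops, so objects are tenured sooner.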