http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html

MONDAY, JANUARY 25, 2016 AT 8:56AM

This is a guest post by Benjamin Manes, who did engineery things for Google and is now doing engineery things for a new load documentation startup, LoadDocs.

Caching is a common approach for improving performance, yet most implementations use strictly classical techniques. In this article we will explore the modern methods used by Caffeine, an open-source Java caching library, that yield high hit rates and excellent concurrency. These ideas can be translated to your favorite language and hopefully some readers will be inspired to do just that.

Eviction Policy

A cache’s eviction policy tries to predict which entries are most likely to be used again in the near future, thereby maximizing the hit ratio. The Least Recently Used (LRU) policy is perhaps the most popular due to its simplicity, good runtime performance, and a decent hit rate in common workloads. Its ability to predict the future is limited to the history of the entries residing in the cache, giving the most recently accessed entry the highest priority on the guess that it is the most likely to be reused soon.
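
As a concrete baseline, the classic policy can be sketched in a few lines using only the JDK; this illustrates LRU itself, not how Caffeine is implemented, and the names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache on top of the JDK's LinkedHashMap. With
// accessOrder=true the iteration order is the access order, so the
// eldest entry is always the least recently used one.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
  private final int maximumSize;

  LruCache(int maximumSize) {
    super(16, 0.75f, /* accessOrder */ true);
    this.maximumSize = maximumSize;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Evict the least recently used entry once capacity is exceeded.
    return size() > maximumSize;
  }
}
```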

Modern caches extend the usage history to include the recent past and give preference to entries based on recency and frequency. One approach for retaining history is to use a popularity sketch (a compact, probabilistic data structure) to identify the “heavy hitters” in a large stream of events. Take for example Count-Min Sketch, which uses a matrix of counters and multiple hash functions. Adding an entry increments one counter in each row, and its frequency is estimated by taking the minimum value observed. This approach lets us trade off between space, efficiency, and the error rate due to collisions by adjusting the matrix’s width and depth.
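
To make this concrete, here is a toy Count-Min sketch in Java. The depth, width, seeding, and hash mixing are arbitrary illustrative choices; a real implementation is far more compact, typically packing small counters into machine words.

```java
import java.util.Random;

// A toy Count-Min sketch: a matrix of counters where each row uses a
// different hash function. An addition increments one counter per row;
// the estimate is the minimum across rows, so hash collisions can only
// cause over-counting, never under-counting.
final class CountMinSketch {
  private final long[][] table; // depth rows x width counters
  private final int[] seeds;

  CountMinSketch(int depth, int width) {
    table = new long[depth][width];
    seeds = new int[depth];
    Random random = new Random(1033096058);
    for (int i = 0; i < depth; i++) {
      seeds[i] = random.nextInt() | 1; // odd multiplier for mixing
    }
  }

  void increment(Object item) {
    for (int row = 0; row < seeds.length; row++) {
      table[row][indexOf(item, row)]++;
    }
  }

  long estimate(Object item) {
    long min = Long.MAX_VALUE;
    for (int row = 0; row < seeds.length; row++) {
      min = Math.min(min, table[row][indexOf(item, row)]);
    }
    return min;
  }

  // Halves every counter; used by the aging step described below.
  void age() {
    for (long[] row : table) {
      for (int i = 0; i < row.length; i++) {
        row[i] >>>= 1;
      }
    }
  }

  private int indexOf(Object item, int row) {
    int hash = item.hashCode() * seeds[row];
    hash ^= hash >>> 16; // spread the high bits
    return Math.floorMod(hash, table[row].length);
  }
}
```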

Window TinyLFU (W-TinyLFU) uses the sketch as a filter, admitting a new entry if it has a higher frequency than the entry that would have to be evicted to make room for it. Instead of filtering immediately, an admission window gives an entry a chance to build up its popularity. This avoids consecutive misses, especially in cases like sparse bursts where an entry may not yet be deemed suitable for long-term retention. To keep the history fresh, an aging process is performed periodically or incrementally to halve all of the counters.
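
A minimal sketch of the admission decision, assuming the toy CountMinSketch above; the class and method names are hypothetical, and the victim is whichever entry the eviction policy would otherwise discard.

```java
// TinyLFU admission: when the cache is full, a candidate leaving the
// admission window is admitted to the main region only if its
// estimated frequency beats that of the policy's eviction victim.
final class TinyLfuAdmittor {
  private final CountMinSketch sketch;

  TinyLfuAdmittor(CountMinSketch sketch) {
    this.sketch = sketch;
  }

  void recordAccess(Object key) {
    sketch.increment(key);
  }

  boolean admit(Object candidateKey, Object victimKey) {
    // Keep whichever entry the access history says is more popular.
    return sketch.estimate(candidateKey) > sketch.estimate(victimKey);
  }
}
```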

W-TinyLFU uses the Segmented LRU (SLRU) policy for long-term retention. An entry starts in the probationary segment and, on a subsequent access, is promoted to the protected segment (capped at 80% of capacity). When the protected segment is full it evicts back into the probationary segment, which may in turn cause a probationary entry to be discarded. This ensures that entries with a small reuse interval (the hottest) are retained and those that are less often reused (the coldest) become eligible for eviction.
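
A simplified, single-threaded SLRU sketch with illustrative names; the 80% cap on the protected segment follows the text.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Segmented LRU: new entries land in the probationary segment and a
// second access promotes them. A protected-segment overflow demotes
// its LRU entry back to probation, whose own LRU entry is the
// cache-wide eviction candidate.
final class SegmentedLru<K, V> {
  private final int protectedCapacity;
  private final LinkedHashMap<K, V> probation = new LinkedHashMap<>(16, 0.75f, true);
  private final LinkedHashMap<K, V> guarded = new LinkedHashMap<>(16, 0.75f, true);

  SegmentedLru(int capacity) {
    this.protectedCapacity = (int) (capacity * 0.8); // 80% cap
  }

  void add(K key, V value) {
    probation.put(key, value);
  }

  V get(K key) {
    V value = guarded.get(key);
    if (value != null) {
      return value; // already protected; access order updated by the map
    }
    value = probation.remove(key);
    if (value != null) {
      promote(key, value);
    }
    return value;
  }

  private void promote(K key, V value) {
    guarded.put(key, value);
    if (guarded.size() > protectedCapacity) {
      // Demote the protected segment's least recently used entry.
      Iterator<Map.Entry<K, V>> it = guarded.entrySet().iterator();
      Map.Entry<K, V> eldest = it.next();
      K demotedKey = eldest.getKey();
      V demotedValue = eldest.getValue();
      it.remove();
      probation.put(demotedKey, demotedValue);
    }
  }

  // The probationary LRU entry is the coldest entry overall.
  K evictionCandidate() {
    return probation.isEmpty() ? null : probation.keySet().iterator().next();
  }
}
```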

As database and search traces show, there is a lot of opportunity to improve upon LRU by taking recency and frequency into account. More advanced policies such as ARC, LIRS, and W-TinyLFU narrow the gap to provide a near optimal hit rate. For additional workloads see the research papers, and try our simulator if you have your own traces to experiment with.

Expiration Policy

Expiration is often implemented as a variable duration per entry, with expired entries evicted lazily when the capacity constraint demands it. This pollutes the cache with dead items, so sometimes a scavenger thread is used to periodically sweep the cache and reclaim the space. This strategy tends to work better than ordering entries by their expiration time in an O(lg n) priority queue, because it hides the cost from the user instead of incurring a penalty on every read or write operation.

Caffeine takes a different approach by observing that a fixed duration is most often preferred. This constraint allows entries to be organized in O(1) time-ordered queues: a time-to-live duration maps to a write-order queue and a time-to-idle duration to an access-order queue. The cache can reuse the eviction policy’s queues and the concurrency mechanism described below, so that expired entries are discarded during the cache’s maintenance phase.
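
A sketch of the fixed-duration insight: when every entry shares one time-to-live, write order is also expiration order, so a plain FIFO queue suffices and expiration is O(1) per entry. Names are illustrative, and a real cache would also re-position an entry on update.

```java
import java.util.ArrayDeque;

// With a fixed time-to-live, entries written later always expire
// later, so expired entries are always at the head of a write-order
// FIFO queue. No priority queue or per-entry timers are needed.
final class WriteOrderExpiry<K> {
  private static final class Node<E> {
    final E key;
    final long writeTimeNanos;

    Node(E key, long writeTimeNanos) {
      this.key = key;
      this.writeTimeNanos = writeTimeNanos;
    }
  }

  private final ArrayDeque<Node<K>> queue = new ArrayDeque<>();
  private final long ttlNanos;

  WriteOrderExpiry(long ttlNanos) {
    this.ttlNanos = ttlNanos;
  }

  void recordWrite(K key, long nowNanos) {
    queue.addLast(new Node<>(key, nowNanos));
  }

  // Called during the maintenance phase: returns the next expired key,
  // or null once the head entry is still fresh.
  K pollExpired(long nowNanos) {
    Node<K> head = queue.peekFirst();
    if (head != null && nowNanos - head.writeTimeNanos >= ttlNanos) {
      queue.pollFirst();
      return head.key;
    }
    return null;
  }
}
```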

Concurrency

Concurrent access to a cache is viewed as a difficult problem because in most policies every access is a write to some shared state. The traditional solution is to guard the cache with a single lock. This might then be improved through lock striping, splitting the cache into many smaller independent regions. Unfortunately that tends to have limited benefit because hot entries cause some locks to be more contended than others. When contention becomes a bottleneck the next classic step has been to update only per-entry metadata and use either a random sampling or a FIFO-based eviction policy. Those techniques can have great read performance, but poor write performance and difficulty in choosing a good victim.

An alternative is to borrow an idea from database theory where writes are scaled by using a commit log. Instead of mutating the data structures immediately, the updates are written to a log and replayed in asynchronous batches. This same idea can be applied to a cache by performing the hash table operation, recording the operation to a buffer, and scheduling the replay activity against the policy when deemed necessary. The policy is still guarded by a lock, or a try lock to be more precise, but shifts contention onto appending to the log buffers instead.
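
A minimal sketch of this buffer-and-replay pattern, with hypothetical names. The try-lock means a thread that loses the race simply skips the maintenance work instead of blocking:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// The commit-log idea applied to a cache: record each policy update
// cheaply, then let one thread at a time replay the batch against the
// (non-thread-safe) policy structures under a try-lock.
final class BufferedPolicy<K> {
  private final Queue<Runnable> buffer = new ConcurrentLinkedQueue<>();
  private final ReentrantLock evictionLock = new ReentrantLock();

  void afterRead(K key) {
    buffer.offer(() -> applyRead(key));
    tryToDrain();
  }

  private void tryToDrain() {
    // If another thread holds the lock it is already draining, so this
    // thread can move on rather than wait.
    if (evictionLock.tryLock()) {
      try {
        Runnable task;
        while ((task = buffer.poll()) != null) {
          task.run();
        }
      } finally {
        evictionLock.unlock();
      }
    }
  }

  private void applyRead(K key) {
    // Reorder the entry in the policy's structures (LRU list, sketch
    // increment, etc.). Only ever called while holding evictionLock.
  }
}
```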

In Caffeine separate buffers are used for cache reads and writes. An access is recorded into a striped ring buffer, where the stripe is chosen by a thread-specific hash and the number of stripes grows when contention is detected. When a ring buffer is full an asynchronous drain is scheduled, and subsequent additions to that buffer are discarded until space becomes available. When an access is not recorded due to a full buffer the cached value is still returned to the caller. This loss of policy information does not have a meaningful impact because W-TinyLFU is able to identify the hot entries that we wish to retain. By using a thread-specific hash instead of the key’s hash, the cache spreads the load more evenly and keeps popular entries from causing contention.
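
A sketch of one lossy ring-buffer stripe, under the assumptions that a single thread drains at a time (enforced elsewhere by the eviction lock) and that dropped events are acceptable. Caffeine's real buffers also pad against false sharing, which this version omits:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Consumer;

// A single lossy ring-buffer stripe: when full, the access event is
// simply dropped rather than blocking the reader.
final class LossyRingBuffer<E> {
  private static final int SIZE = 16; // must be a power of two
  private static final int MASK = SIZE - 1;

  private final AtomicReferenceArray<E> buffer = new AtomicReferenceArray<>(SIZE);
  private final AtomicLong writeCounter = new AtomicLong();
  private final AtomicLong readCounter = new AtomicLong(); // advanced only by the drainer

  /** Records an event; returns false when full and the event is dropped. */
  boolean offer(E event) {
    long head = readCounter.get();
    long tail = writeCounter.get();
    if (tail - head >= SIZE) {
      return false; // full: drop the event, the cached value is still returned
    }
    if (writeCounter.compareAndSet(tail, tail + 1)) {
      buffer.lazySet((int) (tail & MASK), event);
      return true;
    }
    return false; // lost the slot race; also treated as a drop
  }

  /** Drains pending events; must be called by one thread at a time. */
  void drainTo(Consumer<E> consumer) {
    long head = readCounter.get();
    long tail = writeCounter.get();
    while (head < tail) {
      int index = (int) (head & MASK);
      E event = buffer.get(index);
      if (event == null) {
        break; // a producer reserved this slot but hasn't published yet
      }
      buffer.lazySet(index, null);
      consumer.accept(event);
      head++;
    }
    readCounter.lazySet(head);
  }
}
```

A full implementation keeps an array of such stripes, picks one by hashing the current thread, and grows the array when CAS failures suggest contention.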

In the case of a write, a more traditional concurrent queue is used and every change schedules an immediate drain. Because losing a write is unacceptable this buffer cannot be lossy, but there are still ways to optimize it. Both types of buffers are written to by multiple threads but consumed by only a single one at a given time. This multiple producer / single consumer behavior allows for simpler, more efficient algorithms to be employed.

The buffers and fine-grained writes introduce a race condition where operations for an entry may be recorded out of order. An insertion, read, update, and removal can be replayed in any order, and if improperly handled the policy could retain dangling references. The solution is a state machine defining the lifecycle of an entry.
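
A sketch of such a lifecycle. Caffeine's design notes describe alive, retired, and dead states; the guard below illustrates how a replayed event for an already-removed entry is ignored rather than resurrecting it:

```java
// Entry lifecycle used to make out-of-order replay safe.
final class Lifecycle {
  enum State {
    ALIVE,   // in the hash table and the policy's structures
    RETIRED, // removed from the hash table, not yet from the policy
    DEAD     // removed from both; buffered events for it are ignored
  }

  static final class Entry<K> {
    final K key;
    volatile State state = State.ALIVE;

    Entry(K key) {
      this.key = key;
    }
  }

  // Replaying a buffered read: only a live entry is reordered. A read
  // that raced with a removal is dropped, so the policy never retains
  // a dangling reference.
  static <K> void replayRead(Entry<K> entry) {
    if (entry.state == State.ALIVE) {
      // policy.recordAccess(entry); // e.g. reorder in the LRU list
    }
  }
}
```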

In benchmarks the buffers are relatively cheap and their cost scales with the underlying hash table. Reads scale linearly with the number of CPUs at about 33% of the hash table’s throughput. Writes have a 10% penalty, but only because contention when updating the hash table is the dominant cost.

Conclusion

There are many pragmatic topics that have not been covered, including tricks to minimize memory overhead, testing techniques to retain quality as complexity grows, and ways to analyze performance to determine whether an optimization is worthwhile. These are areas that practitioners must keep an eye on, because once neglected it can be difficult to restore confidence in one’s ability to manage the ensuing complexity.

The design and implementation of Caffeine is the result of numerous insights and the hard work of many contributors. Its evolution over the years wouldn’t have been possible without the help from the following people: Charles Fry, Adam Zell, Gil Einziger, Roy Friedman, Kevin Bourrillion, Bob Lee, Doug Lea, Josh Bloch, Bob Lane, Nitsan Wakart, Thomas Müeller, Dominic Tootell, Louis Wasserman, and Vladimir Blagojevic. Thanks to Nitsan Wakart, Adam Zell, Roy Friedman, and Will Chu for their feedback on drafts of this article.
