mcrouter: Facebook's open-source, enterprise-grade memcached proxy
Original article: https://code.facebook.com/posts/296442737213493/introducing-mcrouter-a-memcached-protocol-router-for-scaling-memcached-deployments/
Source code: https://github.com/facebook/mcrouter
Most web-based services begin as a collection of front-end application servers paired with databases used to manage data storage. As they grow, the databases are augmented with caches to store frequently-read pieces of data and improve site performance. Often, the ability to quickly access data moves from being an optimization to a requirement for a site. This evolution of cache from neat optimization to necessity is a common path that has been followed by many large web scale companies, including Facebook, Twitter[1], Instagram, Reddit, and many others.
Last year, at the Data@Scale event and at the USENIX Networked Systems Design and Implementation conference, we spoke about turning caches into distributed systems using software we developed called mcrouter (pronounced “mick-router”). Mcrouter is a memcached protocol router that is used at Facebook to handle all traffic to, from, and between thousands of cache servers across dozens of clusters distributed in our data centers around the world. It is proven at massive scale: at peak, mcrouter handles close to 5 billion requests per second. Mcrouter was also proven to work as a standalone binary in an Amazon Web Services setup when Instagram used it last year before fully transitioning to Facebook's infrastructure.
Today, we are excited to announce that we are releasing mcrouter’s code under an open-source BSD license. We believe it will help many sites scale more easily by leveraging Facebook’s knowledge about large-scale systems in an easy-to-understand and easy-to-deploy package.
Features
Since any client that wants to talk to memcached can already speak the standard ASCII memcached protocol, we use that as the common API and enter the picture silently. To a client, mcrouter looks like a memcached server. To a server, mcrouter looks like a normal memcached client. But mcrouter's feature-rich configurability makes it more than a simple proxy.
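Because mcrouter speaks the same ASCII protocol as memcached, the bytes a client sends do not change when mcrouter is placed in front of the cache hosts. The following is a minimal sketch of what those bytes look like for a `set` and a `get`, using the standard memcached text protocol. The helper names (`encode_set`, `encode_get`, `parse_get_reply`) are hypothetical and written for illustration only; they are not part of mcrouter or of any client library.

```python
def encode_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached ASCII 'set' command: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key: str) -> bytes:
    """Build a memcached ASCII 'get' command for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_reply(reply: bytes) -> dict:
    """Parse a complete reply to 'get' into a {key: value} dict.

    A hit looks like 'VALUE <key> <flags> <bytes>' followed by the data
    block; the reply always ends with an 'END' line.
    """
    values = {}
    pos = 0
    while True:
        line_end = reply.index(b"\r\n", pos)
        line = reply[pos:line_end]
        if line == b"END":
            return values
        tag, key, _flags, size = line.split(b" ")[:4]
        if tag != b"VALUE":
            raise ValueError(f"unexpected line: {line!r}")
        start = line_end + 2
        length = int(size)
        values[key.decode("ascii")] = reply[start:start + length]
        pos = start + length + 2  # skip the data block and its trailing CRLF
```

A client would write these bytes to a TCP connection opened against whatever address mcrouter listens on in a given deployment, exactly as it would against a memcached host.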
Some features of mcrouter are listed below. In the following, a “destination” is a memcached host (or some other cache service that understands the memcached protocol) and “pool” is a set of destinations configured for some workload — e.g., a sharded pool with a specified hashing function, or a replicated pool with multiple copies of the data on separate hosts. Finally, pools can be organized into multiple clusters.
- Standard open source memcached ASCII protocol support: Any client that can talk the memcached protocol can already talk to mcrouter; no changes are needed. Mcrouter can simply be dropped in between clients and memcached boxes to take advantage of its functionality.
- Connection pooling: Multiple clients can connect to a single mcrouter instance and share the outgoing connections, reducing the number of open connections to memcached instances.
- Multiple hashing schemes: Mcrouter provides a proven consistent hashing algorithm (furc_hash) that allows distribution of keys across many memcached instances. Hostname hashing is useful for selecting a unique replica per client. There are a number of other hashes useful in specialized applications.
- Prefix routing: Mcrouter can route keys according to common key prefixes. For example, you can send all keys starting with “foo” to one pool, “bar” prefix to another pool, and everything else to a “wildcard” pool. This is a simple way to separate different workloads.
- Replicated pools: A replicated pool has the same data on multiple hosts. Writes are replicated to all hosts in the pool, while reads are routed to a single replica chosen separately for each client. This could be done either due to per-host packet limitations where a sharded pool would not be able to handle the read rate; or for increased availability of the data (one replica going down doesn't affect availability due to automatic failover).
- Production traffic shadowing: When testing new cache hardware, we found it extremely useful to be able to route a complete copy of production traffic from clients. Mcrouter supports flexible shadowing configuration. It's possible to shadow test a different pool size (re-hashing the key space), shadow only a fraction of the key space, or vary shadowing settings dynamically at runtime.
- Online reconfiguration: Mcrouter monitors its configuration files and automatically reloads them on any file change; this loading and parsing is done on a background thread and new requests are routed according to the new configuration as soon as it's ready. There's no extra latency from the client's point of view.
- Flexible routing: Configuration is specified as a graph of small routing modules called “route handles,” which share a common interface (route a request and return a reply) and which can be composed freely. Route handles are easy to understand, create, and test individually, allowing for arbitrarily complex logic when used together. For example, an “all-sync” route handle will be set up with multiple child route handles (which themselves could be arbitrary route handles). It will pass a request to all of its children and wait for all of the replies to come back before returning one of these replies. Other examples include, among many others, “all-async” (send to all but don't wait for replies), “all-majority” (for consensus polling), and “failover” (send to every child in order until a non-error reply is returned). Expanding a pool can be done quickly by using a “cold cache warmup” route handle on the pool (with the old set of servers as the warm pool). Moving this handle up the stack will allow for an entire cluster to be warmed up from a warm cluster.
- Destination health monitoring and automatic failover: Mcrouter keeps track of the health status of each destination. If mcrouter marks a destination as unresponsive, it will fail over incoming requests to an alternate destination automatically (fast failover) without attempting to send them to the original destination. At the same time health check requests will be sent in the background, and as soon as a health check is successful, mcrouter will revert to using the original destination. We distinguish between “soft errors” (e.g., data timeouts) that are allowed to happen a few times in a row and “hard errors” (e.g., connection refused) that cause a host to be marked unresponsive immediately. Needless to say, all of this is completely transparent to the client.
- Cold cache warm up: Mcrouter can smooth the performance impact of starting a brand new empty cache host or set of hosts (as large as an entire cluster) by automatically refilling it from a designated “warm” cache.
- Broadcast operations: By adding a special prefix to a key in a request, it's easy to replicate the same request into multiple pools and/or clusters.
- Reliable delete stream: In a demand-filled look-aside cache, it's important to ensure all deletes are eventually delivered to guarantee consistency. Mcrouter supports logging delete commands to disk in cases when the destination is not accessible (due to a network outage or other failure). A separate process then replays those deletes asynchronously. This is done transparently to the client — the original delete command is always reported as successful.
- Multi-cluster support: Configuration management for large multi-cluster setups is easy. A single config can be distributed to all clusters and, depending on command line options, mcrouter will interpret the config based on its location.
- Rich stats and debug commands: Mcrouter exports many internal counters (via a “stats” command; also to a JSON file on disk). Introspection debug commands are also available, which can answer questions like “Which host would a particular request go to?” at runtime.
- Quality of service: Mcrouter allows throttling the rate of any type of request (e.g., get/set/delete) at any level (per-host, per-pool, per-cluster), rejecting requests over a specified limit. We also support rate limiting requests to slow delivery.
- Large values: Mcrouter can automatically split/re-stitch large values that would not normally fit in a memcached slab.
- Multi-level caches: Mcrouter supports local/remote cache setup, where values would be looked up locally first and automatically set in a local cache from remote after fetching.
- IPv6 support: We have strong support internally for IPv6 at Facebook, so mcrouter is IPv6 compatible out of the box.
- SSL support: Mcrouter supports SSL connections (incoming or outgoing), as long as the client or the destination hosts support it as well. It is also possible to set up multiple mcrouters in series, in which case the middle connection between mcrouters can be over SSL out of the box.
- Multi-threaded architecture: Mcrouter can take full advantage of multicore systems by starting one thread per core.
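The composability of route handles described in the list above can be illustrated with a small model. This is a sketch in Python, not mcrouter's C++ implementation: all class names are hypothetical, a request is reduced to a `get` on one key, a destination is an in-memory dict, and a failed host is modeled by raising `ConnectionError`. The choice to return the first reply from the all-sync handle is also an assumption; the article only says that one of the replies is returned.

```python
from typing import Dict, List, Optional


class RouteHandle:
    """Common interface: route a request (here a 'get' on one key), return a reply."""

    def route(self, key: str) -> Optional[str]:
        raise NotImplementedError


class Destination(RouteHandle):
    """Leaf handle: a dict standing in for one cache host. None means a miss."""

    def __init__(self, data: Dict[str, str], up: bool = True) -> None:
        self.data = data
        self.up = up

    def route(self, key: str) -> Optional[str]:
        if not self.up:
            raise ConnectionError("destination unavailable")
        return self.data.get(key)


class FailoverRoute(RouteHandle):
    """Try each child in order until one returns a non-error reply."""

    def __init__(self, children: List[RouteHandle]) -> None:
        self.children = children

    def route(self, key: str) -> Optional[str]:
        last_error: Exception = ConnectionError("no children configured")
        for child in self.children:
            try:
                return child.route(key)
            except ConnectionError as err:
                last_error = err
        raise last_error


class AllSyncRoute(RouteHandle):
    """Send to every child, wait for all replies, return one (here: the first)."""

    def __init__(self, children: List[RouteHandle]) -> None:
        self.children = children  # assumed non-empty

    def route(self, key: str) -> Optional[str]:
        replies = [child.route(key) for child in self.children]
        return replies[0]


class PrefixRoute(RouteHandle):
    """Pick a child by key prefix; unmatched keys go to the wildcard child."""

    def __init__(self, by_prefix: Dict[str, RouteHandle], wildcard: RouteHandle) -> None:
        self.by_prefix = by_prefix
        self.wildcard = wildcard

    def route(self, key: str) -> Optional[str]:
        for prefix, handle in self.by_prefix.items():
            if key.startswith(prefix):
                return handle.route(key)
        return self.wildcard.route(key)


# Compose a small routing graph: "foo" keys fail over from a dead host to a
# healthy one, everything else goes to a separate wildcard destination.
healthy = Destination({"foo:1": "a"})
dead = Destination({}, up=False)
other = Destination({"bar:1": "b"})
root = PrefixRoute({"foo": FailoverRoute([dead, healthy])}, wildcard=other)
```

Because every handle exposes the same `route` method, any handle can be a child of any other, which is what lets simple pieces combine into arbitrarily complex routing logic.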
Implementation
Mcrouter is written mostly in C++ (with heavy use of C++11 features), with some library code written in C and protocol parsing code written in Ragel. It uses Facebook's open source libraries Folly and fbthrift (for async networking code).
A mcrouter process starts up multiple independent threads, each running an event loop that processes all network events asynchronously (using libevent). Once each request or reply is parsed, it's processed inside its own lightweight thread or “fiber”; we have a custom fiber library implementation built on top of boost::context.
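The threading model described above (independent threads, one event loop per thread, one lightweight task per request) can be approximated in a few lines. This is only a rough analogy under stated assumptions: Python's `asyncio` coroutines stand in for fibers and are not equivalent to mcrouter's boost::context-based fiber library, and the function names and the round-robin split of requests across threads are hypothetical.

```python
import asyncio
import threading
from typing import Dict, List


async def handle_request(worker_id: int, request: str) -> str:
    """One lightweight task per request; a coroutine plays the role of a fiber."""
    await asyncio.sleep(0)  # yield to the event loop, as a fiber would while waiting on I/O
    return f"worker{worker_id}:{request}"


async def worker_main(worker_id: int, requests: List[str]) -> List[str]:
    """Event loop body for one thread: start a task per request, collect replies in order."""
    tasks = [asyncio.create_task(handle_request(worker_id, r)) for r in requests]
    return await asyncio.gather(*tasks)


def run_worker(worker_id: int, requests: List[str], results: Dict[int, List[str]]) -> None:
    # asyncio.run creates a fresh event loop owned by this thread only.
    results[worker_id] = asyncio.run(worker_main(worker_id, requests))


def run_proxy(num_threads: int, requests: List[str]) -> Dict[int, List[str]]:
    """Start independent threads, each with its own event loop and share of requests."""
    results: Dict[int, List[str]] = {}
    threads = []
    for i in range(num_threads):
        share = requests[i::num_threads]  # hypothetical round-robin split
        thread = threading.Thread(target=run_worker, args=(i, share, results))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The point of the design is that threads share nothing on the request path, so adding cores adds capacity without cross-thread coordination, while fibers keep per-request logic sequential and readable despite the asynchronous I/O underneath.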
Mcrouter configuration is written in JSON format and allows specifying an arbitrary route handle scheme to easily adapt to any routing task. We have presented some common use cases in depth on our wiki.
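As a rough illustration of such a JSON configuration, the sketch below builds a minimal single-pool config and serializes it. The pool name `A` and the server addresses are placeholders. The `pools`, `servers`, and `route` keys and the `PoolRoute|A` shorthand follow the minimal example in the project's README as best understood here, so they should be checked against the wiki before use.

```python
import json

# Placeholder pool name and addresses; key names are assumed from the
# project's minimal README example and should be verified against the wiki.
config = {
    "pools": {
        "A": {
            "servers": ["127.0.0.1:5001", "127.0.0.1:5002"],
        }
    },
    # Route every request to pool A.
    "route": "PoolRoute|A",
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

Richer setups replace the `route` string with a nested object describing a graph of route handles, which is where features such as prefix routing and failover are expressed.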
What's next
We invite software engineers using memcached everywhere to evaluate mcrouter and see if it helps to simplify the site administration while providing the new capabilities listed above (shadow testing, cold cache warmup, and so on). Instagram used mcrouter for the last year, before transitioning to Facebook's infrastructure, so mcrouter is proven in an Amazon Web Services setup. Prior to open sourcing, we partnered with Reddit for a limited beta test, and they are currently running mcrouter in production for some of their caches.
We would also love to see patches come back that will make mcrouter more helpful to you and to others in the memcached community.
Mcrouter source code has been open sourced at https://github.com/facebook/mcrouter. We're always looking for ways to improve mcrouter's performance, fix bugs, and add new features. We will continuously update the external GitHub repo with our internal changes, so you can benefit from this work as well. We maintain mcrouter documentation on the GitHub wiki. We have also set up a Facebook discussion group.
Footnotes:
[1] https://blog.twitter.com/2012/twemproxy