Facebook MyRocks at MariaDB

Recently my colleague Rasmus Johansson announced that MariaDB is adding support for the Facebook MyRocks storage engine. Today I’m going to share a bit more on what that means for MariaDB users. Members of the Facebook Database Engineering team helped us answer some questions we think our community will have about MyRocks.

Benefits of MariaDB Server’s Extensible Architecture

Before discussing the specifics of MyRocks, new readers may benefit from a description of the MariaDB Server architecture, which is extensible at every layer, including the storage layer. This means users and the community can add functionality to meet unique needs. Community contributions are one of MariaDB’s greatest advantages over other databases, and a big reason we have become the fastest growing open source database in the marketplace.

Openness in the storage layer is especially important because matching the storage engine to the use case allows better performance optimization. Both MySQL and MariaDB support InnoDB, a well-known, general-purpose storage engine. But InnoDB is not suited to every use case, so the MariaDB engineering team is extending support for additional storage engines, including Facebook’s MyRocks for workloads requiring greater compression and IO efficiency, and MariaDB ColumnStore (currently in beta), which will provide faster time-to-insight with massively parallel processing (MPP).

Facebook MyRocks for MariaDB

When we searched for a storage engine that could deliver greater performance for web-scale applications, MyRocks was an obvious choice because of its superior data compression and IO efficiency. In addition, its log-structured merge-tree (LSM) architecture allows for very efficient data ingestion, enabling features such as read-free replication slaves and fast bulk data loading.
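To make the ingestion point concrete, here is a toy sketch of the LSM idea in Python. It is not RocksDB's or MyRocks' implementation; the class name `TinyLSM`, the `memtable_limit` parameter and the in-memory "runs" are assumptions made purely for illustration. It only aims to show why a write never has to read or modify existing data, while a read may have to consult several sorted runs.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store: a memtable plus sorted, immutable runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # newest writes, mutable
        self.runs = []                        # sorted runs, newest first

    def put(self, key, value):
        # A write only touches the memtable; nothing on "disk" is read or rewritten.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable run.
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in reversed(self.runs):  # oldest first, so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)  # memtable is full, so it is flushed as a sorted run
db.put("a", 3)  # the newer value for "a" shadows the flushed one
```

In this sketch, stale values are only discarded during `compact()`, which is where a real LSM engine spends its background IO instead of paying for in-place updates on every write.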

As we add support for new storage engines, many of our current users may ask, “What happens to MariaDB’s support for InnoDB? Do I have to migrate?” Of course not! We have no plans to abandon InnoDB. InnoDB is a proven storage engine and we expect MariaDB users to continue using it. But we do expect that deployments that need the highest possible efficiency will opt for MyRocks because of its performance gains and IO efficiency. Over time, as MyRocks matures, we expect it to become appropriate for even more use cases.

The first MariaDB version of MyRocks will be available in a release candidate of MariaDB Server 10.2 coming this winter. Our goal is for MyRocks to work with all MariaDB features, but some of them, like optimistic parallel replication, may not work in the first release. MariaDB is an open source project that follows the "release often, release early" approach, so our goal is to first make a release that meets core requirements, and then add support for special cases in subsequent releases.

Now let’s move on to my discussion with Facebook’s Database Engineering team!

Can you tell us a bit about the history of RocksDB at Facebook?

In 2012, we started to build an embedded storage engine optimized for flash-based SSDs by forking LevelDB. The fork became RocksDB, which was open-sourced in November 2013 [1]. After RocksDB proved to be an effective persistent key-value store for SSDs, we enhanced it for other platforms. We improved its performance on DRAM in 2014 and on hard drives in 2015; both platforms now have production use cases.

Over the past few years, we've introduced numerous features and improvements. To name a few, we built the compaction filter and merge operator in 2013, backup and column families in 2014, transactions and bulk loading in 2015, and the persistent cache in 2016. See the list of features that are not in LevelDB: https://github.com/facebook/rocksdb/wiki/Features-Not-in-LevelDB

Early RocksDB adopters at Facebook, such as the distributed key-value store ZippyDB [2], Laser [2] and Dragon [3], went into production in early 2013. Since then, more new and existing services at Facebook have started to use RocksDB every year. RocksDB is now used in a number of services across multiple hardware platforms at Facebook.

[1] https://code.facebook.com/posts/666746063357648/under-the-hood-building-and-open-sourcing-rocksdb/ and http://rocksdb.blogspot.com/2013/11/the-history-of-rocksdb.html
[2] https://research.facebook.com/publications/realtime-data-processing-at-facebook/
[3] https://code.facebook.com/posts/1737605303120405/dragon-a-distributed-graph-query-engine/

Why did Facebook go down the RocksDB path for MySQL?

MySQL is a popular storage solution at Facebook because we have a great team dedicated to running MySQL at scale that provides a high quality of service. The MySQL tiers store many petabytes of data that have been compressed with InnoDB table compression. We are always looking for ways to improve compression, and the LSM algorithm used by RocksDB has several advantages over the B-Tree used by InnoDB. This led us to MyRocks. RocksDB is a key-value storage engine; MyRocks implements the MySQL storage engine API to make RocksDB work with MySQL and provide SQL functionality. Our initial goal was to get 2x more compression from MyRocks than from compressed InnoDB without affecting read performance. We exceeded our goal. In addition to getting 2x better compression, we also got much lower write rates to storage, faster database loads, and better performance.

Lower write rates enable the use of lower endurance flash, and faster loads simplify the migration from MySQL on InnoDB to MySQL on RocksDB. While we don't expect better performance for all workloads, the way in which we operate the database tier for the initial MyRocks deployment favors RocksDB more than InnoDB. Finally, there are features unique to an LSM that we expect to support in the future, including the merge operator and compaction filters. MyRocks can be helpful to the MySQL community because of efficiency and innovation.

We considered multiple write-optimized database engines. We chose RocksDB because it has excellent performance and efficiency and because we work directly with the team. The MyRocks effort has benefited greatly from being able to collaborate on a daily basis with the RocksDB team. We appreciate that the RocksDB team treats us like a very important customer. They move fast to make RocksDB better for MyRocks.

How was MyRocks developed?

MyRocks is developed by engineers in several locations across the globe. The team had the privilege of working with Sergey Petrunia, who is based in Russia, right from the beginning. At Facebook's Menlo Park campus, Siying Dong leads RocksDB development and Yoshinori Matsunobu leads the collaboration with the MySQL infrastructure and data performance teams. From the Seattle office, Herman Lee worked on the initial validation of MyRocks, which gave the team the confidence to proceed with MyRocks for our user databases, and led MyRocks feature development. In Oregon, Mark Callaghan has been benchmarking all aspects of MyRocks and

Let’s start by creating a table to store the CIDR, split into two columns: one for the IPv6 address and one for the network length. The most compact way of storing IPv6 values is to use the binary(16) type.

CREATE TABLE `cidr` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ip_address` binary(16) NOT NULL,
  `net_len` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

To generate some random data, I will use a stored procedure that inserts 100,000 IP addresses into the previously created table. Each address is generated by taking the first 16 bytes of the SHA-1 hash of a random number.

DELIMITER //
CREATE PROCEDURE generate_ips(no_ips INT)
BEGIN
  DECLARE x INT DEFAULT 0;
  DECLARE ip binary(16);
  DECLARE rand_net_len INT;
  REPEAT
    SET x = x + 1;
    # Generate a random IP address
    SET ip = substring(unhex(sha(RAND())), 1, 16);
    # Generate a random network length between 64 and 123
    SET rand_net_len = 64 + FLOOR(RAND() * 60);
    INSERT INTO cidr (ip_address, net_len) VALUES (ip, rand_net_len);
  # Stop once exactly no_ips rows have been inserted
  UNTIL x >= no_ips END REPEAT;
END //
DELIMITER ;
# Generate the IPs
CALL generate_ips(100000);

If you want to insert values from a human-readable string, MySQL provides the INET6_ATON function. For example, INET6_ATON("2001:0db8:85a3:0000:0000:8a2e:0370:7334") parses the textual address and returns it as a 16-byte binary string, which fits the binary(16) column.
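As a cross-check outside the database, the same text-to-binary conversion can be sketched with Python's standard `ipaddress` module. The helper names `inet6_aton` and `inet6_ntoa` below are hypothetical; they simply mirror the MySQL function names and are not part of any library.

```python
import ipaddress


def inet6_aton(text):
    """Parse a textual IPv6 address into its 16-byte binary form."""
    return ipaddress.IPv6Address(text).packed


def inet6_ntoa(raw):
    """Format 16 raw bytes as a compressed, lowercase IPv6 string."""
    return str(ipaddress.IPv6Address(raw))


raw = inet6_aton("2001:0db8:85a3:0000:0000:8a2e:0370:7334")
print(len(raw), raw.hex())  # 16 bytes, shown as 32 hex digits
print(inet6_ntoa(raw))      # the same address in compressed notation
```

The 32 hexadecimal digits of the full-length notation map directly onto the 16 bytes, which is why binary(16) is the most compact storage for an IPv6 address.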

Next, I will select the first 3 IP addresses. To do that, I will use the INET6_NTOA function, which converts from the binary format to the human-readable format.

mysql> SELECT id, INET6_NTOA(ip_address), net_len FROM cidr LIMIT 3;
+----+-----------------------------------------+---------+
| id | INET6_NTOA(ip_address)                  | net_len |
+----+-----------------------------------------+---------+
|  1 | 200d:31c4:1905:9eb2:3c7f:c45c:de78:42cd |      97 |
|  2 | 59b0:c4d6:48b4:3717:f031:d05b:705d:6c65 |      95 |
|  3 | 788e:3f48:e62b:c3bb:da10:6a03:f987:7a16 |     110 |
+----+-----------------------------------------+---------+
3 rows in set (0,01 sec)

Next, I would like to select the network mask, host mask, network address and also generate the network range intervals. For this I will create a few helper functions:

DELIMITER //
# Returns the net mask based on the network length
CREATE FUNCTION net_mask(net_len int) RETURNS binary(16) DETERMINISTIC
BEGIN
  RETURN (~INET6_ATON('::') << (128 - net_len));
END //
# Returns the network address using an IP and the network length
CREATE FUNCTION subnet(ip BINARY(16), net_len int) RETURNS binary(16) DETERMINISTIC
BEGIN
  RETURN ip & ((~INET6_ATON('::') << (128 - net_len)));
END //
# Returns the host mask
CREATE FUNCTION host_mask(net_len int) RETURNS binary(16) DETERMINISTIC
BEGIN
  RETURN (~INET6_ATON('::') >> net_len);
END //
DELIMITER ;
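The mask arithmetic in these helpers can be checked outside the database with a short Python sketch that treats an address as a 128-bit integer. The Python function names mirror the SQL helpers above but are otherwise assumptions for illustration, and the sample address is the first row of the earlier query output. Note that the SQL versions rely on bitwise operators accepting binary-string operands, which MySQL supports from version 8.0.

```python
import ipaddress

ALL_ONES = (1 << 128) - 1  # 128-bit value with every bit set, like ~INET6_ATON('::')


def net_mask(net_len):
    """Mask with the top net_len bits set."""
    # Python integers are unbounded, so truncate back to 128 bits after shifting.
    return (ALL_ONES << (128 - net_len)) & ALL_ONES


def host_mask(net_len):
    """Mask with the low (128 - net_len) bits set."""
    return ALL_ONES >> net_len


def subnet(ip, net_len):
    """Network address: the IP with all host bits cleared."""
    return ipaddress.IPv6Address(int(ipaddress.IPv6Address(ip)) & net_mask(net_len))


def last_address(ip, net_len):
    """Upper end of the network range: the network address with all host bits set."""
    return ipaddress.IPv6Address(int(subnet(ip, net_len)) | host_mask(net_len))


sample_ip = "200d:31c4:1905:9eb2:3c7f:c45c:de78:42cd"
print(subnet(sample_ip, 97), "-", last_address(sample_ip, 97))
```

The pair `subnet` and `last_address` gives the lower and upper bounds of the network range, which is one way to express the range intervals mentioned above.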
