Recently my colleague Rasmus Johansson announced that MariaDB is adding support for the Facebook MyRocks storage engine. Today I’m going to share a bit more on what that means for MariaDB users. Members of the Facebook Database Engineering team helped us answer some questions we think our community will have about MyRocks.

Benefits of MariaDB Server’s Extensible Architecture

Before discussing specifics of MyRocks, new readers may benefit from a description of MariaDB Serverarchitecture, which is extensible at every layerincluding the storage layer. This means users and the community can add functionality to meet unique needs. Community contributions are one of MariaDB’s greatest advantages over other databases, and a big reason for us becoming the fastest growing open source database in the marketplace.

Openness in the storage layer is especially important because being able to use the right storage engine for the right use case ensures better performance optimization. Both mysql and MariaDB support InnoDB - a well known, general purpose storage engine. But InnoDB is not suited to every use case, so the MariaDB engineering team is extending support for additional storage engines, including Facebook’s MyRocks for workloads requiring greater compression and IO efficiency, and MariaDBColumnStore (currently inbeta), which will provide faster time-to-insight with Massively Parallel Execution (MPP).

Facebook MyRocks for MariaDB

When searching for a storage engine that could give greater performance for web scale type applications, MyRocks was an obvious choice because of its superior handling of data compression and IO efficiency. Besides that, its LSM architecture allows for very efficient data ingestion, like read-free replication slaves, or fast bulk data loading.

As we add support for new storage engines, many of our current users may ask, “What happens to MariaDB’s support for InnoDB? Do I have to migrate?” Of course not! We have no plans to abandon InnoDB. InnoDB is a proven storage engine and we expect it to continue to be used by MariaDB users. But we do expect that deployments that need highest possible efficiency will opt for MyRocks because of its performance gains and IO efficiency. Over time, as MyRocks matures we expect it will become appropriate for even more use cases.

The first MariaDB version of MyRocks will be available in a release candidate of MariaDB Server 10.2 coming this winter. Our goal is for MyRocks to work with all MariaDB features, but some of them, like optimistic parallel replication, may not work in the first release. MariaDB is an open source project that follows the "release often, release early" approach, so our goal is to first make a release that meets core requirements, and then add support for special cases in subsequent releases.

Now let’s move onto my discussion with Facebook’s Database Engineering team!

Can you tell us a bit about the history of RocksDB at Facebook? In 2012, we started to build an embedded storage engine optimized for flash-based SSD, by forking LevelDB. The fork became RocksDB, which was open-sourced on November 2013 [1] . After RocksDB proved to be an effective persistent key-value store for SSD, we enhanced RocksDB for other platforms. We improved its performance on DRAM in 2014 and on hard drives in 2015, two platforms with production use cases now.

Over the past few years, we've introduced numerous features and improvements. To name a few, we built compaction filter and merge operator in 2013, backup and column families in 2014, transactions and bulk loading in 2015, and persistent cache in 2016. See the list of features that are not in LevelDB: https://github.com/facebook/rocksdb/wiki/Features-Not-in-LevelDB .

Early RocksDB adopters at Facebook such as the distributed key-value store ZippyDB [2], Laser [2] and Dragon [3] went into production in early 2013. Since then, many more new or existing services at Facebook started to use RocksDB every year. Now RocksDB is used in a number of services across multiple hardware platforms at Facebook. [1] https://code.facebook.com/posts/666746063357648/under-the-hood-building-and-open-sourcing-rocksdb/ and http://rocksdb.blogspot.com/2013/11/the-history-of-rocksdb.html [2] https://research.facebook.com/publications/realtime-data-processing-at-facebook/ [3] https://code.facebook.com/posts/1737605303120405/dragon-a-distributed-graph-query-engine/ Why did FB go down the RocksDB path for MySQL?

MySQL is a popular storage solution at Facebook because we have a great team dedicated to running MySQL at scale that provides a high quality of service. The MySQL tiers store many petabytes of data that have been compressed with InnoDB table compression. We are always looking for ways to improve compression and the LSM algorithm used by RocksDB has several advantages over the B-Tree used by InnoDB. This led us to MyRocks: RocksDB is a key-value storage engine. MyRocks implements that MySQL storage engine API to make RocksDB work with MySQL and provide SQL functionality. Our initial goal was to get 2x more compression from MyRocks than from compressed InnoDB without affecting read performance. We exceeded our goal. In addition to getting 2x better compression, we also got much lower write rates to storage, faster database loads, and better performance.

Lower write rates enable the use of lower endurance flash, and faster loads simplify the migration from MySQL on InnoDB to MySQL on RocksDB. While we don't expect better performance for all workloads, the way in which we operate the database tier for the initial MyRocks deployment favors RocksDB more than InnoDB. Finally, there are features unique to an LSM that we expect to support in the future, including the merge operator and compaction filters. MyRocks can be helpful to the MySQL community because of efficiency and innovation.

We considered multiple write-optimized database engines. We chose RocksDB because it has excellent performance and efficiency and because we work directly with the team. The MyRocks effort has benefited greatly from being able to collaborate on a daily basis with the RocksDB team. We appreciate that the RocksDB team treats us like a very important customer. They move fast to make RocksDB better for MyRocks.

How was MyRocks developed? MyRocks is developed by engineers from several locations across the globe. The team had the privilege to work with Sergey Petrunia right from the beginning, and he is based in Russia. At Facebook's Menlo Park campus, Siying Dong leads RocksDB development and Yoshinori Matsunobu leads the collaboration with MySQL infrastructure and data performance teams. From the Seattle office, Herman Lee worked on the initial validation of MyRocks that gave the team the confidence to proceed with MyRocks for our user databases as well as led the MyRocks feature development. In Oregon, Mark Callaghan has been benchmarking all aspects of MyRocks and

Let’s start by creating a table, where we will store the CIDR, split in two columns: one for the IPv6 address and one for the network length. The most compact way of storing IPv6 values is to use the binary(16) .

CREATETABLE `cidr` ( `id` int(11) NOT NULL AUTO_INCREMENT, `ip_address` binary(16) NOT NULL, `net_len` int(11) NOT NULL, PRIMARYKEY (`id`) ) ENGINE=InnoDB;

In order to generate some random data, I will use a stored procedure which will insert 100,000 IP addresses in the previously created table. The addresses will be generated by inserting the first 16 bytes of the current time-stamp’s SHA value.

DELIMITER // CREATEPROCEDUREgenerate_ips(no_ipsINT) BEGIN DECLARE x INT DEFAULT 0; DECLARE ipbinary(16); DECLARE rand_net_lenINT; REPEAT SET x = x + 1; # Generate a random ip address SETip= substring(unhex(sha(RAND())), 1, 16); # Generate a random network length between 64 and 123 SETrand_net_len= 64 + FLOOR(RAND() * 60); INSERTINTOcidr(ip_address, net_len) values (ip, rand_net_len); UNTIL x > no_ipsEND REPEAT; END // DELIMITER ; # Generate the IPs CALLgenerate_ips(100000);

Ifyou want to insert the values from a human-readable string, MySQL provides the INET6_ATON("2001:0db8:85a3:0000:0000:8a2e:0370:7334") function which will strip the colons (“:”)and convertthe 32 characters to a binary(16) string.

Next Iwill select the first 3 IP address. In order to do thatI will use the INET6_NTOA function, which will convert from the binary format to the human-readable format.

mysql> SELECTid, INET6_NTOA(ip_address), net_lenFROMcidrLIMIT 3; +----+-----------------------------------------+---------+ | id | INET6_NTOA(ip_address)| net_len | +----+-----------------------------------------+---------+ |1 | 200d:31c4:1905:9eb2:3c7f:c45c:de78:42cd |97 | |2 | 59b0:c4d6:48b4:3717:f031:d05b:705d:6c65 |95 | |3 | 788e:3f48:e62b:c3bb:da10:6a03:f987:7a16 |110 | +----+-----------------------------------------+---------+ 3 rowsin set (0,01 sec)

Next, I would like to select the network mask, host mask, network address and also generate the network range intervals. For this I will create a few helper functions:

# Returns the net mask based on the network length DELIMITER // CREATEFUNCTION net_mask(net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN (~INET6_ATON('::') << (128 - net_len)); END // # Returns the network address using an IP and the network length CREATEFUNCTION subnet(ipBINARY(16), net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN ip & ((~INET6_ATON('::') << (128 - net_len))); END // # Returns the host mask CREATEFUNCTION host_mask(net_lenint) RETURNSbinary(16) DETERMINISTIC BEGIN RETURN (~INET6_ATON('::') >> net_len); END // DELIMITER ;

Facebook MyRocks at MariaDB的更多相关文章

  1. MySQL与MariaDB核心特性比较详细版v1.0(覆盖mysql 8.0/mariadb 10.3,包括优化、功能及维护)

    注:本文严禁任何形式的转载,原文使用word编写,为了大家阅读方便,提供pdf版下载. MySQL与MariaDB主要特性比较详细版v1.0(不含HA).pdf 链接:https://pan.baid ...

  2. 【巨杉数据库SequoiaDB】巨杉Tech | 巨杉数据库的并发 malloc 实现

    本文由巨杉数据库北美实验室资深数据库架构师撰写,主要介绍巨杉数据库的并发malloc实现与架构设计.原文为英文撰写,我们提供了中文译本在英文之后. SequoiaDB Concurrent mallo ...

  3. 数据库对比:选择MariaDB还是MySQL?

    作者 | EverSQL 译者 | 无明 这篇文章的目的主要是比较 MySQL 和 MariaDB 之间的主要相似点和不同点.我们将从性能.安全性和主要功能方面对这两个数据库展开对比,并列出在选择数据 ...

  4. myrocks复制中断问题排查

    背景 mysql可以支持多种不同的存储引擎,innodb由于其高效的读写性能,并且支持事务特性,使得它成为mysql存储引擎的代名词,使用非常广泛.随着SSD逐渐普及,硬件存储成本越来越高,面向写优化 ...

  5. AliOS编译安装MyRocks

    MyRocks是facabook版将自主研发的MySQL分支,其源码位于为:https://github.com/facebook/mysql-5.6/ 首先需要安装以下: sudo yum inst ...

  6. MyRocks简介

    RocksDB是facebook基于LevelDB实现的,目前为facebook内部大量业务提供服务.经过facebook大量工作,将RocksDB为MySQL的一个存储引擎移植到MySQL,称之为M ...

  7. 初识MariaDB存储引擎

    在看MariaDB的存储引擎之前,可以先了解MySQL存储引擎. MySQL常用的存储引擎: MyISAM存储引擎:是MySQL的默认存储引擎.MyISAM不支持事务.也不支持外键,但其访问速度快,对 ...

  8. 【转】哦,mysql 的其它发行版本Percona, mariadb

    原文:http://geek.csdn.net/news/detail/130146 2016年11月25日,沃趣科技"智慧应用 数据先行"2016产品发布会暨新三板挂牌庆祝会在杭 ...

  9. MyRocks DDL原理

    最近一个日常实例在做DDL过程中,直接把数据库给干趴下了,问题还是比较严重的,于是赶紧排查问题,撸了下crash堆栈和alert日志,发现是在去除唯一约束的场景下,MyRocks存在一个严重的bug, ...

随机推荐

  1. QQ拼音在中文输入下默认英文标点

    别小看这个功能, 感觉在写一些技术 Blog 的情况下还是挺有用的. 打开QQ拼音: 输入法设置->基本设置->初始状态->中文状态下使用英文标点.

  2. 客户端 ios与android 的判断

    <script> if (/(iPhone|iPad|iPod|iOS)/i.test(navigator.userAgent)) { //alert(navigator.userAgen ...

  3. Google 推出的 Java 编码规范(转)

    原文地址:http://www.dahuatu.com/1225/988516.html 原文地址:http://www.dahuatu.com/1225/988516.html 原文地址:http: ...

  4. java 读取Excel文件并数据持久化方法Demo

    import java.io.IOException; import java.io.InputStream; import java.util.ArrayList; import java.util ...

  5. 【你吐吧c#每日学习】11.10 C# Data Type conversion

    implicit explicit float f=12123456.213F int a = Convert.ToInt32(f); //throw exception or int a = (in ...

  6. Hibernate二进制或大文件类型数据和Oracle交互

    //测试存储二进制文件 @Test public void test() throws IOException{  InputStream in=new FileInputStream("E ...

  7. C# 实现 单例模式

    http://blog.sina.com.cn/s/blog_75247c770100yxpb.html

  8. 在Visual Studio 2013/2015上使用C#开发Android/IOS安装包和操作步骤

    Xamarin 配置手册和离线包下载 http://pan.baidu.com/s/1eQ3qw8a 具体操作: 安装前提条件 1. 安装Visual Studio 2013,安装过程省略,我这里安装 ...

  9. logstash配合filebeat监控tomcat日志

    环境:logstash版本:5.0.1&&filebeat 5.0.1 ABC为三台服务器.保证彼此tcp能够相互连接. Index服务器A - 接收BC两台服务器的tomcat日志 ...

  10. Concurrent Assertion

    Concurrent assertion中要求必须有clock,从而保证在每个clock edge都进行触发判断. assertion与design进行同步执行,concurrent assert只能 ...