本文重点关注了系统设计相关的内容,paper后半部分的具体应用此处没有过多涉及。从个人笔记修改而来,因此为英文版本。

Bigtable: A Distributed Storage System for Structured Data

Data model: not a relational data model

A Bigtable is a sparse, distributed, persistent multidimensional sorted map. —— part2

How the map indexed?

(row:string, column:string, time:int64) → string

just like json format, eg:

table{
// ...
"aaaaa" : { //row
"A:foo" : { //col
15 : "y", //timestamp
4 : "m"
},
"A:bar" : { //col
15 : "d",
},
"B:" : { //col
6 : "w"
3 : "o"
1 : "w"
}
},
// ...
}

a particular table: webtable

  • row(also called tablet): reversed URL

    concurrent: single row key is atomic

    lexicographic order

  • col: column families, contents

    family:qualifier

    Access control and both disk and memory accounting

  • timestamp

    avoid collisions: unique timestamp, decreasing order

    garbage-collection mechanism(eg.)


API

C++ read/write

MapReduce + Bigtable


Building Block

Google File System: store log and data files

distributed Google File System

Google SSTable file format: store Bigtable data

K-V map: iterate key/value pairs in a specified key range

  • a sequence of blocks
  • a block index
disk seek or memory seek?

Optionally, SSTable can be completely mapped into memory, which allows us to perform lookups and scans without touching disk.

Chubby: distributed lock service

5 active replicas: 1 master, 4 slave

Paxos algorithm: to keep its replicas consistent in the face of failure

namespace: including directory and small file, op r/w is atomic

session: when expires, lose locks and open handles


Implementation

consist:

  1. library(?) linked to every client
  2. 1 master server(schedule, garbage-collect......)
  3. many tablet server(10-1000 tablets)

As with many single-master distributed storage systems, client data does not move through the master: clients communicate directly with tablet servers for reads and writes.

hierarchy (B+-tree)

Chubby file -> Root tablet -> other METADATA tablets -> UserTables

METADATA: many other things stored in it

Master: schedule & manage

Each tablet is assigned to one tablet server at a time. Bigtable uses Chubby to keep track of tablet servers. When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely-named file in a specific Chubby directory. The master monitors this directory (the servers directory) to discover tablet servers.

The essential point for distributed database: lock

The Bigtable is only a series of ops, real data is stored in GFS.(SSTable)

Tablet Representation

memtable: the recently committed updates are stored in memory in a sorted buffer

reconstruct: redo points in commit logs

Compactions

As write operations execute, the size of the memtable increases. When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.

minor(memtable) -> major(SSTable) compaction


Refinement

locality group

Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group in each tablet.

This section describes portions of the implementation in more detail in order to highlight these refinements.

in-memory locality groups are loaded lazily

storage: compression

read performance: caching

Bloom filters

commit-log

Speeding up tablet recovery

Exploiting immutability


Performance Evaluation


Lesson

  1. large distributed systems are vulnerable to many types of failures
  2. it is important to delay adding new features until it is clear how the new features will be used
  3. the importance of proper system-level monitoring
  4. the value of simple designs

google三驾马车之一:Bigtable解读(英文版)的更多相关文章

  1. 分布式系统漫谈一 —— Google三驾马车: GFS,mapreduce,Bigtable

    分布式系统学习必读文章!!!! 原文:http://blog.sina.com.cn/s/blog_4ed630e801000bi3.html 分布式系统漫谈一 —— Google三驾马车: GFS, ...

  2. [MapReduce] Google三驾马车:GFS、MapReduce和Bigtable

    声明:此文转载自博客开发团队的博客,尊重原创工作.该文适合学分布式系统之前,作为背景介绍来读. 谈到分布式系统,就不得不提Google的三驾马车:Google FS[1],MapReduce[2],B ...

  3. Google三驾马车:GFS、MapReduce和Bigtable

    谈到分布式系统,就不得不提Google的三驾马车:Google fs[1],Mapreduce[2],Bigtable[3]. 虽然Google没有公布这三个产品的源码,但是他发布了这三个产品的详细设 ...

  4. Google三驾马车

    Google旧三驾马车: GFS,mapreduce,Bigtable http://blog.sina.com.cn/s/blog_4ed630e801000bi3.html Google新三驾马车 ...

  5. 【技术与商业案例解读笔记】095:Google大数据三驾马车笔记

     1.谷歌三驾马车地位 [关键词]开启时代,指明方向 聊起大数据,我们通常言必称谷歌,谷歌有“三驾马车”:谷歌文件系统(GFS).MapReduce和BigTable.谷歌的“三驾马车”开启了大数据时 ...

  6. Childlife旗下三驾马车

    Childlife旗下,尤其以 “提高免疫力”为口号的“三驾马车”:第一防御液.VC.紫雏菊,是相当热门的海淘产品.据说这是一系列“成分天然.有效治愈感冒提升免疫力.由美国著名儿科医生研发”的药物.

  7. Ubuntu 安装 k8s 三驾马车 kubelet kubeadm kubectl

    Ubuntu 版本是 18.04 ,用的是阿里云服务器,记录一下自己实际安装过程的操作步骤. 安装 docker 安装所需的软件 apt-get update apt-get install -y a ...

  8. Qt 学习笔记 - 第三章 - Qt的三驾马车之一 - 串口编程 + 程序打包成Windows软件

    Qt 学习笔记全系列传送门: Qt 学习笔记 - 第一章 - 快速开始.信号与槽 Qt 学习笔记 - 第二章 - 添加图片.布局.界面切换 [本章]Qt 学习笔记 - 第三章 - Qt的三驾马车之一 ...

  9. 更强、更稳、更高效:解读 etcd 技术升级的三驾马车

    点击下载<不一样的 双11 技术:阿里巴巴经济体云原生实践> 本文节选自<不一样的 双11 技术:阿里巴巴经济体云原生实践>一书,点击上方图片即可下载! 作者 | 陈星宇(宇慕 ...

  10. itemKNN发展史----推荐系统的三篇重要的论文解读

    itemKNN发展史----推荐系统的三篇重要的论文解读 本文用到的符号标识 1.Item-based CF 基本过程: 计算相似度矩阵 Cosine相似度 皮尔逊相似系数 参数聚合进行推荐 根据用户 ...

随机推荐

  1. MongoDB 根据多个条件批量修改

    转载请注明出处: MongoDB 根据单个条件修改的sql 如下: db.collection_name.update({"userid":"1111111"} ...

  2. CDC设计实例-02

    CDC设计实例 加速器 假设要处理一项业务比如图像处理,有两种方向,第一种选择一些通用的处理器CPU\GPU\DSP等通用的处理器,第二种是将算法映射成IP,直接使用IP进行处理图像处理等专门的业务就 ...

  3. 问题--QT只有全屏的时候才能使用

    1.问题 安装的版本是3.8.0,只有在全屏的时候在编辑界面不会卡,其余情况会直接卡死在这. 2.解决方式 安装了较低版本的3.14.2,解决了上述问题

  4. Android——“EditText控件供获取最大长度的方法”

    package utils; import android.app.Activity; import android.content.Context; import android.text.Inpu ...

  5. MyBatis06——动态SQL

    动态SQL if choose (when, otherwise) trim (where, set) foreach 搭建环境 1.搭建数据库 CREATE TABLE `blog` ( `id` ...

  6. [转帖]字符集 AL32UTF8 和 UTF8

    https://blog.51cto.com/comtv/383254# 文章标签职场休闲字符集 AL32UTF8 和 UTF8文章分类数据库阅读数1992 The difference betwee ...

  7. [转帖]Web性能优化工具WebPageTest(三)——本地部署(Windows 7版本)

    http://www.zlprogram.com/Show/30/30117.shtml 这次先能够使用PC端的浏览器测试,首先需要下载官方的发布版本"WebPageTest 3.0&quo ...

  8. [转帖]宁可信鬼,也不信 iowait 这张嘴!

    https://zhuanlan.zhihu.com/p/407333624 原创:小姐姐味道(微信公众号ID:xjjdog),欢迎分享,转载请保留出处. 我们经常遇到iowait这个名词,在top命 ...

  9. Chrome 下载地址

    今天同事找到一个网页 感觉非常好用 这里保存并且推荐一下 https://www.chromedownloads.net/chrome64win-stable/

  10. 【团队效率提升】Python-PyWebIO介绍

    作者:京东零售 关键 Q&A快速了解PyWebIO Q:首先,什么是PyWebIO? A:PyWebIO提供了一系列命令式的交互函数,能够让咱们用只用Python就可以编写 Web 应用, 不 ...