Building a Table with TableBuilder

    struct TableBuilder::Rep {  // Internal state used by TableBuilder
      Options options;
      Options index_block_options;
      WritableFile* file;        // The .sst file being written
      uint64_t offset;
      Status status;
      BlockBuilder data_block;   // Data block
      BlockBuilder index_block;  // Index block
      std::string last_key;      // Last key added. Used to check that keys arrive
                                 // in sorted order, and as the key of the index
                                 // entry (last_key + offset + size) once a data
                                 // block has been written.
      int64_t num_entries;       // Number of key/value pairs added to the .sst file
      bool closed;               // Either Finish() or Abandon() has been called.

      // The index entry (max_key + offset + size) for a block is written only
      // when the first key/value of the next block is added. At that point
      // FindShortestSeparator is applied to find a shorter key that separates
      // the two blocks, which reduces the storage needed by the index.
      // Only Finish() builds the entry from the last key of the last block
      // (using FindShortSuccessor).
      // We do not emit the index entry for a block until we have seen the
      // first key for the next data block. This allows us to use shorter
      // keys in the index block. For example, consider a block boundary
      // between the keys "the quick brown fox" and "the who". We can use
      // "the r" as the key for the index block entry since it is >= all
      // entries in the first block and < all entries in subsequent
      // blocks.
      //
      // Invariant: r->pending_index_entry is true only if data_block is empty.
      bool pending_index_entry;    // True right after a data block has been
                                   // written; signals that an index entry
                                   // (last_key + offset + size) must be added
                                   // to the index block.
      BlockHandle pending_handle;  // Handle to add to index block

      std::string compressed_output;  // Scratch buffer for compressed data

      Rep(const Options& opt, WritableFile* f)  // Constructor
          : options(opt),
            index_block_options(opt),
            file(f),
            offset(0),
            data_block(&options),
            index_block(&index_block_options),
            num_entries(0),
            closed(false),
            pending_index_entry(false) {
        // Each restart group in the index block holds a single record,
        // which keeps lookups simple.
        index_block_options.block_restart_interval = 1;
      }
    };  // struct TableBuilder::Rep
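
The "the quick brown fox" / "the who" example in the comment above can be reproduced with a small standalone function. This is a minimal sketch that assumes plain bytewise key ordering; `ShortestSeparator` is a hypothetical name used only for illustration, not a LevelDB API, and a custom comparator may shorten keys differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of LevelDB): returns a string s with
// start <= s < limit under bytewise ordering, shortened where possible.
// Assumes the caller passes start < limit.
std::string ShortestSeparator(std::string start, const std::string& limit) {
  const std::size_t min_length = std::min(start.size(), limit.size());

  // Skip the common prefix of the two keys.
  std::size_t diff_index = 0;
  while (diff_index < min_length && start[diff_index] == limit[diff_index]) {
    ++diff_index;
  }

  if (diff_index >= min_length) {
    return start;  // One key is a prefix of the other: leave it unchanged.
  }

  const auto diff_byte = static_cast<std::uint8_t>(start[diff_index]);
  const auto limit_byte = static_cast<std::uint8_t>(limit[diff_index]);
  if (diff_byte < 0xff && diff_byte + 1 < limit_byte) {
    // Bump the first differing byte and drop everything after it.
    start[diff_index] = static_cast<char>(diff_byte + 1);
    start.resize(diff_index + 1);
    assert(start < limit);
  }
  return start;
}
```

Calling `ShortestSeparator("the quick brown fox", "the who")` yields `"the r"`: the keys share the prefix `"the "`, and `'q' + 1 = 'r'` is still smaller than `'w'`. When the first differing bytes are adjacent (for example `"abc"` and `"abd"`), no shorter separator exists and the start key is returned unchanged.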

The TableBuilder header

    class TableBuilder {
     public:
      // Create a builder that will store the contents of the table it is
      // building in *file. Does not close the file. It is up to the
      // caller to close the file after calling Finish().
      TableBuilder(const Options& options, WritableFile* file);

      // REQUIRES: Either Finish() or Abandon() has been called.
      ~TableBuilder();

      // Change the options used by this builder. Note: only some of the
      // option fields can be changed after construction. If a field is
      // not allowed to change dynamically and its value in the structure
      // passed to the constructor is different from its value in the
      // structure passed to this method, this method will return an error
      // without changing any fields.
      Status ChangeOptions(const Options& options);

      // Add key,value to the table being constructed.
      // REQUIRES: key is after any previously added key according to comparator.
      // REQUIRES: Finish(), Abandon() have not been called
      // (The implementation of Add() is walked through later.)
      void Add(const Slice& key, const Slice& value);

      // Advanced operation: flush any buffered key/value pairs to file.
      // Can be used to ensure that two adjacent entries never live in
      // the same data block. Most clients should not need to use this method.
      // REQUIRES: Finish(), Abandon() have not been called
      void Flush();

      // Return non-ok iff some error has been detected.
      Status status() const;

      // Finish building the table. Stops using the file passed to the
      // constructor after this function returns.
      // REQUIRES: Finish(), Abandon() have not been called
      Status Finish();

      // Indicate that the contents of this builder should be abandoned. Stops
      // using the file passed to the constructor after this function returns.
      // If the caller is not going to call Finish(), it must call Abandon()
      // before destroying this builder.
      // REQUIRES: Finish(), Abandon() have not been called
      void Abandon();

      // Number of calls to Add() so far.
      uint64_t NumEntries() const;

      // Size of the file generated so far. If invoked after a successful
      // Finish() call, returns the size of the final generated file.
      uint64_t FileSize() const;

     private:
      bool ok() const { return status().ok(); }
      void WriteBlock(BlockBuilder* block, BlockHandle* handle);

      struct Rep;
      Rep* rep_;

      // No copying allowed
      TableBuilder(const TableBuilder&);
      void operator=(const TableBuilder&);
    };
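
The contract in this header (keys added in sorted order, `Flush()` closing the current block, `Finish()` completing the table) together with the deferred index entry can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: `ToyTableBuilder` is a hypothetical class, blocks are closed after a fixed number of entries instead of a byte-size estimate, nothing is written to a file, and the index key is the unshortened last key of each block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of the Add/Flush/Finish flow; not LevelDB code.
class ToyTableBuilder {
 public:
  explicit ToyTableBuilder(std::size_t entries_per_block)
      : entries_per_block_(entries_per_block) {}

  void Add(const std::string& key) {
    assert(!closed_);
    assert(num_entries_ == 0 || key > last_key_);  // Keys must be increasing.
    if (pending_index_entry_) {
      // First key of a new block: now emit the previous block's index entry.
      index_.emplace_back(last_key_, pending_block_);
      pending_index_entry_ = false;
    }
    last_key_ = key;
    ++num_entries_;
    ++entries_in_block_;
    if (entries_in_block_ >= entries_per_block_) Flush();
  }

  void Flush() {
    assert(!closed_);
    if (entries_in_block_ == 0) return;  // Nothing buffered.
    pending_block_ = next_block_++;      // Stand-in for the BlockHandle.
    entries_in_block_ = 0;
    pending_index_entry_ = true;
  }

  void Finish() {
    Flush();
    assert(!closed_);
    closed_ = true;
    if (pending_index_entry_) {
      // No next key exists, so the last block is indexed by its own last key.
      index_.emplace_back(last_key_, pending_block_);
      pending_index_entry_ = false;
    }
  }

  std::size_t NumEntries() const { return num_entries_; }
  const std::vector<std::pair<std::string, std::size_t>>& index() const {
    return index_;
  }

 private:
  std::size_t entries_per_block_;
  std::size_t entries_in_block_ = 0;
  std::size_t num_entries_ = 0;
  std::size_t next_block_ = 0;
  std::size_t pending_block_ = 0;
  bool pending_index_entry_ = false;
  bool closed_ = false;
  std::string last_key_;
  std::vector<std::pair<std::string, std::size_t>> index_;
};
```

With two entries per block and the keys `a` to `e`, the index entry for block 0 appears only when `c` is added, and the entry for the final block appears in `Finish()`.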

The implementation mainly fills in the file according to the table format. Brief comments have been added below.

    // Copyright (c) 2011 The LevelDB Authors. All rights reserved.
    // Use of this source code is governed by a BSD-style license that can be
    // found in the LICENSE file. See the AUTHORS file for names of contributors.

    #include "leveldb/table_builder.h"

    #include <assert.h>
    #include <stdio.h>
    #include "leveldb/comparator.h"
    #include "leveldb/env.h"
    #include "table/block_builder.h"
    #include "table/format.h"
    #include "util/coding.h"
    #include "util/crc32c.h"
    #include "util/logging.h"

    namespace leveldb {

    struct TableBuilder::Rep {
      Options options;
      Options index_block_options;
      WritableFile* file;
      uint64_t offset;
      Status status;
      BlockBuilder data_block;
      BlockBuilder index_block;
      std::string last_key;
      int64_t num_entries;
      bool closed;  // Either Finish() or Abandon() has been called.

      // We do not emit the index entry for a block until we have seen the
      // first key for the next data block. This allows us to use shorter
      // keys in the index block. For example, consider a block boundary
      // between the keys "the quick brown fox" and "the who". We can use
      // "the r" as the key for the index block entry since it is >= all
      // entries in the first block and < all entries in subsequent
      // blocks.
      //
      // Invariant: r->pending_index_entry is true only if data_block is empty.
      bool pending_index_entry;
      BlockHandle pending_handle;  // Handle to add to index block

      std::string compressed_output;

      Rep(const Options& opt, WritableFile* f)
          : options(opt),
            index_block_options(opt),
            file(f),
            offset(0),
            data_block(&options),
            index_block(&index_block_options),
            num_entries(0),
            closed(false),
            pending_index_entry(false) {
        index_block_options.block_restart_interval = 1;
      }
    };

    TableBuilder::TableBuilder(const Options& options, WritableFile* file)
        : rep_(new Rep(options, file)) {
    }

    TableBuilder::~TableBuilder() {
      assert(rep_->closed);  // Catch errors where caller forgot to call Finish()
      delete rep_;
    }

    Status TableBuilder::ChangeOptions(const Options& options) {
      // Note: if more fields are added to Options, update
      // this function to catch changes that should not be allowed to
      // change in the middle of building a Table.
      if (options.comparator != rep_->options.comparator) {
        return Status::InvalidArgument("changing comparator while building table");
      }

      // Note that any live BlockBuilders point to rep_->options and therefore
      // will automatically pick up the updated options.
      rep_->options = options;
      rep_->index_block_options = options;
      rep_->index_block_options.block_restart_interval = 1;
      return Status::OK();
    }

    void TableBuilder::Add(const Slice& key, const Slice& value) {
      Rep* r = rep_;
      assert(!r->closed);  // The builder must not be closed...
      if (!ok()) return;   // ...and its status must still be OK.

      // Except for the very first entry, every key must sort after the
      // previously added key; otherwise the assertion fails.
      if (r->num_entries > 0) {
        assert(r->options.comparator->Compare(key, Slice(r->last_key)) > 0);
      }

      // pending_index_entry is set when a new block has just been started.
      // Only then can a separator between the previous block and the new
      // block be chosen. It is stored in last_key and added to index_block
      // so that keys can be located later.
      if (r->pending_index_entry) {
        assert(r->data_block.empty());
        // The comparator offers FindShortestSeparator() and
        // FindShortSuccessor(). FindShortestSeparator(start, limit) changes
        // *start to a short string that is >= *start and < limit.
        // FindShortSuccessor(key) changes *key to a short string >= *key.
        // Both rely on the user comparator and are used to choose the end
        // key of a block in the sstable.
        r->options.comparator->FindShortestSeparator(&r->last_key, key);
        std::string handle_encoding;
        r->pending_handle.EncodeTo(&handle_encoding);
        r->index_block.Add(r->last_key, Slice(handle_encoding));
        r->pending_index_entry = false;
      }

      // Update last_key, bump the entry count, and add the pair to the
      // data block.
      r->last_key.assign(key.data(), key.size());
      r->num_entries++;
      r->data_block.Add(key, value);

      // Flush once the data block reaches the configured block size.
      const size_t estimated_block_size = r->data_block.CurrentSizeEstimate();
      if (estimated_block_size >= r->options.block_size) {
        Flush();
      }
    }

    // Write the current data block out to the file.
    void TableBuilder::Flush() {
      Rep* r = rep_;
      assert(!r->closed);
      if (!ok()) return;
      if (r->data_block.empty()) return;
      assert(!r->pending_index_entry);
      WriteBlock(&r->data_block, &r->pending_handle);
      if (ok()) {
        r->pending_index_entry = true;
        r->status = r->file->Flush();
      }
    }

    // Each block consists of n bytes of data, followed by a 1-byte type and
    // a 4-byte CRC.
    void TableBuilder::WriteBlock(BlockBuilder* block, BlockHandle* handle) {
      // File format contains a sequence of blocks where each block has:
      //    block_data: uint8[n]
      //    type: uint8
      //    crc: uint32
      assert(ok());
      Rep* r = rep_;
      Slice raw = block->Finish();

      Slice block_contents;
      CompressionType type = r->options.compression;
      // TODO(postrelease): Support more compression options: zlib?
      switch (type) {
        case kNoCompression:
          block_contents = raw;
          break;

        case kSnappyCompression: {
          std::string* compressed = &r->compressed_output;
          if (port::Snappy_Compress(raw.data(), raw.size(), compressed) &&
              compressed->size() < raw.size() - (raw.size() / 8u)) {
            block_contents = *compressed;
          } else {
            // Snappy not supported, or compressed less than 12.5%, so just
            // store uncompressed form
            block_contents = raw;
            type = kNoCompression;
          }
          break;
        }
      }
      handle->set_offset(r->offset);
      handle->set_size(block_contents.size());
      r->status = r->file->Append(block_contents);
      if (r->status.ok()) {
        char trailer[kBlockTrailerSize];
        trailer[0] = type;
        uint32_t crc = crc32c::Value(block_contents.data(), block_contents.size());
        crc = crc32c::Extend(crc, trailer, 1);  // Extend crc to cover block type
        EncodeFixed32(trailer+1, crc32c::Mask(crc));
        r->status = r->file->Append(Slice(trailer, kBlockTrailerSize));
        if (r->status.ok()) {
          r->offset += block_contents.size() + kBlockTrailerSize;
        }
      }
      r->compressed_output.clear();
      block->Reset();
    }

    Status TableBuilder::status() const {
      return rep_->status;
    }

    Status TableBuilder::Finish() {
      Rep* r = rep_;
      Flush();
      assert(!r->closed);
      r->closed = true;
      BlockHandle metaindex_block_handle;
      BlockHandle index_block_handle;
      if (ok()) {
        BlockBuilder meta_index_block(&r->options);
        // TODO(postrelease): Add stats and other meta blocks
        WriteBlock(&meta_index_block, &metaindex_block_handle);
      }
      if (ok()) {
        if (r->pending_index_entry) {
          r->options.comparator->FindShortSuccessor(&r->last_key);
          std::string handle_encoding;
          r->pending_handle.EncodeTo(&handle_encoding);
          r->index_block.Add(r->last_key, Slice(handle_encoding));
          r->pending_index_entry = false;
        }
        WriteBlock(&r->index_block, &index_block_handle);
      }
      if (ok()) {
        Footer footer;
        footer.set_metaindex_handle(metaindex_block_handle);
        footer.set_index_handle(index_block_handle);
        std::string footer_encoding;
        footer.EncodeTo(&footer_encoding);
        r->status = r->file->Append(footer_encoding);
        if (r->status.ok()) {
          r->offset += footer_encoding.size();
        }
      }
      return r->status;
    }

    void TableBuilder::Abandon() {
      Rep* r = rep_;
      assert(!r->closed);
      r->closed = true;
    }

    uint64_t TableBuilder::NumEntries() const {
      return rep_->num_entries;
    }

    uint64_t TableBuilder::FileSize() const {
      return rep_->offset;
    }

    }  // namespace leveldb
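
The bookkeeping in `WriteBlock` (choosing between the raw and compressed form, then recording the handle and advancing the file offset past the trailer) can be followed with plain arithmetic. The sketch below is a simplified model, not LevelDB code: `KeepCompressed`, `BlockSizes`, `Handle` and `LayOutBlocks` are hypothetical names, the sizes are supplied by hand rather than produced by Snappy, and the trailer size of 5 bytes follows from the 1-byte type plus 4-byte CRC described in the listing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 1-byte compression type + 4-byte CRC, as described in WriteBlock.
constexpr std::size_t kTrailerSize = 5;

// Hypothetical helper: mirrors the size test in WriteBlock. The compressed
// form is kept only if it saves more than 12.5% (one eighth) of the raw size.
bool KeepCompressed(std::size_t raw_size, std::size_t compressed_size) {
  return compressed_size < raw_size - (raw_size / 8u);
}

struct BlockSizes {
  std::size_t raw;         // Size of the uncompressed block contents.
  std::size_t compressed;  // Size the compressor would produce (given here).
};

struct Handle {
  std::uint64_t offset;  // Where the block contents start in the file.
  std::uint64_t size;    // Size of the stored contents, excluding the trailer.
};

// Hypothetical helper: computes the handle of each block and the file offset
// after the last block, the way WriteBlock advances r->offset.
std::vector<Handle> LayOutBlocks(const std::vector<BlockSizes>& blocks,
                                 std::uint64_t* end_offset) {
  std::vector<Handle> handles;
  std::uint64_t offset = 0;
  for (const BlockSizes& b : blocks) {
    const std::size_t stored =
        KeepCompressed(b.raw, b.compressed) ? b.compressed : b.raw;
    handles.push_back(Handle{offset, stored});
    offset += stored + kTrailerSize;  // Contents, then type byte and CRC.
  }
  *end_offset = offset;
  return handles;
}
```

For two 800-byte blocks that compress to 699 and 700 bytes, the threshold is `800 - 800 / 8 = 700`, so only the first is stored compressed. The handles are `(0, 699)` and `(704, 800)`, and the offset after both blocks is `1509`.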

References

https://blog.csdn.net/tankles/article/details/7663918

《leveldb实现解析》 (An Analysis of the LevelDB Implementation), 那岩, Taobao
