GroupVarint
folly/GroupVarint.h
folly/GroupVarint.h
is an implementation of variable-length encoding for 32- and 64-bit integers using the Group Varint encoding scheme as described in Jeff Dean's WSDM 2009 talk and in Information Retrieval: Implementing and Evaluating Search Engines.
Briefly, a group of four 32-bit integers is encoded as a sequence of variable length, between 5 and 17 bytes; the first byte encodes the length (in bytes) of each integer in the group. A group of five 64-bit integers is encoded as a sequence of variable length, between 7 and 42 bytes; the first two bytes encode the length (in bytes) of each integer in the group.
GroupVarint.h
defines a few classes:
GroupVarint<T>
, whereT
isuint32_t
oruint64_t
:Basic encoding / decoding interface, mainly aimed at encoding / decoding one group at a time.
GroupVarintEncoder<T, Output>
, whereT
isuint32_t
oruint64_t
, andOutput
is a functor that acceptsStringPiece
objects as arguments:Streaming encoder: add values one at a time, and they will be flushed to the output one group at a time. Handles the case where the last group is incomplete (the number of integers to encode isn't a multiple of the group size)
GroupVarintDecoder<T>
, whereT
isuint32_t
oruint64_t
:Streaming decoder: extract values one at a time. Handles the case where the last group is incomplete.
The 32-bit implementation is significantly faster than the 64-bit implementation; on platforms supporting the SSSE3 instruction set, we use the PSHUFB instruction to speed up lookup, as described in SIMD-Based Decoding of Posting Lists(CIKM 2011).
For more details, see the header file folly/GroupVarint.h
and the associated test file folly/test/GroupVarintTest.cpp
.
GroupVarint的更多相关文章
- 今天听说了一个压缩解压整型的方式-group-varint
group varint https://github.com/facebook/folly/blob/master/folly/docs/GroupVarint.md 这个是facebook的实现 ...
- folly学习心得(转)
原文地址: https://www.cnblogs.com/Leo_wl/archive/2012/06/27/2566346.html 阅读目录 学习代码库的一般步骤 folly库的学习心得 ...
- Folly: Facebook Open-source Library Readme.md 和 Overview.md(感觉包含的东西并不多,还是Boost更有用)
folly/ For a high level overview see the README Components Below is a list of (some) Folly component ...
随机推荐
- (转)MapReduce Design Patterns(chapter 3 (part 1))(五)
Chapter 3. Filtering Patterns 本章的模式有一个共同点:不会改变原来的记录.这种模式是找到一个数据的子集,或者更小,例如取前十条,或者很大,例如结果去重.这种过滤器模式跟前 ...
- [Python] print中的左右对齐问题
一.数值类型(int.float) # %d.%f是占位符>>> a = 3.1415926>>> print("%d"%a) #%d只 ...
- linux 系统优化+定时任务
安装软件 通过yum安装 自动补全工具:yum completion yum install -y tree bash-completion wget vim find -[TAB] 更改系统的yum ...
- Linux 中断处理
HardIRQ 和 softirq : http://www.it165.net/os/html/201211/3766.html 博文:http://blog.csdn.net/yin262/art ...
- POJ3682King Arthur's Birthday Celebration(数学期望||概率DP)
King Arthur is an narcissist who intends to spare no coins to celebrate his coming K-th birthday. Th ...
- 树形DP新识
HihoCoder: 1041(点) 1063(边) 1035(边) HDU1520 (签到) HDU2415(emm) 目前我遇到的树形DP有两类: ∂:点处理,大概就是点的乱搞,比如找一些点,这些 ...
- stm32寄存器版学习笔记07 ADC
STM32F103RCT有3个ADC,12位主逼近型模拟数字转换器,有18个通道,可测量16个外部和2个内部信号源.各通道的A/D转换可以单次.连续.扫描或间断模式执行. 1.通道选择 stm32把A ...
- 【2018.06.26NOIP模拟】T2号码bachelor 【数位DP】*
[2018.06.26NOIP模拟]T2号码bachelor 题目描述 Mike 正在在忙碌地发着各种各样的的短信.旁边的同学 Tom 注意到,Mike 发出短信的接收方手机号码似乎都满足着特别的性质 ...
- IIS并发瓶颈线程数的限制
.NET线程池最大线程数的限制-记一次IIS并发瓶颈 https://www.cnblogs.com/7rhythm/p/9964543.html .NET ThreadPool 最大线程数的限制 I ...
- pat 乙级 1093 字符串A+B (20 分)
给定两个字符串 A 和 B,本题要求你输出 A+B,即两个字符串的并集.要求先输出 A,再输出 B,但重复的字符必须被剔除. 输入格式: 输入在两行中分别给出 A 和 B,均为长度不超过 1的.由可见 ...