Histogram
folly/Histogram.h
Classes
Histogram
Histogram.h defines a simple histogram class, templated on the type of data you want to store. This class is useful for tracking a large stream of data points, where you want to remember the overall distribution of the data, but do not need to remember each data point individually.
Each histogram bucket stores the number of data points that fell in the bucket, as well as the overall sum of the data points in the bucket. Note that no overflow checking is performed, so if you have a bucket with a large number of very large values, it may overflow and cause inaccurate data for this bucket. As such, the histogram class is not well suited to storing data points with very large values. However, it works very well for smaller data points such as request latencies, request or response sizes, etc.
In addition to providing access to the raw bucket data, the Histogram class also provides methods for estimating percentile values. This allows you to estimate the median value (the 50th percentile) and other values such as the 95th or 99th percentiles.
All of the buckets have the same width. The number of buckets and bucket width is fixed for the lifetime of the histogram. As such, you do need to know your expected data range ahead of time in order to have accurate statistics. The histogram does keep one bucket to store all data points that fall below the histogram minimum, and one bucket for the data points above the maximum. However, because these buckets don't have a good lower/upper bound, percentile estimates in these buckets may be inaccurate.
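The fixed-width layout described above can be sketched as a small index calculation. This is an illustration only, not folly's implementation: bucketIndex is a hypothetical helper, and it assumes the histogram covers the half-open range [min, max), with slot 0 reserved for values below the minimum and the last slot for values at or above the maximum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of folly: maps a value to a bucket index for
// a histogram covering [min, max) with fixed-width buckets.
// Index 0 holds values below min; the last index holds values >= max.
std::size_t bucketIndex(int64_t value, int64_t min, int64_t max,
                        int64_t bucketSize) {
  assert(bucketSize > 0 && min < max);
  // Round up so that a partial final bucket still gets its own slot.
  const std::size_t regular =
      static_cast<std::size_t>((max - min + bucketSize - 1) / bucketSize);
  if (value < min) {
    return 0;  // below-minimum bucket
  }
  if (value >= max) {
    return regular + 1;  // above-maximum bucket
  }
  return static_cast<std::size_t>((value - min) / bucketSize) + 1;
}
```

With a range of 0 to 5000 and a width of 100, this sketch yields 50 regular slots plus the two out-of-range slots. Because every value outside the range collapses into one of those two slots, nothing bounds how far away it was, which is why percentile estimates that land there are unreliable.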
HistogramBuckets
The Histogram class is built on top of HistogramBuckets. HistogramBuckets provides an API very similar to Histogram, but allows a user-defined bucket class. This allows users to implement more complex histogram types that store more than just the count and sum in each bucket.
When computing percentile estimates, HistogramBuckets allows user-defined functions for computing the average value and data count in each bucket. This allows you to define more complex buckets which may have multiple different ways of computing the average value and the count.
For example, one use case could be tracking timeseries data in each bucket. Each set of timeseries data can have independent data in the bucket, which can show how the data distribution is changing over time.
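To make the idea of caller-supplied count and average functions concrete, here is a minimal, self-contained sketch. It does not use folly: WindowedBucket, countRecent, countAll, averageRecent, averageAll and totalCount are all hypothetical names invented for this illustration, and the two-window layout is just one assumed way of keeping time-dependent data in a bucket.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bucket type (not from folly): keeps separate totals for a
// recent time window and for all older data.
struct WindowedBucket {
  uint64_t recentCount = 0;
  int64_t recentSum = 0;
  uint64_t olderCount = 0;
  int64_t olderSum = 0;
};

// Two interchangeable ways of counting the data in a bucket.
inline uint64_t countRecent(const WindowedBucket& b) { return b.recentCount; }
inline uint64_t countAll(const WindowedBucket& b) {
  return b.recentCount + b.olderCount;
}

// Matching ways of averaging; an empty bucket reports 0.
inline double averageRecent(const WindowedBucket& b) {
  return b.recentCount == 0
      ? 0.0
      : static_cast<double>(b.recentSum) / static_cast<double>(b.recentCount);
}
inline double averageAll(const WindowedBucket& b) {
  const uint64_t n = countAll(b);
  return n == 0 ? 0.0
                : static_cast<double>(b.recentSum + b.olderSum) /
                      static_cast<double>(n);
}

// Generic consumer: sums whatever count the caller-supplied function reports.
template <typename Bucket, typename CountFn>
uint64_t totalCount(const std::vector<Bucket>& buckets, CountFn countFn) {
  uint64_t total = 0;
  for (const Bucket& b : buckets) {
    total += countFn(b);
  }
  return total;
}
```

Passing countRecent or countAll to the same generic code gives two different views of the same buckets, which is the kind of flexibility the user-defined functions are meant to provide.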
Example Usage
Say we have code that sends many requests to remote services, and want to generate a histogram showing how long the requests take. The following code will initialize a histogram with 50 buckets, tracking values between 0 and 5000. (There are 50 buckets because the bucket width is specified as 100. If the histogram range is not an even multiple of the bucket width, the last bucket will simply be shorter than the others.)
folly::Histogram<int64_t> latencies(100, 0, 5000);
The addValue() method is used to add values to the histogram. Each time a request finishes we can add its latency to the histogram:
latencies.addValue(now - startTime);
You can access each of the histogram buckets to display the overall distribution. Note that bucket 0 tracks all data points that were below the specified histogram minimum, and the last bucket tracks the data points that were above the maximum.
unsigned int numBuckets = latencies.getNumBuckets();
cout << "Below min: " << latencies.getBucketByIndex(0).count << "\n";
for (unsigned int n = 1; n < numBuckets - 1; ++n) {
  cout << latencies.getBucketMin(n) << "-" << latencies.getBucketMax(n)
       << ": " << latencies.getBucketByIndex(n).count << "\n";
}
cout << "Above max: "
     << latencies.getBucketByIndex(numBuckets - 1).count << "\n";
You can also use the getPercentileEstimate() method to estimate the value at the Nth percentile in the distribution. For example, to estimate the median, as well as the 95th and 99th percentile values:
int64_t median = latencies.getPercentileEstimate(0.5);
int64_t p95 = latencies.getPercentileEstimate(0.95);
int64_t p99 = latencies.getPercentileEstimate(0.99);
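For intuition about how such an estimate can be derived from bucket counts alone, the following sketch finds the bucket that contains the requested rank and interpolates linearly across its range. It is a simplified stand-in, not folly's algorithm: percentileEstimate is a hypothetical free function, it ignores the below-minimum and above-maximum buckets, and it assumes values are spread evenly within each bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified estimate: counts[i] covers
// [min + i * bucketSize, min + (i + 1) * bucketSize).
// Locates the bucket holding the requested rank, then interpolates linearly.
double percentileEstimate(const std::vector<uint64_t>& counts, double min,
                          double bucketSize, double pct) {
  assert(pct >= 0.0 && pct <= 1.0 && bucketSize > 0.0);
  uint64_t total = 0;
  for (uint64_t c : counts) {
    total += c;
  }
  if (total == 0) {
    return min;  // no data: fall back to the lower bound
  }
  const double target = pct * static_cast<double>(total);
  double seen = 0.0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    const double c = static_cast<double>(counts[i]);
    if (c > 0.0 && seen + c >= target) {
      // Fraction of this bucket's data that lies at or below the target rank.
      const double fraction = (target - seen) / c;
      return min + (static_cast<double>(i) + fraction) * bucketSize;
    }
    seen += c;
  }
  return min + static_cast<double>(counts.size()) * bucketSize;
}
```

The result is only as precise as the bucket width allows, so narrower buckets over the expected data range give tighter estimates.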
Thread Safety
Note that Histogram and HistogramBuckets objects are not thread-safe. If you wish to access a single Histogram from multiple threads, you must perform your own locking to ensure that multiple threads do not access it at the same time.
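One way to do your own locking is to keep the object behind a mutex and only touch it through a callback. The sketch below is an assumption-laden illustration rather than a folly facility: Locked is a hypothetical wrapper written for this example, and Tally is a stand-in for the histogram so that the code compiles without folly.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>

// Stand-in for a non-thread-safe statistics object such as a histogram.
struct Tally {
  uint64_t count = 0;
  int64_t sum = 0;
  void addValue(int64_t v) {
    ++count;
    sum += v;
  }
};

// Hypothetical wrapper, not part of folly: serializes all access to an
// object that is not thread-safe on its own.
template <typename T>
class Locked {
 public:
  explicit Locked(T value) : value_(std::move(value)) {}

  // Runs fn with exclusive access to the wrapped object and returns its
  // result. The lock is released when fn returns or throws.
  template <typename Fn>
  auto withLock(Fn&& fn) {
    std::lock_guard<std::mutex> guard(mutex_);
    return fn(value_);
  }

 private:
  std::mutex mutex_;
  T value_;
};
```

Each thread would then call something like shared.withLock([&](Tally& t) { t.addValue(latency); }), so that reads and writes never overlap. The same pattern applies when the wrapped type is a Histogram.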