Interpreting Counters

Counters: 44
File System Counters
        FILE: Number of bytes read=655771325
        FILE: Number of bytes written=984244425
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=260407668
        HDFS: Number of bytes written=17681802
        HDFS: Number of read operations=37
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=10
Job Counters
        Launched map tasks=4
        Launched reduce tasks=5
        Other local map tasks=1
        Data-local map tasks=3
        Total time spent by all maps in occupied slots (ms)=60987
        Total time spent by all reduces in occupied slots (ms)=50362
Map-Reduce Framework
        Map input records=1152870
        Map output records=22472940
        Map output bytes=282888289
        Map output materialized bytes=327843405
        Input split bytes=1173
        Combine input records=0
        Combine output records=0
        Reduce input groups=579532
        Reduce shuffle bytes=327843405
        Reduce input records=22472940
        Reduce output records=579532
        Spilled Records=67418820
        Shuffled Maps =20
        Failed Shuffles=0
        Merged Map outputs=20
        GC time elapsed (ms)=2826
        CPU time spent (ms)=69670
        Physical memory (bytes) snapshot=2287190016
        Virtual memory (bytes) snapshot=7904223232
        Total committed heap usage (bytes)=1572864000
Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
File Input Format Counters
        Bytes Read=0
File Output Format Counters
        Bytes Written=17681802

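"Counters: 44" means the job reported 44 counters in total. The unindented headings (highlighted in pink in the original post) are the counter groups; there are 6 of them.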
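To check the group and counter totals mechanically, the printed listing can be parsed. The following is a minimal sketch, assuming the layout shown above (an unindented group heading followed by indented `name=value` lines). `CounterDump`, `parse` and `totalCounters` are hypothetical helper names, not part of Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: parses the counter text printed by the job client.
class CounterDump {

    // Maps each group heading to its counters (name -> value), in order of appearance.
    static Map<String, Map<String, Long>> parse(List<String> lines) {
        Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
        Map<String, Long> current = null;
        for (String line : lines) {
            String text = line.trim();
            if (text.isEmpty() || text.startsWith("Counters:")) {
                continue; // skip blank lines and the "Counters: N" summary line
            }
            int eq = text.lastIndexOf('=');
            if (eq < 0) {
                // A line without '=' is treated as a group heading.
                current = new LinkedHashMap<>();
                groups.put(text, current);
            } else if (current != null) {
                // trim() also handles names printed with a trailing space, e.g. "Shuffled Maps =20".
                String name = text.substring(0, eq).trim();
                current.put(name, Long.parseLong(text.substring(eq + 1).trim()));
            }
        }
        return groups;
    }

    // Total number of counters across all groups.
    static int totalCounters(Map<String, Map<String, Long>> groups) {
        int total = 0;
        for (Map<String, Long> group : groups.values()) {
            total += group.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // A shortened excerpt of the listing; the summary line is adjusted to match.
        List<String> sample = List.of(
            "Counters: 5",
            "Job Counters",
            "        Launched map tasks=4",
            "        Launched reduce tasks=5",
            "Map-Reduce Framework",
            "        Shuffled Maps =20",
            "        Spilled Records=67418820",
            "        Failed Shuffles=0");
        Map<String, Map<String, Long>> groups = parse(sample);
        System.out.println(groups.size() + " groups, " + totalCounters(groups) + " counters");
    }
}
```

Run on the excerpt, this prints "2 groups, 5 counters". Fed the full listing above, the same code should report 6 groups and 44 counters (10 + 6 + 20 + 6 + 1 + 1).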

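1). File System Counters: the data an MR job works on can live on different file systems; this group records the job's read and write statistics for each file system it touches.

HDFS: Number of bytes read=260407668  //bytes the map tasks read from HDFS, including the source file content and the split metadata, so this value is normally slightly larger than FileInputFormatCounters.BYTES_READ.
FILE: Number of bytes written=984244425  //total bytes the map tasks wrote to local disk (the reduce-side merge also writes to local files)
FILE: Number of bytes read=655771325  //bytes the reduce tasks read from the local file system (map output is stored on local disk)
HDFS: Number of bytes written=17681802  //the final result written to HDFS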

2). Job Counters: statistics on the job's subtasks, i.e. the map tasks and reduce tasks

Launched map tasks=4  //number of map tasks launched
Launched reduce tasks=5  //number of reduce tasks launched

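3). Map-Reduce Framework: counters maintained by the MR framework

Map input records=1152870  //total number of lines (input records) the map tasks read from the files on HDFS
Reduce input groups=579532  //number of key groups passed to reduce, i.e. the number of distinct keys; for example <hello,{1,1}> <me,1> <you,1> is 3 groups. Note that it is Reduce input records, not this counter, that equals the number of records left after the map-side Combiner when one is used, and the number of map output records otherwise (here Reduce input records and Map output records are both 22472940).
Combine input records=0  //Combiner input = map output; it is 0 here because no Combiner ran
Spilled Records=67418820  //spills happen on both the map side and the reduce side; this is the total number of records spilled from memory to disk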
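The relationships between these framework counters can be checked with a little arithmetic on the values from the listing. The sketch below is only illustrative: `CounterRatios` and its methods are hypothetical names, and the assumption that every reduce task fetches one output segment from every map task is a rule of thumb, not something the listing itself states.

```java
// Hypothetical sanity checks on the counter values printed above; not part of Hadoop.
class CounterRatios {

    // Average number of times each map output record was written to disk,
    // counting spills on both the map side and the reduce side.
    static double spillsPerRecord(long spilledRecords, long mapOutputRecords) {
        return (double) spilledRecords / mapOutputRecords;
    }

    // Assumption: each reduce task fetches one output segment from each map task.
    static long expectedShuffledMaps(long mapTasks, long reduceTasks) {
        return mapTasks * reduceTasks;
    }

    // Average number of values the reducers receive per distinct key.
    static double recordsPerGroup(long reduceInputRecords, long reduceInputGroups) {
        return (double) reduceInputRecords / reduceInputGroups;
    }

    public static void main(String[] args) {
        long mapOutputRecords = 22472940L;   // Map output records
        long reduceInputRecords = 22472940L; // Reduce input records
        long reduceInputGroups = 579532L;    // Reduce input groups
        long spilledRecords = 67418820L;     // Spilled Records

        // With no Combiner, every map output record reaches a reducer.
        System.out.println(mapOutputRecords == reduceInputRecords);           // prints true
        // 67418820 is exactly 3 x 22472940.
        System.out.println(spillsPerRecord(spilledRecords, mapOutputRecords)); // prints 3.0
        // 4 map tasks x 5 reduce tasks matches "Shuffled Maps =20".
        System.out.println(expectedShuffledMaps(4, 5));                        // prints 20
        // Roughly 38.8 values per key on average.
        System.out.println(recordsPerGroup(reduceInputRecords, reduceInputGroups));
    }
}
```

A spill ratio well above 1, as here, means records were written to disk more than once on their way from map to reduce, which is one reason FILE bytes written is so much larger than Map output bytes.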

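4). Shuffle Errors: errors that occurred during the shuffle, broken down by type (BAD_ID, CONNECTION, IO_ERROR, WRONG_LENGTH, WRONG_MAP, WRONG_REDUCE); all of them are 0 in this run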

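5). File Input Format Counters: counters for the file input format

  Bytes Read=0  //in the map phase, the total number of bytes of all values passed to the map method, summed over all map tasks

6). File Output Format Counters: counters for the file output format

  Bytes Written=17681802  //total number of bytes the MR job wrote as output; in a word-count job this covers each [word], [space], [word count] and the [newline] at the end of every line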

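Custom counters

// A custom counter is identified by a <group name, counter name> pair.
// Here the group name is "查找hello" ("find hello") and the counter name is "hello".
Counter counter = context.getCounter("查找hello", "hello");

// string is the text of the current input record
if (string.contains("hello")) {
    counter.increment(1L); // add 1 for every record that contains "hello"
}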
