Hadoop 2.7.3 Fully Distributed Cluster Maintenance: Simple Tests
1. Testing a MapReduce Job
1.1 Upload a file to the HDFS file system
$ jps
Jps
SecondaryNameNode
JobHistoryServer
NameNode
ResourceManager
$ jps > infile
$ hadoop fs -mkdir /inputdir
$ hadoop fs -put infile /inputdir
$ hadoop fs -ls /inputdir
Found items
-rw-r--r-- hduser supergroup -- : /inputdir/infile
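Because the test input is simply the output of jps, it can be worth confirming that the expected daemons are running before uploading anything. The following is a minimal sketch, not part of the original walkthrough: the function name missing_daemons is hypothetical, and the daemon list is taken from the jps listing shown above for this node.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints every
# expected daemon that does not appear in it.
missing_daemons() {
  local listing daemon
  listing="$(cat)"
  for daemon in NameNode SecondaryNameNode ResourceManager JobHistoryServer; do
    # -w matches whole words only, so "NameNode" is not satisfied
    # by a line that contains just "SecondaryNameNode".
    printf '%s\n' "$listing" | grep -qw "$daemon" || echo "$daemon"
  done
}
```

Usage: `jps | missing_daemons` prints nothing when all four daemons are present.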
1.2 Run the word count computation
$ hadoop jar /usr/local/hadoop-2.7./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7..jar wordcount /inputdir /outputdir
// :: INFO client.RMProxy: Connecting to ResourceManager at /172.16.101.55:
// :: INFO input.FileInputFormat: Total input paths to process :
// :: INFO mapreduce.JobSubmitter: number of splits:
// :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504106569900_0001
// :: INFO impl.YarnClientImpl: Submitted application application_1504106569900_0001
// :: INFO mapreduce.Job: The url to track the job: http://sht-sgmhadoopnn-01:8088/proxy/application_1504106569900_0001/
// :: INFO mapreduce.Job: Running job: job_1504106569900_0001
// :: INFO mapreduce.Job: Job job_1504106569900_0001 running in uber mode : false
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: map % reduce %
// :: INFO mapreduce.Job: Job job_1504106569900_0001 completed successfully
// :: INFO mapreduce.Job: Counters:
File System Counters
FILE: Number of bytes read=
FILE: Number of bytes written=
FILE: Number of read operations=
FILE: Number of large read operations=
FILE: Number of write operations=
HDFS: Number of bytes read=
HDFS: Number of bytes written=
HDFS: Number of read operations=
HDFS: Number of large read operations=
HDFS: Number of write operations=
Job Counters
Launched map tasks=
Launched reduce tasks=
Data-local map tasks=
Total time spent by all maps in occupied slots (ms)=
Total time spent by all reduces in occupied slots (ms)=
Total time spent by all map tasks (ms)=
Total time spent by all reduce tasks (ms)=
Total vcore-milliseconds taken by all map tasks=
Total vcore-milliseconds taken by all reduce tasks=
Total megabyte-milliseconds taken by all map tasks=
Total megabyte-milliseconds taken by all reduce tasks=
Map-Reduce Framework
Map input records=
Map output records=
Map output bytes=
Map output materialized bytes=
Input split bytes=
Combine input records=
Combine output records=
Reduce input groups=
Reduce shuffle bytes=
Reduce input records=
Reduce output records=
Spilled Records=
Shuffled Maps =
Failed Shuffles=
Merged Map outputs=
GC time elapsed (ms)=
CPU time spent (ms)=
Physical memory (bytes) snapshot=
Virtual memory (bytes) snapshot=
Total committed heap usage (bytes)=
Shuffle Errors
BAD_ID=
CONNECTION=
IO_ERROR=
WRONG_LENGTH=
WRONG_MAP=
WRONG_REDUCE=
File Input Format Counters
Bytes Read=
File Output Format Counters
Bytes Written=
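A MapReduce job refuses to start when its output directory already exists, so repeating this test requires removing /outputdir first. The sketch below wraps the steps in a re-runnable function. The names run and run_wordcount and the DRY_RUN switch are hypothetical, and the path of the examples JAR is passed in as a parameter rather than assumed.

```shell
# Hypothetical wrapper: with DRY_RUN=1 the command is printed, not executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Re-runnable word count: clear the old output, run the job, check the marker.
run_wordcount() {
  local jar="$1" in_dir="$2" out_dir="$3"
  # -r removes recursively; -f keeps the exit status clean if the directory is missing.
  run hadoop fs -rm -r -f "$out_dir" &&
    run hadoop jar "$jar" wordcount "$in_dir" "$out_dir" &&
    # A successful job leaves an empty _SUCCESS file in the output directory.
    run hadoop fs -test -e "$out_dir/_SUCCESS"
}
```

Setting `DRY_RUN=1` before calling `run_wordcount <examples-jar> /inputdir /outputdir` shows the three commands without touching the cluster.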
1.3 View the wordcount results
$ hadoop fs -ls /outputdir
Found items
-rw-r--r-- hduser supergroup -- : /outputdir/_SUCCESS
-rw-r--r-- hduser supergroup -- : /outputdir/part-r-
$ hadoop fs -cat /outputdir/part-r-
JobHistoryServer
Jps
NameNode
ResourceManager
SecondaryNameNode
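For an input this small, the result can be cross-checked locally. The sketch below approximates what the wordcount example computes: it splits the input on whitespace, counts each distinct token, and prints token and count separated by a tab, sorted in byte order. The function name local_wordcount is hypothetical, and the exact match with the job output is an assumption that holds for plain ASCII input such as the jps listing.

```shell
# Hypothetical local equivalent of the wordcount example for small files.
local_wordcount() {
  # One token per line, drop empty lines, sort in byte order, count duplicates,
  # then print "token<TAB>count" in the same layout as the part file.
  tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
    awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

Usage: `local_wordcount < infile` can be compared line by line with the contents of the part file under /outputdir.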
2. Testing HDFS distributed storage
2.1 Upload a test file
$ ls -lh hadoop-2.7..tar.gz
-rw-r--r-- root root 205M May : hadoop-2.7..tar.gz
$ hadoop fs -put hadoop-2.7..tar.gz /inputdir
$ hadoop fs -ls -h /inputdir
Found items
-rw-r--r-- hduser supergroup 204.2 M -- : /inputdir/hadoop-2.7..tar.gz
-rw-r--r-- hduser supergroup -- : /inputdir/infile
2.2 View DataNode replica information
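The original shows no output for this step. One common way to inspect block placement is hdfs fsck with the -files, -blocks and -locations options, which lists each block of a file together with the DataNodes that hold its replicas. The sketch below also estimates how many blocks to expect. The function names are hypothetical, and the 128 MB block size is an assumption: it is the Hadoop 2.7 default for dfs.blocksize, and this cluster may be configured differently.

```shell
# Hypothetical helper: ceiling division of file size by block size.
expected_blocks() {
  local size_bytes="$1"
  local block_bytes="${2:-134217728}"  # assumed default dfs.blocksize (128 MB)
  echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
}

# Hypothetical helper: show the blocks of an HDFS file and where the replicas live.
show_replicas() {
  hdfs fsck "$1" -files -blocks -locations
}
```

For the 204.2 M archive uploaded in 2.1 (roughly 214119219 bytes, an approximation), `expected_blocks 214119219` gives 2, so fsck should report two blocks, each with as many replicas as the configured replication factor if enough DataNodes are available.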