Preparation

Operating system

CentOS 7

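Software environment

  1. JDK 1.7.0_79
  2. SSH, which normally ships with the system; if it is missing, look up how to install it

Disable the firewall

systemctl stop firewalld.service # stop firewalld
systemctl disable firewalld.service # prevent firewalld from starting at boot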

Set the hostname

[root@localhost ~]# hostname localhost

Install the environment

Install the JDK

[root@localhost ~]# tar -xzvf jdk-7u79-linux-x64.tar.gz

Configure the Java environment variables

[root@localhost ~]# vi /etc/profile
# Add the following settings
JAVA_HOME=/root/jdk1.7.0_79
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export PATH
export CLASSPATH
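
Editing /etc/profile by hand works, but a setup script that is run more than once can leave duplicate lines behind. As a minimal sketch, assuming a hypothetical helper named append_once (it is not part of the original article), the same settings could be appended only when they are not already present:

```shell
# Hypothetical helper: append line $1 to file $2 unless an identical
# line is already there. -x matches whole lines, -F disables regex.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage on the real machine (single quotes keep $JAVA_HOME literal):
#   append_once 'export JAVA_HOME=/root/jdk1.7.0_79' /etc/profile
#   append_once 'export PATH=$JAVA_HOME/bin:$PATH' /etc/profile
```

The helper only defines a function, so sourcing it changes nothing until it is called with a target file.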

Verify Java

[root@localhost ~]# source /etc/profile
[root@localhost ~]# java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

If the output above appears, Java has been installed and configured successfully.

Install Hadoop

Download Hadoop 2.6.4

Install Hadoop 2.6.4

[root@localhost ~]# tar -xzvf hadoop-2.6.4.tar.gz

Configure the Hadoop environment variables

[root@localhost ~]# vim /etc/profile
# Add the following settings
export HADOOP_HOME=/root/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[root@localhost ~]# vim /root/hadoop-2.6.4/etc/hadoop/hadoop-env.sh
# Change the following setting
# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/root/jdk1.7.0_79

Verify Hadoop

[root@localhost ~]# hadoop version
Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /root/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar

Edit the Hadoop configuration files

All configuration files are located in /root/hadoop-2.6.4/etc/hadoop.

<!-- core-site.xml -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

<!-- mapred-site.xml -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

<!-- yarn-site.xml -->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
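
Each of the four files above holds a single property, so they can also be generated by a script. The sketch below assumes a hypothetical helper write_site_xml and a CONF_DIR variable, neither of which comes from the original article. CONF_DIR defaults to a scratch directory so that a trial run does not overwrite a live configuration. Note that the Hadoop 2.6.x tarball ships mapred-site.xml only as mapred-site.xml.template, so that file has to be created either way.

```shell
# Hypothetical helper: write a one-property Hadoop site file.
# Usage: write_site_xml FILE NAME VALUE   (overwrites FILE)
write_site_xml() {
    cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>$2</name>
<value>$3</value>
</property>
</configuration>
EOF
}

# Assumption: set CONF_DIR=/root/hadoop-2.6.4/etc/hadoop on the real machine.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

write_site_xml "$CONF_DIR/core-site.xml"   fs.defaultFS                  hdfs://localhost:9000
write_site_xml "$CONF_DIR/hdfs-site.xml"   dfs.replication               1
write_site_xml "$CONF_DIR/mapred-site.xml" mapreduce.framework.name      yarn
write_site_xml "$CONF_DIR/yarn-site.xml"   yarn.nodemanager.aux-services mapreduce_shuffle

echo "Wrote site files to $CONF_DIR"
```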

Passwordless SSH login

[root@localhost ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@localhost ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Run the following command; if it does not prompt for a password, the configuration succeeded:

[root@localhost ~]# ssh localhost
Last login: Fri May 6 05:17:32 2016 from 192.168.154.1
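
If ssh localhost still asks for a password, two common causes are worth checking. Newer OpenSSH releases (7.0 and later) no longer accept DSA keys by default, and sshd's StrictModes check refuses keys when ~/.ssh or authorized_keys is writable by group or others. The following is a sketch under those assumptions; the helper names make_rsa_key and tighten_ssh_perms are hypothetical and do not appear in the original article.

```shell
# Hypothetical helper: create an RSA key pair in directory $1 if none
# exists, and authorize it once for login to the same account.
make_rsa_key() {
    mkdir -p "$1"
    [ -f "$1/id_rsa" ] || ssh-keygen -t rsa -P '' -f "$1/id_rsa"
    grep -qxF -- "$(cat "$1/id_rsa.pub")" "$1/authorized_keys" 2>/dev/null \
        || cat "$1/id_rsa.pub" >> "$1/authorized_keys"
}

# Hypothetical helper: apply the conventional restrictive modes to the
# SSH directory $1 and its authorized_keys file.
tighten_ssh_perms() {
    chmod 700 "$1"
    chmod 600 "$1/authorized_keys"
}

# Usage on the real machine:
#   make_rsa_key ~/.ssh && tighten_ssh_perms ~/.ssh
```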

Run Hadoop

Format HDFS

[root@localhost ~]# hdfs namenode -format

Start the NameNode, DataNode, and YARN

[root@localhost ~]# start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /root/hadoop-2.6.4/logs/hadoop-root-namenode-localhost.out
localhost: starting datanode, logging to /root/hadoop-2.6.4/logs/hadoop-root-datanode-localhost.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/hadoop-2.6.4/logs/hadoop-root-secondarynamenode-localhost.out
[root@localhost ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /root/hadoop-2.6.4/logs/yarn-root-resourcemanager-localhost.out
localhost: starting nodemanager, logging to /root/hadoop-2.6.4/logs/yarn-root-nodemanager-localhost.out
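
The start scripts return even if a daemon dies shortly afterwards, so it can help to confirm that all five processes are alive. The JDK's jps tool lists running Java processes as "PID ClassName". As a small sketch, a hypothetical helper missing_daemons (not from the original article) reads that output and prints whichever expected daemon is absent:

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# Hadoop daemon that is not listed. Prints nothing when all five are up.
# Matching is on the exact class name, so SecondaryNameNode does not
# count as NameNode.
missing_daemons() {
    awk '
        { seen[$2] = 1 }
        END {
            n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }'
}

# Usage on the real machine:
#   jps | missing_daemons
```

If a name is printed, the matching log file under /root/hadoop-2.6.4/logs is the place to look.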

Upload test files to HDFS

First create test1.txt and test2.txt in /root/test, containing "hello world" and "hello hadoop" respectively, and save them.
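
The two files can also be created from the shell instead of an editor. This is a minimal sketch; the function name make_test_files is hypothetical and the target directory is passed in as an argument.

```shell
# Hypothetical helper: create the two sample files in directory $1.
# echo appends a newline, so the files are 12 and 13 bytes long.
make_test_files() {
    mkdir -p "$1"
    echo "hello world"  > "$1/test1.txt"
    echo "hello hadoop" > "$1/test2.txt"
}

# Usage on the real machine:
#   make_test_files /root/test
```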

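Upload the files to the input directory in HDFS with the commands below. A relative HDFS path such as input resolves under /user/root, so create that directory first if it does not already exist:

[root@localhost ~]# hdfs dfs -mkdir -p /user/root
[root@localhost ~]# hadoop fs -put /root/test/ input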
[root@localhost ~]# hadoop fs -ls input
Found 2 items
-rw-r--r-- 1 root supergroup 12 2016-05-06 06:35 input/test1.txt
-rw-r--r-- 1 root supergroup 13 2016-05-06 06:35 input/test2.txt

Run the wordcount demo

Run the following command and wait for the result:

[root@localhost ~]# hadoop jar /root/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount input output
16/05/06 06:44:15 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/06 06:44:16 INFO input.FileInputFormat: Total input paths to process : 2
16/05/06 06:44:17 INFO mapreduce.JobSubmitter: number of splits:2
16/05/06 06:44:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1462530786445_0001
16/05/06 06:44:18 INFO impl.YarnClientImpl: Submitted application application_1462530786445_0001
16/05/06 06:44:18 INFO mapreduce.Job: The url to track the job: http://server1:8088/proxy/application_1462530786445_0001/
16/05/06 06:44:18 INFO mapreduce.Job: Running job: job_1462530786445_0001
16/05/06 06:44:33 INFO mapreduce.Job: Job job_1462530786445_0001 running in uber mode : false
16/05/06 06:44:33 INFO mapreduce.Job: map 0% reduce 0%
16/05/06 06:44:52 INFO mapreduce.Job: map 50% reduce 0%
16/05/06 06:44:53 INFO mapreduce.Job: map 100% reduce 0%
16/05/06 06:45:03 INFO mapreduce.Job: map 100% reduce 100%
16/05/06 06:45:03 INFO mapreduce.Job: Job job_1462530786445_0001 completed successfully
16/05/06 06:45:04 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=55
FILE: Number of bytes written=320242
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=249
HDFS: Number of bytes written=25
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=34487
Total time spent by all reduces in occupied slots (ms)=7744
Total time spent by all map tasks (ms)=34487
Total time spent by all reduce tasks (ms)=7744
Total vcore-milliseconds taken by all map tasks=34487
Total vcore-milliseconds taken by all reduce tasks=7744
Total megabyte-milliseconds taken by all map tasks=35314688
Total megabyte-milliseconds taken by all reduce tasks=7929856
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=41
Map output materialized bytes=61
Input split bytes=224
Combine input records=4
Combine output records=4
Reduce input groups=3
Reduce shuffle bytes=61
Reduce input records=4
Reduce output records=3
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=364
CPU time spent (ms)=3990
Physical memory (bytes) snapshot=515538944
Virtual memory (bytes) snapshot=2588155904
Total committed heap usage (bytes)=296755200
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=25
File Output Format Counters
Bytes Written=25

View the results

[root@localhost ~]# hadoop fs -ls output
Found 2 items
-rw-r--r-- 1 root supergroup 0 2016-05-06 06:45 output/_SUCCESS
-rw-r--r-- 1 root supergroup 25 2016-05-06 06:45 output/part-r-00000
[root@localhost ~]# hadoop fs -cat output/part-r-00000
hadoop 1
hello 2
world 1
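
For input this small, the job's answer can be cross-checked locally with standard text tools. The sketch below is only a rough local equivalent of the wordcount example (it splits on whitespace and counts exact matches), and the function name count_words is hypothetical.

```shell
# Hypothetical helper: count whitespace-separated words in the given
# files and print "word count" lines sorted by word.
count_words() {
    cat "$@" | tr -s '[:space:]' '\n' | grep -v '^$' \
        | LC_ALL=C sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage on the real machine:
#   count_words /root/test/test1.txt /root/test/test2.txt
```

Run against the two sample files, this should print the same three lines as the part-r-00000 output above.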

This completes the pseudo-distributed setup.


Original article. When reposting, please credit: reposted from xdlysk's blog

Permalink: Setting Up Hadoop in Pseudo-Distributed Mode [http://www.xdlysk.com/article/572c956642c817300e0f7ab1]
