This post took real effort to write; please credit the source when reposting (http://shihlei.iteye.com/blog/2084711)!

Notes

This article builds a fully distributed Hadoop CDH 5.0.1 cluster with both NameNode and ResourceManager HA; the Web Application Proxy and Job History Server are omitted.

I. Overview

(A) HDFS

1) Basic architecture

(1) NameNode (master)

  • Namespace management: supports file-system-style operations on HDFS directories, files, and blocks: create, modify, delete, list, and so on.
  • Block storage management

(2) DataNode (slave)

Stores or retrieves blocks on instruction from the NameNode and clients, and periodically reports to the NameNode which blocks of which files it holds.

2) HA architecture

Two nodes, an Active NameNode and a Standby NameNode, remove the single point of failure. The two nodes share state through JournalNodes, while ZKFC elects the Active, monitors health, and performs automatic failover.

(1) Active NameNode

Receives and handles client RPC requests, writes both its own edit log and the edit log on shared storage, and receives block reports, block location updates, and heartbeats from DataNodes.

(2) Standby NameNode

Also receives block reports, block location updates, and heartbeats from DataNodes, and replays the edit log it reads from shared storage, so that its metadata (namespace information plus the block locations map) stays in sync with the Active NameNode's. The Standby NameNode is therefore a hot standby: the moment it switches to Active it can serve as the NameNode.

(3) JournalNode

Synchronizes data between the Active and Standby NameNodes. It is a quorum of JournalNode processes, odd in number, running a Paxos-style protocol for high availability. This is the only shared-edits mechanism CDH5 supports (CDH4 also offered NFS shared storage).

(4) ZKFC

Monitors the NameNode process and performs automatic failover.
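
As a quick reference once the cluster described below is running, the ZKFC-managed HA state can be inspected from the command line. A minimal sketch, assuming the nn1/nn2 NameNode IDs configured in hdfs-site.xml later in this article (with automatic failover enabled, failovers themselves are driven by ZKFC rather than issued by hand):

Shell code

hdfs haadmin -getServiceState nn1    # prints "active" or "standby"
hdfs haadmin -getServiceState nn2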

(B) YARN

1) Basic architecture

(1) ResourceManager (RM)

Accepts client job submissions, receives and monitors resource reports from NodeManagers (NM), allocates and schedules resources, and launches and monitors ApplicationMasters (AM).

(2) NodeManager

Manages resources on its node, launches containers to run tasks, and reports resource and container status to the RM and task status to the AM.

(3) ApplicationMaster

Manages and schedules the tasks of a single application (job): requests resources from the RM, issues launch-container commands to NMs, and receives task status from the NMs.

(4) Web Application Proxy

Shields YARN from web-based attacks. It is part of the ResourceManager by default, but can be configured to run as a separate process. Access to the ResourceManager web UI assumes a trusted user; when an ApplicationMaster runs as an untrusted user, the connections it hands to the ResourceManager may be untrusted, and the Web Application Proxy blocks such connections from reaching the RM.

(5) Job History Server

When a NodeManager starts, it initializes the LogAggregationService, which collects the container logs produced on that machine (as each container finishes) and stores them in a designated HDFS directory. The ApplicationMaster writes job history to a temporary job-history directory in HDFS and moves it to the final directory when the job ends, which also enables job recovery. The History Server exposes web and RPC services through which users can query job information.

2) HA architecture

ResourceManager HA consists of an Active/Standby pair of nodes that persist internal state and per-application data and markers through an RMStateStore. The available RMStateStore implementations are the in-memory MemoryRMStateStore, the filesystem-backed FileSystemRMStateStore, and the ZooKeeper-backed ZKRMStateStore.

The ResourceManager HA architecture largely mirrors NameNode HA: shared data goes through the RMStateStore, and the failover controller runs as a service inside the ResourceManager process rather than as a standalone ZKFC.
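
The same kind of state check works for YARN once everything is up. A sketch, assuming the rm1/rm2 ResourceManager IDs configured in yarn-site.xml later in this article:

Shell code

yarn rmadmin -getServiceState rm1    # prints "active" or "standby"
yarn rmadmin -getServiceState rm2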

II. Planning

(A) Versions

| Component | Version | Notes |
| --- | --- | --- |
| JRE | java version "1.7.0_60"; Java(TM) SE Runtime Environment (build 1.7.0_60-b19); Java HotSpot(TM) Client VM (build 24.60-b09, mixed mode) | |
| Hadoop | hadoop-2.3.0-cdh5.0.1.tar.gz | Main distribution tarball |
| Zookeeper | zookeeper-3.4.5-cdh5.0.1.tar.gz | Coordination service for hot failover and for YARN state storage |

(B) Host plan

| IP | Host | Deployed modules | Processes |
| --- | --- | --- | --- |
| 8.8.8.11 | Hadoop-NN-01 | NameNode, ResourceManager | NameNode, DFSZKFailoverController, ResourceManager |
| 8.8.8.12 | Hadoop-NN-02 | NameNode, ResourceManager | NameNode, DFSZKFailoverController, ResourceManager |
| 8.8.8.13 | Hadoop-DN-01 / Zookeeper-01 | DataNode, NodeManager, Zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
| 8.8.8.14 | Hadoop-DN-02 / Zookeeper-02 | DataNode, NodeManager, Zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
| 8.8.8.15 | Hadoop-DN-03 / Zookeeper-03 | DataNode, NodeManager, Zookeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |

What each process is:

  • NameNode
  • ResourceManager
  • DFSZKFC: the DFSZKFailoverController, which monitors NameNode health and promotes the Standby NameNode on failover
  • DataNode
  • NodeManager
  • JournalNode: the shared-editlog service for the NameNodes (with NFS shared storage, this process and all of its related configuration could be omitted)
  • QuorumPeerMain: the ZooKeeper server process

(C) Directory plan

| Name | Path |
| --- | --- |
| $HADOOP_HOME | /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1 |
| Data | $HADOOP_HOME/data |
| Log | $HADOOP_HOME/logs |

III. Environment preparation

1) Disable the firewall

As root:

[root@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]# service iptables stop

iptables: Flushing firewall rules: [  OK  ]

iptables: Setting chains to policy ACCEPT: filter [  OK  ]

iptables: Unloading modules: [  OK  ]

Verify:

[root@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]# service  iptables status

iptables: Firewall is not running.
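
Note that service iptables stop only lasts until the next reboot. To keep the firewall disabled across reboots on CentOS 6, also turn off the service at boot:

Shell code

[root@CentOS-Cluster-01 ~]# chkconfig iptables off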

2) Install the JRE (details omitted; a sketch follows)
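
For completeness, a minimal sketch of that step, assuming the JDK is unpacked to /usr/runtime/java/jdk1.7.0_60 (the path hadoop-env.sh points at below):

Shell code

export JAVA_HOME=/usr/runtime/java/jdk1.7.0_60
export PATH=$JAVA_HOME/bin:$PATH
java -version    # should report 1.7.0_60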

3) Install ZooKeeper: see the companion article "Zookeeper-3.4.5-cdh5.0.1 standalone and replicated modes ..."

4) Set up passwordless SSH trust:

(1) Generate a key pair on Hadoop-NN-01:

[zero@CentOS-Cluster-01 ~]$ ssh-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/home/zero/.ssh/id_rsa):

Created directory '/home/zero/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /home/zero/.ssh/id_rsa.

Your public key has been saved in /home/zero/.ssh/id_rsa.pub.

The key fingerprint is:

28:0a:29:1d:98:56:55:db:ec:83:93:56:8a:0f:6c:ea zero@CentOS-Cluster-01

The key's randomart image is:

+--[ RSA 2048]----+

|   .....         |

| o.     +        |

|o..    . +       |

|.o .. ..*        |

|+ . .=.*So       |

|.. .o.+ . .      |

|  ..   .         |

|  .              |

|   E             |

+-----------------+

(2) Distribute the public key:

[zero@CentOS-Cluster-01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-NN-01

The authenticity of host 'hadoop-nn-01 (8.8.8.11)' can't be established.

RSA key fingerprint is a6:11:09:49:8c:fe:b2:fb:49:d5:01:fa:13:1b:32:24.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'hadoop-nn-01,8.8.8.11' (RSA) to the list of known hosts.

puppet@hadoop-nn-01's password:

Permission denied, please try again.

puppet@hadoop-nn-01's password:

[zero@CentOS-Cluster-01 ~]$  ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-NN-01

zero@hadoop-nn-01's password:

Now try logging into the machine, with "ssh 'zero@Hadoop-NN-01'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[zero@CentOS-Cluster-01 ~]$  ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-NN-02

(... output omitted ...)

[zero@CentOS-Cluster-01 ~]$  ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-DN-01

(... output omitted ...)

[zero@CentOS-Cluster-01 ~]$  ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-DN-02

(... output omitted ...)

[zero@CentOS-Cluster-01 ~]$  ssh-copy-id -i ~/.ssh/id_rsa.pub zero@Hadoop-DN-03

(... output omitted ...)

(3) Verify:

[zero@CentOS-Cluster-01 ~]$ ssh Hadoop-NN-01

Last login: Sun Jun 22 19:56:23 2014 from 8.8.8.1

[zero@CentOS-Cluster-01 ~]$ exit

logout

Connection to Hadoop-NN-01 closed.

[zero@CentOS-Cluster-01 ~]$ ssh Hadoop-NN-02

Last login: Sun Jun 22 20:03:31 2014 from 8.8.8.1

[zero@CentOS-Cluster-03 ~]$ exit

logout

Connection to Hadoop-NN-02 closed.

[zero@CentOS-Cluster-01 ~]$ ssh Hadoop-DN-01

Last login: Mon Jun 23 02:00:07 2014 from centos_cluster_01

[zero@CentOS-Cluster-03 ~]$ exit

logout

Connection to Hadoop-DN-01 closed.

[zero@CentOS-Cluster-01 ~]$ ssh Hadoop-DN-02

Last login: Sun Jun 22 20:07:03 2014 from 8.8.8.1

[zero@CentOS-Cluster-04 ~]$ exit

logout

Connection to Hadoop-DN-02 closed.

[zero@CentOS-Cluster-01 ~]$ ssh Hadoop-DN-03

Last login: Sun Jun 22 20:07:05 2014 from 8.8.8.1

[zero@CentOS-Cluster-05 ~]$ exit

logout

Connection to Hadoop-DN-03 closed.

5) Configure /etc/hosts and distribute it to every node (a push sketch follows the listing):

[root@CentOS-Cluster-01 zero]# vi /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

8.8.8.10 CentOS-StandAlone

8.8.8.11 CentOS-Cluster-01 Hadoop-NN-01

8.8.8.12 CentOS-Cluster-02 Hadoop-NN-02

8.8.8.13 CentOS-Cluster-03 Hadoop-DN-01 Zookeeper-01

8.8.8.14 CentOS-Cluster-04 Hadoop-DN-02 Zookeeper-02

8.8.8.15 CentOS-Cluster-05 Hadoop-DN-03 Zookeeper-03
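
One way to push the file out, assuming root SSH access to the peers (a sketch):

Shell code

for h in CentOS-Cluster-02 CentOS-Cluster-03 CentOS-Cluster-04 CentOS-Cluster-05; do
  scp /etc/hosts root@$h:/etc/hosts
done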

6) Configure environment variables: edit ~/.bashrc, then run source ~/.bashrc

[zero@CentOS-Cluster-01 ~]$ vi ~/.bashrc

……

# hadoop cdh5

export HADOOP_HOME=/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1

export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

[zero@CentOS-Cluster-01 ~]$ source ~/.bashrc

IV. Installation

1) Unpack

[zero@CentOS-Cluster-01 hadoop]$ tar -xvf hadoop-2.3.0-cdh5.0.1.tar.gz

2) Edit the configuration files

Overview of the files:

| File | Type | Purpose |
| --- | --- | --- |
| hadoop-env.sh | Bash script | Hadoop runtime environment variables |
| core-site.xml | XML | Hadoop core settings, e.g. I/O |
| hdfs-site.xml | XML | HDFS daemon settings: NN, JN, DN |
| yarn-env.sh | Bash script | YARN runtime environment variables |
| yarn-site.xml | XML | YARN framework settings |
| mapred-site.xml | XML | MapReduce properties |
| capacity-scheduler.xml | XML | YARN scheduler properties |
| container-executor.cfg | Cfg | YARN container executor settings |
| mapred-queues.xml | XML | MapReduce queue settings |
| hadoop-metrics.properties | Java properties | Hadoop metrics configuration |
| hadoop-metrics2.properties | Java properties | Hadoop metrics2 configuration |
| slaves | Plain text | DataNode host list |
| exclude | Plain text | Hosts to remove (decommission) from the cluster |
| log4j.properties | Java properties | Logging configuration |
| configuration.xsl | XSL stylesheet | Rendering stylesheet referenced by the XML config files |

(1) Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh:

Shell code

#--------------------Java Env------------------------------
export JAVA_HOME="/usr/runtime/java/jdk1.7.0_60"
#--------------------Hadoop Env----------------------------
#export HADOOP_PID_DIR=
export HADOOP_PREFIX="/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1"
#--------------------Hadoop Daemon Options-----------------
#export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
#export HADOOP_DATANODE_OPTS=
#--------------------Hadoop Logs---------------------------
#export HADOOP_LOG_DIR=

(2) Edit $HADOOP_HOME/etc/hadoop/core-site.xml

XML code

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- YARN needs fs.defaultFS to point at the NameNode URI -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- HDFS superuser group -->
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>zero</value>
  </property>
  <!--============================== Trash ================================= -->
  <property>
    <!-- How often the checkpointer on the NameNode creates a checkpoint from the Current directory; default 0 means it follows fs.trash.interval -->
    <name>fs.trash.checkpoint.interval</name>
    <value>0</value>
  </property>
  <property>
    <!-- Minutes before a .Trash checkpoint directory is deleted; the server-side setting takes precedence over the client's; default 0 disables trash -->
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
</configuration>
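
With fs.trash.interval set to 1440 minutes, files removed through the shell go to the per-user trash in HDFS instead of being deleted outright. For example (the file name here is illustrative):

Shell code

hadoop fs -rm /tmp/demo.txt    # moved to /user/zero/.Trash/Current, recoverable for one day
hadoop fs -expunge             # checkpoint and purge expired trash immediately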

(3) Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml

XML code

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Enable WebHDFS -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/data/dfs/name</value>
    <description>Local directory where the NameNode stores the name table (fsimage); adjust to your environment</description>
  </property>
  <property>
    <name>dfs.namenode.edits.dir</name>
    <value>${dfs.namenode.name.dir}</value>
    <description>Local directory where the NameNode stores the transaction file (edits); adjust to your environment</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/data/dfs/data</value>
    <description>Local directory where the DataNode stores blocks; adjust to your environment</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Block size (the default) -->
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
  </property>
  <!--======================================================================= -->
  <!-- HDFS high availability -->
  <!-- Logical name of the nameservice -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <!-- NameNode IDs; this version supports at most two NameNodes -->
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID], the RPC addresses -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>Hadoop-NN-01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>Hadoop-NN-02:8020</value>
  </property>
  <!-- HDFS HA: dfs.namenode.http-address.[nameservice ID], the HTTP addresses -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>Hadoop-NN-01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>Hadoop-NN-02:50070</value>
  </property>
  <!--================== NameNode editlog synchronization =================== -->
  <!-- Guarantees the edit log can be recovered -->
  <property>
    <name>dfs.journalnode.http-address</name>
    <value>0.0.0.0:8480</value>
  </property>
  <property>
    <name>dfs.journalnode.rpc-address</name>
    <value>0.0.0.0:8485</value>
  </property>
  <property>
    <!-- JournalNode quorum the QuorumJournalManager writes the editlog to -->
    <!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; ports match dfs.journalnode.rpc-address -->
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://Hadoop-DN-01:8485;Hadoop-DN-02:8485;Hadoop-DN-03:8485/mycluster</value>
  </property>
  <property>
    <!-- Where the JournalNode stores its data -->
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/data/dfs/jn</value>
  </property>
  <!--================== Client failover ==================================== -->
  <property>
    <!-- Strategy DataNodes and clients use to identify the Active NameNode -->
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!--================== NameNode fencing =================================== -->
  <!-- Prevents a stopped NameNode from coming back after failover and leaving two active services -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/zero/.ssh/id_rsa</value>
  </property>
  <property>
    <!-- Milliseconds after which fencing is considered failed -->
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <!--========== NameNode automatic failover via ZKFC and ZooKeeper ========= -->
  <!-- Enable automatic failover driven by ZooKeeper and the ZKFC processes, which watch whether the NameNode is alive -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181</value>
  </property>
  <property>
    <!-- ZooKeeper session timeout, in milliseconds -->
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>2000</value>
  </property>
</configuration>
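
After this file is distributed, a quick sanity check is to ask Hadoop which values it actually resolved (a sketch):

Shell code

hdfs getconf -confKey dfs.nameservices             # expect: mycluster
hdfs getconf -confKey dfs.ha.namenodes.mycluster   # expect: nn1,nn2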

(4) Edit $HADOOP_HOME/etc/hadoop/yarn-env.sh

Shell code

export YARN_LOG_DIR="/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs"

(5) Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml

XML code

<configuration>
  <!-- Run MapReduce applications on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- JobHistory Server ==================================================== -->
  <!-- MapReduce JobHistory Server address, default port 10020 -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <!-- MapReduce JobHistory Server web UI address, default port 19888 -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
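
This walkthrough skips the Job History Server, but with the two addresses above already in place it can be brought up later with the stock daemon script from $HADOOP_HOME/sbin (a sketch):

Shell code

mr-jobhistory-daemon.sh start historyserver    # web UI then answers on port 19888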

(6) Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml

XML code

<configuration>
  <!-- NodeManager ========================================================= -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <description>Address where the localizer IPC is.</description>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:23344</value>
  </property>
  <property>
    <description>NM Webapp address.</description>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:23999</value>
  </property>
  <!-- HA ================================================================== -->
  <!-- ResourceManager configs -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Use the embedded automatic-failover elector; in an HA setup it works with the ZKRMStateStore to handle fencing -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <!-- Cluster name, so HA elections stay scoped to this cluster -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- The RM ID can be pinned per node if needed (optional):
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm2</value>
  </property>
  -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <!-- ZKRMStateStore ====================================================== -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181</value>
  </property>
  <!-- Client RPC address (applications manager interface) -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>Hadoop-NN-01:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>Hadoop-NN-02:23140</value>
  </property>
  <!-- AM RPC address (scheduler interface) -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>Hadoop-NN-01:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>Hadoop-NN-02:23130</value>
  </property>
  <!-- RM admin interface -->
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>Hadoop-NN-01:23141</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>Hadoop-NN-02:23141</value>
  </property>
  <!-- NM RPC port -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>Hadoop-NN-01:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>Hadoop-NN-02:23125</value>
  </property>
  <!-- RM web application addresses -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>Hadoop-NN-01:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>Hadoop-NN-02:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm1</name>
    <value>Hadoop-NN-01:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm2</name>
    <value>Hadoop-NN-02:23189</value>
  </property>
</configuration>

(7) Edit slaves ($HADOOP_HOME/etc/hadoop/slaves)

[zero@CentOS-Cluster-01  hadoop]$ vi slaves

Hadoop-DN-01

Hadoop-DN-02

Hadoop-DN-03

3) Distribute the build

[zero@CentOS-Cluster-01 ~]$ scp -r /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1 zero@Hadoop-NN-02:/home/zero/hadoop/

…….

[zero@CentOS-Cluster-01 ~]$ scp -r /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1 zero@Hadoop-DN-01:/home/zero/hadoop/

…….

[zero@CentOS-Cluster-01 ~]$ scp -r /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1 zero@Hadoop-DN-02:/home/zero/hadoop/

…….

[zero@CentOS-Cluster-01 ~]$ scp -r /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1 zero@Hadoop-DN-03:/home/zero/hadoop/

…….

4) Start HDFS

(1) Start the JournalNodes

Before formatting, the JournalNode must be running on each JournalNode host:

[zero@CentOS-Cluster-03 hadoop-2.3.0-cdh5.0.1]$ hadoop-daemon.sh start journalnode

starting journalnode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-journalnode-BigData-03.out

Verify the JournalNode:

[zero@CentOS-Cluster-03 hadoop-2.3.0-cdh5.0.1]$ jps

25918 QuorumPeerMain

16728 JournalNode

16772 Jps

(2) Format the NameNode:

On node Hadoop-NN-01, run hdfs namenode -format:

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hdfs namenode -format

14/06/23 21:02:49 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = CentOS-Cluster-01/8.8.8.11

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 2.3.0-cdh5.0.1

STARTUP_MSG:   classpath =…

STARTUP_MSG:   build =…

STARTUP_MSG:   java = 1.7.0_60

************************************************************/

(... output omitted ...)

14/06/23 21:02:57 INFO common.Storage: Storage directory /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/data/dfs/name has been successfully formatted. (... output omitted ...)

14/06/23 21:02:59 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at CentOS-Cluster-01/8.8.8.11

************************************************************/

(3) Synchronize the NameNode metadata:

Copy the metadata from Hadoop-NN-01 to Hadoop-NN-02.

This mainly covers dfs.namenode.name.dir and dfs.namenode.edits.dir; also make sure the shared edits directory (dfs.namenode.shared.edits.dir) contains all of the NameNode's edit metadata.

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ scp -r data/ zero@Hadoop-NN-02:/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1

seen_txid                           100%    2     0.0KB/s   00:00

VERSION                            100%  201     0.2KB/s   00:00

seen_txid                           100%    2     0.0KB/s   00:00

fsimage_0000000000000000000.md5    100%   62     0.1KB/s   00:00

fsimage_0000000000000000000         100%  121     0.1KB/s   00:00

VERSION                            100%  201     0.2KB/s   00:00
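
Alternatively, instead of copying the data directory by hand, Hadoop can seed the standby itself: run the bootstrap command on Hadoop-NN-02 while the freshly formatted NameNode on Hadoop-NN-01 is running (a sketch):

Shell code

[zero@CentOS-Cluster-02 hadoop-2.3.0-cdh5.0.1]$ hdfs namenode -bootstrapStandby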

(4) Initialize the ZKFC:

This creates the znode that records the HA state.

On node Hadoop-NN-01, run hdfs zkfc -formatZK:

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hdfs zkfc -formatZK

14/06/23 23:22:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/06/23 23:22:28 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at Hadoop-NN-01/8.8.8.11:8020

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.0.1--1, built on 05/06/2014 18:50 GMT

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:host.name=CentOS-Cluster-01

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/runtime/java/jdk1.7.0_60/jre

14/06/23 23:22:28 INFO zookeeper.ZooKeeper: Client environment:java.class.path=...

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/lib/native

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-431.el6.i686

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:user.name=zero

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/zero

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/zero/hadoop/hadoop-2.3.0-cdh5.0.1

14/06/23 23:22:29 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Zookeeper-01:2181,Zookeeper-02:2181,Zookeeper-03:2181 sessionTimeout=2000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@150c2b0

14/06/23 23:22:29 INFO zookeeper.ClientCnxn: Opening socket connection to server CentOS-Cluster-03/8.8.8.13:2181. Will not attempt to authenticate using SASL (unknown error)

14/06/23 23:22:29 INFO zookeeper.ClientCnxn: Socket connection established to CentOS-Cluster-03/8.8.8.13:2181, initiating session

14/06/23 23:22:29 INFO zookeeper.ClientCnxn: Session establishment complete on server CentOS-Cluster-03/8.8.8.13:2181, sessionid = 0x146cc0517b30002, negotiated timeout = 4000

14/06/23 23:22:29 INFO ha.ActiveStandbyElector: Session connected.

14/06/23 23:22:30 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.

14/06/23 23:22:30 INFO zookeeper.ZooKeeper: Session: 0x146cc0517b30002 closed
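
The result can be confirmed from the ZooKeeper side with the stock CLI, found under the ZooKeeper installation's bin directory (a sketch):

Shell code

zkCli.sh -server Zookeeper-01:2181 ls /hadoop-ha    # should list: [mycluster]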

(5) Start

Cluster-wide start, from Hadoop-NN-01:

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ start-dfs.sh

14/04/23 01:54:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [Hadoop-NN-01 Hadoop-NN-02]

Hadoop-NN-01: starting namenode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-namenode-BigData-01.out

Hadoop-NN-02: starting namenode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-namenode-BigData-02.out

Hadoop-DN-01: starting datanode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-datanode-BigData-03.out

Hadoop-DN-02: starting datanode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-datanode-BigData-04.out

Hadoop-DN-03: starting datanode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-datanode-BigData-05.out

Starting journal nodes [Hadoop-DN-01 Hadoop-DN-02 Hadoop-DN-03]

Hadoop-DN-01: starting journalnode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-journalnode-BigData-03.out

Hadoop-DN-03: starting journalnode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-journalnode-BigData-05.out

Hadoop-DN-02: starting journalnode, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-journalnode-BigData-04.out

14/04/23 01:55:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting ZK Failover Controllers on NN hosts [Hadoop-NN-01 Hadoop-NN-02]

Hadoop-NN-01: starting zkfc, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-zkfc-BigData-01.out

Hadoop-NN-02: starting zkfc, logging to /home/puppet/hadoop/cdh4.4/hadoop-2.0.0-cdh4.4.0/logs/hadoop-puppet-zkfc-BigData-02.out

Per-daemon start:

<1> NameNode (Hadoop-NN-01, Hadoop-NN-02):

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hadoop-daemon.sh start namenode

<2> DataNode (Hadoop-DN-01, Hadoop-DN-02, Hadoop-DN-03):

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hadoop-daemon.sh start datanode

<3> JournalNode (Hadoop-DN-01, Hadoop-DN-02, Hadoop-DN-03):

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hadoop-daemon.sh start journalnode

<4> ZKFC (Hadoop-NN-01, Hadoop-NN-02):

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ hadoop-daemon.sh start zkfc

(6) Verify

<1> Processes
On a NameNode:

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ jps

4001 NameNode

4290 DFSZKFailoverController

4415 Jps

On a DataNode:

[zero@CentOS-Cluster-03 hadoop-2.3.0-cdh5.0.1]$ jps

25918 QuorumPeerMain

19217 JournalNode

19143 DataNode

19351 Jps
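
A quick functional smoke test against HDFS before moving on to YARN (a sketch):

Shell code

hadoop fs -mkdir /tmp
hadoop fs -put etc/hadoop/core-site.xml /tmp    # run from $HADOOP_HOME
hadoop fs -ls /tmp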

<2> Web pages:

Active node: http://8.8.8.11:50070 (NameNode overview and JSP pages; screenshots omitted)

Standby node: http://8.8.8.12:50070 (screenshots omitted)
(7) Stop: stop-dfs.sh

5) Start YARN

(1) Start

<1> Cluster-wide start

Start YARN from Hadoop-NN-01; the start-yarn.sh script lives in $HADOOP_HOME/sbin:

[zero@CentOS-Cluster-01 hadoop]$ start-yarn.sh

starting yarn daemons

starting resourcemanager, logging to /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs/yarn-zero-resourcemanager-CentOS-Cluster-01.out

Hadoop-DN-02: starting nodemanager, logging to /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs/yarn-zero-nodemanager-CentOS-Cluster-04.out

Hadoop-DN-03: starting nodemanager, logging to /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs/yarn-zero-nodemanager-CentOS-Cluster-05.out

Hadoop-DN-01: starting nodemanager, logging to /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs/yarn-zero-nodemanager-CentOS-Cluster-03.out

Start the standby ResourceManager on Hadoop-NN-02:

[zero@CentOS-Cluster-02 hadoop]$ yarn-daemon.sh start resourcemanager

starting resourcemanager, logging to /home/zero/hadoop/hadoop-2.3.0-cdh5.0.1/logs/yarn-zero-resourcemanager-CentOS-Cluster-02.o

<2> Per-daemon start

ResourceManager (Hadoop-NN-01, Hadoop-NN-02):

[zero@CentOS-Cluster-01 hadoop-2.3.0-cdh5.0.1]$ yarn-daemon.sh start resourcemanager

NodeManager (Hadoop-DN-01, Hadoop-DN-02, Hadoop-DN-03):

[zero@CentOS-Cluster-03 hadoop-2.3.0-cdh5.0.1]$ yarn-daemon.sh start nodemanager

(2) Verify

<1> Processes:

ResourceManager (Hadoop-NN-01, Hadoop-NN-02):

[zero@CentOS-Cluster-01 hadoop]$ jps

7454 NameNode

14684 Jps

14617 ResourceManager

7729 DFSZKFailoverController

NodeManager (Hadoop-DN-01, Hadoop-DN-02, Hadoop-DN-03):

[zero@CentOS-Cluster-03 ~]$ jps

8965 Jps

5228 JournalNode

5137 DataNode

2514 QuorumPeerMain

8935 NodeManager

<2> Web pages

ResourceManager (Active): http://8.8.8.11:23188

ResourceManager (Standby): http://8.8.8.12:23188
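
To exercise the whole stack end to end, the bundled example job can be submitted; the examples jar ships under $HADOOP_HOME/share/hadoop/mapreduce (a sketch; the jar name is assumed to match the CDH version):

Shell code

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0-cdh5.0.1.jar pi 2 10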



 

(3) Stop

On Hadoop-NN-01: stop-yarn.sh

On Hadoop-NN-02: yarn-daemon.sh stop resourcemanager
