Setting Up Hadoop HA
1. First, add the host entries to /etc/hosts on every node
- vim /etc/hosts
- 192.168.0.1 MSJTVL-DSJC-H01
- 192.168.0.2 MSJTVL-DSJC-H03
- 192.168.0.3 MSJTVL-DSJC-H05
- 192.168.0.4 MSJTVL-DSJC-H02
- 192.168.0.5 MSJTVL-DSJC-H04
2. Establish passwordless SSH trust among the machines
- Setup passphraseless ssh
- Now check that you can ssh to the localhost without a passphrase:
- $ ssh localhost
- If you cannot ssh to localhost without a passphrase, execute the following commands:
- $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
- $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Copy the public key files from the other machines into the authorized_keys file on MSJTVL-DSJC-H01:
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H02:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub2
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H03:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub3
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H04:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub4
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H05:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub5
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub2 >> ~/.ssh/authorized_keys
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub3 >> ~/.ssh/authorized_keys
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub4 >> ~/.ssh/authorized_keys
- [hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub5 >> ~/.ssh/authorized_keys
The steps above let MSJTVL-DSJC-H02 through H05 log in to MSJTVL-DSJC-H01 without a password.
To establish full mutual trust among MSJTVL-DSJC-H01 through H05, copy the authorized_keys file from MSJTVL-DSJC-H01 to each of the other machines:
- [hadoop@MSJTVL-DSJC-H02 ~]$ scp hadoop@MSJTVL-DSJC-H01:/hadoop/.ssh/authorized_keys /hadoop/.ssh/authorized_keys
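A quick sanity check is to loop over all five hosts and confirm each login prints a hostname without prompting (a minimal sketch; BatchMode makes ssh fail instead of asking for a password if trust is missing):
- for h in MSJTVL-DSJC-H01 MSJTVL-DSJC-H02 MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do
-     ssh -o BatchMode=yes hadoop@"$h" hostname
- done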
Download the corresponding tarball:
- wget http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
Extract the tarball and create a symlink:
- [hadoop@MSJTVL-DSJC-H01 ~]$ tar -zxvf hadoop-2.6.4.tar.gz
- [hadoop@MSJTVL-DSJC-H01 ~]$ ln -sf hadoop-2.6.4 hadoop
Go to the Hadoop configuration directory and edit hadoop-env.sh:
- [hadoop@MSJTVL-DSJC-H01 ~]$ cd hadoop/etc/hadoop/
- [hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hadoop-env.sh
Set the JAVA_HOME variable in hadoop-env.sh.
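The line typically looks like the following (the JDK path here is an assumption; substitute the actual path on your nodes):
- export JAVA_HOME=/usr/java/jdk1.7.0_79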
Next, edit hdfs-site.xml (reference: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html).
First, configure the logical nameservice name, dfs.nameservices:
- [hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hdfs-site.xml
- <configuration>
- <!-- Logical name of the nameservice; change it as needed -->
- <property>
- <name>dfs.nameservices</name>
- <value>mycluster</value>
- </property>
- <!-- NameNode IDs within the nameservice; "mycluster" must match dfs.nameservices, while nn1 and nn2 are arbitrary labels -->
- <property>
- <name>dfs.ha.namenodes.mycluster</name>
- <value>nn1,nn2</value>
- </property>
- <!-- RPC address and port of each NameNode; change the nameservice in the property name and the hostnames in the values (MSJTVL-DSJC-H01 and MSJTVL-DSJC-H02 are the two NameNode hosts) -->
- <property>
- <name>dfs.namenode.rpc-address.mycluster.nn1</name>
- <value>MSJTVL-DSJC-H01:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.mycluster.nn2</name>
- <value>MSJTVL-DSJC-H02:8020</value>
- </property>
- <!-- HTTP address and port of each NameNode -->
- <property>
- <name>dfs.namenode.http-address.mycluster.nn1</name>
- <value>MSJTVL-DSJC-H01:50070</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.mycluster.nn2</name>
- <value>MSJTVL-DSJC-H02:50070</value>
- </property>
- <!-- URI of the JournalNode quorum that stores the shared edit log -->
- <property>
- <name>dfs.namenode.shared.edits.dir</name>
- <value>qjournal://MSJTVL-DSJC-H03:8485;MSJTVL-DSJC-H04:8485;MSJTVL-DSJC-H05:8485/mycluster</value>
- </property>
- <!-- Fixed class that HDFS clients use to locate the active NameNode; change the nameservice suffix in the property name to match -->
- <property>
- <name>dfs.client.failover.proxy.provider.mycluster</name>
- <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
- </property>
- <!-- sshfence: SSH to the active NameNode and kill the process; note the private key is the one generated earlier in the hadoop user's .ssh directory -->
- <property>
- <name>dfs.ha.fencing.methods</name>
- <value>sshfence</value>
- </property>
- <property>
- <name>dfs.ha.fencing.ssh.private-key-files</name>
- <value>/hadoop/.ssh/id_dsa</value>
- </property>
- <!-- Working directory of the JournalNodes -->
- <property>
- <name>dfs.journalnode.edits.dir</name>
- <value>/hadoop/jn/data</value>
- </property>
- <!-- Enable automatic NameNode failover -->
- <property>
- <name>dfs.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
- </configuration>
Next, edit core-site.xml:
- <!-- Entry point for the NameNodes; again, note the nameservice name -->
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://mycluster</value>
- </property>
- <!-- The ZooKeeper quorum -->
- <property>
- <name>ha.zookeeper.quorum</name>
- <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
- </property>
- <!-- Hadoop temporary directory -->
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/hadoop/tmp</value>
- </property>
Configure the slaves file (the DataNode hosts):
- MSJTVL-DSJC-H03
- MSJTVL-DSJC-H04
- MSJTVL-DSJC-H05
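Every node needs the same configuration, so push the edited files out from MSJTVL-DSJC-H01 (a sketch; it assumes the same /hadoop/hadoop install path on every host):
- cd /hadoop/hadoop/etc/hadoop
- for h in MSJTVL-DSJC-H02 MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do
-     scp hadoop-env.sh hdfs-site.xml core-site.xml slaves hadoop@"$h":/hadoop/hadoop/etc/hadoop/
- done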
Install ZooKeeper
Simply extract the tarball, then edit its configuration file:
- [zookeeper@MSJTVL-DSJC-H03 conf]$ vim zoo.cfg
- # Set dataDir=/opt/zookeeper/data; do not put it under tmp
- dataDir=/opt/zookeeper/data
- #autopurge.purgeInterval=1
- server.1=MSJTVL-DSJC-H03:2888:3888
- server.2=MSJTVL-DSJC-H04:2888:3888
- server.3=MSJTVL-DSJC-H05:2888:3888
- Under /opt/zookeeper/data, create a myid file containing the number that matches the host's server.N entry.
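For example, on MSJTVL-DSJC-H03 (server.1 above; write 2 and 3 on MSJTVL-DSJC-H04 and MSJTVL-DSJC-H05 respectively):
- echo 1 > /opt/zookeeper/data/myid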
Start ZooKeeper on each of the three hosts with zkServer.sh start, and use jps to verify it is running (the process appears as QuorumPeerMain).
Starting the HA Cluster
1. First start the JournalNodes. From the sbin directory on each JournalNode host, run ./hadoop-daemon.sh start journalnode:
- [hadoop@MSJTVL-DSJC-H03 sbin]$ ./hadoop-daemon.sh start journalnode
- starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
- [hadoop@MSJTVL-DSJC-H03 sbin]$ jps
- 3204 JournalNode
- 3252 Jps
- [hadoop@MSJTVL-DSJC-H03 sbin]$
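Repeat on the other two JournalNode hosts, or drive it from one host (a sketch assuming the passwordless ssh set up earlier and the same install path everywhere):
- for h in MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do
-     ssh hadoop@"$h" /hadoop/hadoop/sbin/hadoop-daemon.sh start journalnode
- done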
2. Format HDFS on one of the NameNodes:
- [hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs namenode -format
Formatting creates the metadata files under /hadoop/tmp/dfs/name/current:
- [hadoop@MSJTVL-DSJC-H01 ~]$ cd tmp/
- [hadoop@MSJTVL-DSJC-H01 tmp]$ ll
- total 4
- drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 dfs
- [hadoop@MSJTVL-DSJC-H01 tmp]$ cd dfs/
- [hadoop@MSJTVL-DSJC-H01 dfs]$ ll
- total 4
- drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 name
- [hadoop@MSJTVL-DSJC-H01 dfs]$ cd name/
- [hadoop@MSJTVL-DSJC-H01 name]$ ll
- total 4
- drwxr-xr-x. 2 hadoop hadoop 4096 Sep  6 16:54 current
- [hadoop@MSJTVL-DSJC-H01 name]$ cd current/
- [hadoop@MSJTVL-DSJC-H01 current]$ ll
- total 16
- -rw-r--r--. 1 hadoop hadoop 352 Sep  6 16:54 fsimage_0000000000000000000
- -rw-r--r--. 1 hadoop hadoop  62 Sep  6 16:54 fsimage_0000000000000000000.md5
- -rw-r--r--. 1 hadoop hadoop   2 Sep  6 16:54 seen_txid
- -rw-r--r--. 1 hadoop hadoop 201 Sep  6 16:54 VERSION
- [hadoop@MSJTVL-DSJC-H01 current]$ pwd
- /hadoop/tmp/dfs/name/current
- [hadoop@MSJTVL-DSJC-H01 current]$
3. Copy the freshly formatted metadata to the other NameNode. Before copying, start the NameNode that was just formatted:
- [hadoop@MSJTVL-DSJC-H01 sbin]$ ./hadoop-daemon.sh start namenode
- starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
- [hadoop@MSJTVL-DSJC-H01 sbin]$ jps
- 3324 NameNode
- 3396 Jps
- [hadoop@MSJTVL-DSJC-H01 sbin]$
Then, on the NameNode that was not formatted, run hdfs namenode -bootstrapStandby; when it finishes, identical metadata files on both nodes indicate success.
- [hadoop@MSJTVL-DSJC-H02 bin]$ hdfs namenode -bootstrapStandby
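One way to verify is to checksum the fsimage on both NameNodes and compare the results (a sketch; the filename follows the listing above and will change as the cluster runs):
- md5sum /hadoop/tmp/dfs/name/current/fsimage_0000000000000000000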
4. Initialize ZKFC: run hdfs zkfc -formatZK on one of the machines to create the HA state znode in ZooKeeper.
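For example, from the same bin directory used for formatting:
- ./hdfs zkfc -formatZK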
5. Restart the entire HDFS cluster:
- [hadoop@MSJTVL-DSJC-H01 sbin]$ ./start-dfs.sh
- 16/09/06 17:10:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- Starting namenodes on [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
- MSJTVL-DSJC-H02: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H02.out
- MSJTVL-DSJC-H01: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
- MSJTVL-DSJC-H03: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H03.out
- MSJTVL-DSJC-H04: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H04.out
- MSJTVL-DSJC-H05: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H05.out
- Starting journal nodes [MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05]
- MSJTVL-DSJC-H03: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
- MSJTVL-DSJC-H04: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H04.out
- MSJTVL-DSJC-H05: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H05.out
- 16/09/06 17:10:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- Starting ZK Failover Controllers on NN hosts [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
- MSJTVL-DSJC-H02: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H02.out
- MSJTVL-DSJC-H01: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H01.out
- [hadoop@MSJTVL-DSJC-H01 sbin]$ jps
- 4345 Jps
- 4279 DFSZKFailoverController
- 3993 NameNode
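You can now confirm which NameNode is active, using the NameNode IDs configured in hdfs-site.xml; one should report active and the other standby:
- ./hdfs haadmin -getServiceState nn1
- ./hdfs haadmin -getServiceState nn2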
6. Create a directory and upload a file:
- ./hdfs dfs -mkdir -p /usr/file
- ./hdfs dfs -put /hadoop/tian.txt /usr/file
Once a file is uploaded, it can be browsed in the NameNode web UI.
MR High Availability
Configure yarn-site.xml:
- <configuration>
- <!-- Enable RM high availability -->
- <property>
- <name>yarn.resourcemanager.ha.enabled</name>
- <value>true</value>
- </property>
- <!-- Identifier for the RM cluster -->
- <property>
- <name>yarn.resourcemanager.cluster-id</name>
- <value>rm-cluster</value>
- </property>
- <!-- Logical IDs for the two RM hosts -->
- <property>
- <name>yarn.resourcemanager.ha.rm-ids</name>
- <value>rm1,rm2</value>
- </property>
- <!-- Automatic RM failover -->
- <property>
- <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
- <!-- Automatic RM state recovery -->
- <property>
- <name>yarn.resourcemanager.recovery.enabled</name>
- <value>true</value>
- </property>
- <!-- RM host 1 -->
- <property>
- <name>yarn.resourcemanager.hostname.rm1</name>
- <value>MSJTVL-DSJC-H01</value>
- </property>
- <!-- RM host 2 -->
- <property>
- <name>yarn.resourcemanager.hostname.rm2</name>
- <value>MSJTVL-DSJC-H02</value>
- </property>
- <!-- RM state store: either memory-based (MemoryRMStateStore) or ZooKeeper-based (ZKRMStateStore) -->
- <property>
- <name>yarn.resourcemanager.store.class</name>
- <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
- </property>
- <!-- ZooKeeper quorum used to store RM state -->
- <property>
- <name>yarn.resourcemanager.zk-address</name>
- <value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
- </property>
- <!-- Scheduler addresses for requesting resources from the RMs -->
- <property>
- <name>yarn.resourcemanager.scheduler.address.rm1</name>
- <value>MSJTVL-DSJC-H01:8030</value>
- </property>
- <property>
- <name>yarn.resourcemanager.scheduler.address.rm2</name>
- <value>MSJTVL-DSJC-H02:8030</value>
- </property>
- <!-- Addresses NodeManagers use to exchange information with the RMs -->
- <property>
- <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
- <value>MSJTVL-DSJC-H01:8031</value>
- </property>
- <property>
- <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
- <value>MSJTVL-DSJC-H02:8031</value>
- </property>
- <!-- Addresses clients use to submit applications to the RMs -->
- <property>
- <name>yarn.resourcemanager.address.rm1</name>
- <value>MSJTVL-DSJC-H01:8032</value>
- </property>
- <property>
- <name>yarn.resourcemanager.address.rm2</name>
- <value>MSJTVL-DSJC-H02:8032</value>
- </property>
- <!-- Addresses administrators use to send management commands to the RMs -->
- <property>
- <name>yarn.resourcemanager.admin.address.rm1</name>
- <value>MSJTVL-DSJC-H01:8033</value>
- </property>
- <property>
- <name>yarn.resourcemanager.admin.address.rm2</name>
- <value>MSJTVL-DSJC-H02:8033</value>
- </property>
- <!-- RM web UI addresses for viewing cluster information -->
- <property>
- <name>yarn.resourcemanager.webapp.address.rm1</name>
- <value>MSJTVL-DSJC-H01:8088</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address.rm2</name>
- <value>MSJTVL-DSJC-H02:8088</value>
- </property>
- </configuration>
Configure mapred-site.xml:
- <!-- Use YARN as the MapReduce framework -->
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
The standby ResourceManager does not start automatically and must be started by hand:
- [hadoop@MSJTVL-DSJC-H02 sbin]$ yarn-daemon.sh start resourcemanager
- starting resourcemanager, logging to /hadoop/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-MSJTVL-DSJC-H02.out
- [hadoop@MSJTVL-DSJC-H02 sbin]$ jps
- 3000 ResourceManager
- 2812 NameNode
- 3055 Jps
- 2922 DFSZKFailoverController
- [hadoop@MSJTVL-DSJC-H02 sbin]$
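As with HDFS, the RM HA state can be checked using the rm-ids configured above; one RM should report active and the other standby:
- yarn rmadmin -getServiceState rm1
- yarn rmadmin -getServiceState rm2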