1. First, add the cluster hosts to the hosts file

vim /etc/hosts

192.168.0.1 MSJTVL-DSJC-H01
192.168.0.2 MSJTVL-DSJC-H03
192.168.0.3 MSJTVL-DSJC-H05
192.168.0.4 MSJTVL-DSJC-H02
192.168.0.5 MSJTVL-DSJC-H04
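
The same entries are assumed to be present in /etc/hosts on all five nodes; one hypothetical way to distribute them (assuming root SSH access):

scp /etc/hosts root@MSJTVL-DSJC-H02:/etc/hosts   # repeat for H03, H04, and H05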

2. Set up passwordless SSH trust between the machines

Setup passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

  $ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:

  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Copy the public key files from the other machines into the authorized_keys file on MSJTVL-DSJC-H01:

[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H02:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub2
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H03:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub3
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H04:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub4
[hadoop@MSJTVL-DSJC-H01 .ssh]$ scp hadoop@MSJTVL-DSJC-H05:/hadoop/.ssh/id_dsa.pub ./id_dsa.pub5
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub2 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub3 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub4 >> ~/.ssh/authorized_keys
[hadoop@MSJTVL-DSJC-H01 .ssh]$ cat ~/.ssh/id_dsa.pub5 >> ~/.ssh/authorized_keys

The steps above let MSJTVL-DSJC-H02, H03, H04, and H05 log in to MSJTVL-DSJC-H01 without a password.

To establish full mutual trust among all five nodes, copy the authorized_keys file from MSJTVL-DSJC-H01 back to the other machines:

[hadoop@MSJTVL-DSJC-H02 ~]$ scp hadoop@MSJTVL-DSJC-H01:/hadoop/.ssh/authorized_keys /hadoop/.ssh/authorized_keys
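
Repeat the scp for MSJTVL-DSJC-H03, H04, and H05. A quick sketch to verify full mutual trust (run on each host; every command should print a hostname without prompting for a password):

for h in MSJTVL-DSJC-H01 MSJTVL-DSJC-H02 MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05; do ssh $h hostname; done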


Download the Hadoop tarball

wget http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

Extract the tarball and create a symlink to it

[hadoop@MSJTVL-DSJC-H01 ~]$ tar -zxvf hadoop-2.6.4.tar.gz
[hadoop@MSJTVL-DSJC-H01 ~]$ ln -sf hadoop-2.6.4 hadoop

Go into the Hadoop configuration directory and edit hadoop-env.sh

[hadoop@MSJTVL-DSJC-H01 ~]$ cd hadoop/etc/hadoop/
[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hadoop-env.sh

Set the JAVA_HOME variable in hadoop-env.sh.
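
For example (the JDK path below is an assumption; point it at your actual installation):

export JAVA_HOME=/usr/java/jdk1.7.0_79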

Next, edit the relevant settings in hdfs-site.xml, following http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

First, configure the logical nameservice, dfs.nameservices

[hadoop@MSJTVL-DSJC-H01 hadoop]$ vim hdfs-site.xml
<configuration>
<!-- Name of the logical nameservice; change as appropriate -->
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- IDs of the NameNodes; "mycluster" must match dfs.nameservices above, while nn1 and nn2 are arbitrary labels -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC addresses and ports of the two NameNodes; adjust the nameservice and hostnames. MSJTVL-DSJC-H01 and MSJTVL-DSJC-H02 are the two NameNode hosts -->
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>MSJTVL-DSJC-H01:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>MSJTVL-DSJC-H02:8020</value>
</property>
<!-- HTTP hosts and ports of the two NameNodes -->
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>MSJTVL-DSJC-H01:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>MSJTVL-DSJC-H02:50070</value>
</property>
<!-- URI of the JournalNode quorum -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://MSJTVL-DSJC-H03:8485;MSJTVL-DSJC-H04:8485;MSJTVL-DSJC-H05:8485/mycluster</value>
</property>
<!-- Fixed class that HDFS clients use to locate the active NameNode (adjust the nameservice in the property name) -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- sshfence: SSH to the active NameNode and kill the process; the private key is the one generated in the hadoop user's .ssh directory -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/hadoop/.ssh/id_dsa</value>
</property>
<!-- Working directory of the JournalNodes -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hadoop/jn/data</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>

Next, edit the core-site.xml configuration file

<!-- Client entry point for the NameNodes; again, the nameservice name must match hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<!-- The ZooKeeper quorum -->
<property>
<name>ha.zookeeper.quorum</name>
<value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
</property>

Configure the slaves file (the hosts that will run DataNodes):

MSJTVL-DSJC-H03
MSJTVL-DSJC-H04
MSJTVL-DSJC-H05

Install ZooKeeper

Simply extract the tarball.

Then edit its configuration file

[zookeeper@MSJTVL-DSJC-H03 conf]$ vim zoo.cfg

# Set dataDir=/opt/zookeeper/data -- do not leave it under /tmp
dataDir=/opt/zookeeper/data
#autopurge.purgeInterval=1
server.1=MSJTVL-DSJC-H03:2888:3888
server.2=MSJTVL-DSJC-H04:2888:3888
server.3=MSJTVL-DSJC-H05:2888:3888

On each node, create a myid file under /opt/zookeeper/data containing the same number as that node's server.N entry, as shown below.
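
A minimal sketch of creating the myid files, with the numbers matching the server.N lines above:

[zookeeper@MSJTVL-DSJC-H03 ~]$ echo 1 > /opt/zookeeper/data/myid
[zookeeper@MSJTVL-DSJC-H04 ~]$ echo 2 > /opt/zookeeper/data/myid
[zookeeper@MSJTVL-DSJC-H05 ~]$ echo 3 > /opt/zookeeper/data/myid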

Start ZooKeeper on each of the three nodes (zkServer.sh start) and check the processes with jps.
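
For instance, on each of the three nodes (the status output is what you would expect once a quorum forms):

[zookeeper@MSJTVL-DSJC-H03 ~]$ zkServer.sh start
[zookeeper@MSJTVL-DSJC-H03 ~]$ zkServer.sh status   # one node reports leader, the other two follower
[zookeeper@MSJTVL-DSJC-H03 ~]$ jps                  # QuorumPeerMain is the ZooKeeper server process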

Start the HA cluster

1. First, start the JournalNodes (on MSJTVL-DSJC-H03, H04, and H05); go to the sbin directory and run:

./hadoop-daemon.sh start journalnode

[hadoop@MSJTVL-DSJC-H03 sbin]$ ./hadoop-daemon.sh start journalnode
starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
[hadoop@MSJTVL-DSJC-H03 sbin]$ jps
3204 JournalNode
3252 Jps
[hadoop@MSJTVL-DSJC-H03 sbin]$

2. Format HDFS on one of the NameNodes

[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs namenode -format

After formatting, the corresponding metadata files appear under /hadoop/tmp/dfs/name/current:

[hadoop@MSJTVL-DSJC-H01 ~]$ cd tmp/
[hadoop@MSJTVL-DSJC-H01 tmp]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 dfs
[hadoop@MSJTVL-DSJC-H01 tmp]$ cd dfs/
[hadoop@MSJTVL-DSJC-H01 dfs]$ ll
total 4
drwxr-xr-x. 3 hadoop hadoop 4096 Sep  6 16:54 name
[hadoop@MSJTVL-DSJC-H01 dfs]$ cd name/
[hadoop@MSJTVL-DSJC-H01 name]$ ll
total 4
drwxr-xr-x. 2 hadoop hadoop 4096 Sep  6 16:54 current
[hadoop@MSJTVL-DSJC-H01 name]$ cd current/
[hadoop@MSJTVL-DSJC-H01 current]$ ll
total 16
-rw-r--r--. 1 hadoop hadoop 352 Sep  6 16:54 fsimage_0000000000000000000
-rw-r--r--. 1 hadoop hadoop  62 Sep  6 16:54 fsimage_0000000000000000000.md5
-rw-r--r--. 1 hadoop hadoop   2 Sep  6 16:54 seen_txid
-rw-r--r--. 1 hadoop hadoop 201 Sep  6 16:54 VERSION
[hadoop@MSJTVL-DSJC-H01 current]$ pwd
/hadoop/tmp/dfs/name/current
[hadoop@MSJTVL-DSJC-H01 current]$

3. Replicate the newly formatted metadata to the other NameNode. Before copying, the NameNode that was just formatted must be started:

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./hadoop-daemon.sh start namenode
starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
3324 NameNode
3396 Jps
[hadoop@MSJTVL-DSJC-H01 sbin]$

Then, on the NameNode that was not formatted, run hdfs namenode -bootstrapStandby. When it finishes, the metadata files on the two NameNodes should be identical; matching files indicate success.

[hadoop@MSJTVL-DSJC-H02 bin]$ hdfs namenode -bootstrapStandby
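
One quick, informal check is to compare the clusterID recorded in the VERSION file on both NameNodes; the two lines should match:

[hadoop@MSJTVL-DSJC-H01 ~]$ grep clusterID /hadoop/tmp/dfs/name/current/VERSION
[hadoop@MSJTVL-DSJC-H02 ~]$ grep clusterID /hadoop/tmp/dfs/name/current/VERSION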


4. Initialize the ZKFC: run hdfs zkfc -formatZK on any one machine (ZooKeeper must be running).
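
For example, from the bin directory on the first NameNode:

[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs zkfc -formatZK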

5. Restart the entire HDFS cluster

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./start-dfs.sh
16/09/06 17:10:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting namenode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-MSJTVL-DSJC-H01.out
MSJTVL-DSJC-H03: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting datanode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-MSJTVL-DSJC-H05.out
Starting journal nodes [MSJTVL-DSJC-H03 MSJTVL-DSJC-H04 MSJTVL-DSJC-H05]
MSJTVL-DSJC-H03: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H03.out
MSJTVL-DSJC-H04: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H04.out
MSJTVL-DSJC-H05: starting journalnode, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-journalnode-MSJTVL-DSJC-H05.out
16/09/06 17:10:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [MSJTVL-DSJC-H01 MSJTVL-DSJC-H02]
MSJTVL-DSJC-H02: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H02.out
MSJTVL-DSJC-H01: starting zkfc, logging to /hadoop/hadoop-2.6.4/logs/hadoop-hadoop-zkfc-MSJTVL-DSJC-H01.out
[hadoop@MSJTVL-DSJC-H01 sbin]$ jps
4345 Jps
4279 DFSZKFailoverController
3993 NameNode

6. Create a directory and put a file into it

./hdfs dfs -mkdir -p /usr/file
./hdfs dfs -put /hadoop/tian.txt /usr/file

Once the file is uploaded, it can be viewed through the NameNode web UI (port 50070, configured above).
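
You can also confirm from the command line which NameNode is active; nn1 and nn2 are the IDs configured in hdfs-site.xml, and the output shown is the expected result:

[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs haadmin -getServiceState nn1
active
[hadoop@MSJTVL-DSJC-H01 bin]$ ./hdfs haadmin -getServiceState nn2
standby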

MapReduce (ResourceManager) HA

Configure yarn-site.xml

<configuration>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Identifier of the RM cluster -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>rm-cluster</value>
</property>
<!-- IDs of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Automatic failover between the RMs -->
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Recover RM state after a failure -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- RM host 1 -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>MSJTVL-DSJC-H01</value>
</property>
<!-- RM host 2 -->
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>MSJTVL-DSJC-H02</value>
</property>
<!-- How RM state is stored: either in memory (MemStore) or in ZooKeeper (ZKStore); here ZooKeeper -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- ZooKeeper quorum used to store RM state -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>MSJTVL-DSJC-H03:2181,MSJTVL-DSJC-H04:2181,MSJTVL-DSJC-H05:2181</value>
</property>
<!-- Scheduler addresses -->
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>MSJTVL-DSJC-H01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>MSJTVL-DSJC-H02:8030</value>
</property>
<!-- Addresses through which the NodeManagers exchange information with the RM -->
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>MSJTVL-DSJC-H01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>MSJTVL-DSJC-H02:8031</value>
</property>
<!-- Addresses through which clients submit applications to the RM -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>MSJTVL-DSJC-H01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>MSJTVL-DSJC-H02:8032</value>
</property>
<!-- Addresses through which administrators send management commands to the RM -->
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>MSJTVL-DSJC-H01:8033</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>MSJTVL-DSJC-H02:8033</value>
</property>
<!-- RM web UI addresses for viewing cluster information -->
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>MSJTVL-DSJC-H01:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>MSJTVL-DSJC-H02:8088</value>
</property>
</configuration>


Configure mapred-site.xml

<!-- Run MapReduce on the YARN framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
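
YARN is then started from the sbin directory on the active ResourceManager host; a minimal sketch (start-yarn.sh brings up the local ResourceManager and the NodeManagers listed in slaves):

[hadoop@MSJTVL-DSJC-H01 sbin]$ ./start-yarn.sh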

The standby ResourceManager must be started manually:

[hadoop@MSJTVL-DSJC-H02 sbin]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /hadoop/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-MSJTVL-DSJC-H02.out
[hadoop@MSJTVL-DSJC-H02 sbin]$ jps
3000 ResourceManager
2812 NameNode
3055 Jps
2922 DFSZKFailoverController
[hadoop@MSJTVL-DSJC-H02 sbin]$
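
As with HDFS, you can check the failover state of each ResourceManager; rm1 and rm2 are the IDs configured above, and the output shown is the expected result:

[hadoop@MSJTVL-DSJC-H01 bin]$ ./yarn rmadmin -getServiceState rm1
active
[hadoop@MSJTVL-DSJC-H01 bin]$ ./yarn rmadmin -getServiceState rm2
standby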

