hadoop HA架构安装部署(QJM HA)
###################HDFS High Availability Using the Quorum Journal Manager################################
规划集群
db01 db02 db03 db04 db05
namenode namenode
journalnode journalnode journalnode
datanode datanode datanode datanode datanode
编辑配置文件core-site.xml:
[hadoop@db01 hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.5.0/data/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>7000</value>
</property>
</configuration>
编辑配置文件hdfs-site.xml:
[hadoop@db01 hadoop]$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>db01:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>db02:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>db01:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>db02:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://db01:8485;db02:8485;db03:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/hadoop-2.5.0/data/dfs/jn</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_dsa</value>
</property>
</configuration>
同步文件:
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1140 1.1KB/s 00:00
hdfs-site.xml 100% 2067 2.0KB/s 00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db03:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1140 1.1KB/s 00:00
hdfs-site.xml 100% 2067 2.0KB/s 00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db04:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1140 1.1KB/s 00:00
hdfs-site.xml 100% 2067 2.0KB/s 00:00
[hadoop@db01 hadoop]$ scp core-site.xml hdfs-site.xml hadoop@db05:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1140 1.1KB/s 00:00
hdfs-site.xml
启动集群:
1)启动journalnode服务
[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
738 Jps
688 JournalNode
[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
16813 Jps
16763 JournalNode
[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
16813 Jps
16763 JournalNode
2)格式化hdfs文件系统
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs namenode -format
在nn1上启动namenode:
[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out
3)在nn2节点上同步nn1节点元数据(也可以直接cp元数据)
[hadoop@db02 hadoop-2.5.0]$ bin/hdfs namenode -bootstrapStandby
4)启动nn2上的namenode服务
[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out
5)启动所有的datanode服务
[hadoop@db01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
1255 DataNode
1001 NameNode
1339 Jps
688 JournalNode
[hadoop@db02 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
[hadoop@db02 hadoop-2.5.0]$ jps
17112 DataNode
17193 Jps
16763 JournalNode
16956 NameNode
[hadoop@db03 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
[hadoop@db03 hadoop-2.5.0]$ jps
15813 JournalNode
15995 Jps
15921 DataNode
[hadoop@db04 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
[hadoop@db04 hadoop-2.5.0]$ jps
14660 DataNode
14734 Jps
[hadoop@db05 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
[hadoop@db05 hadoop-2.5.0]$ jps
22165 DataNode
22239 Jps
6)将nn1切换成active状态
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn1
17/03/12 03:04:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 03:06:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 03:06:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$
测试hfds文件系统:
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/hadoop/conf
17/03/12 03:14:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
17/03/12 03:14:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:11 /user
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:14 /user/hadoop/conf
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -put etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml /user/hadoop/conf/
17/03/12 03:14:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /
17/03/12 03:15:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:11 /user
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:14 /user/hadoop
drwxr-xr-x - hadoop supergroup 0 2017-03-12 03:14 /user/hadoop/conf
-rw-r--r-- 3 hadoop supergroup 1140 2017-03-12 03:14 /user/hadoop/conf/core-site.xml
-rw-r--r-- 3 hadoop supergroup 2061 2017-03-12 03:14 /user/hadoop/conf/hdfs-site.xml
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
17/03/12 03:16:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.5.0/data/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>7000</value>
</property>
</configuration>
手工方式验证active和standby的相互转换:
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToStandby nn1
17/03/12 03:20:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -transitionToActive nn2
17/03/12 03:20:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 03:21:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 03:21:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/hadoop/conf/core-site.xml
17/03/12 03:22:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.5.0/data/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>7000</value>
</property>
</configuration>
使用zookeeper实现hadoop ha的自动故障转移(failover)功能
hdfs-site.xml file, add:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
core-site.xml file, add:
<property>
<name>ha.zookeeper.quorum</name>
<value>db01:2181,db02:2181,db03:2181,db04:2181,db05:2181</value>
</property>
关闭hdfs集群,并且同步文件:
[hadoop@db01 hadoop-2.5.0]$ sbin/stop-dfs.sh
17/03/12 13:07:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [db01 db02]
db01: stopping namenode
db02: stopping namenode
db01: stopping datanode
db02: stopping datanode
db05: stopping datanode
db04: stopping datanode
db03: stopping datanode
Stopping journal nodes [db01 db02 db03]
db01: stopping journalnode
db03: stopping journalnode
db02: stopping journalnode
17/03/12 13:07:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [db01 db02]
db01: no zkfc to stop
db02: no zkfc to stop
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db02:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1269 1.2KB/s 00:00
hdfs-site.xml 100% 2158 2.1KB/s 00:00
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db03:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1269 1.2KB/s 00:00
hdfs-site.xml 100% 2158 2.1KB/s 00:00
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db04:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1269 1.2KB/s 00:00
hdfs-site.xml 100% 2158 2.1KB/s 00:00
[hadoop@db01 hadoop-2.5.0]$ scp -r etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml hadoop@db05:/usr/local/hadoop-2.5.0/etc/hadoop/
core-site.xml 100% 1269 1.2KB/s 00:00
hdfs-site.xml 100% 2158 2.1KB/s 00:00
启动zookeeper集群:
[hadoop@db01 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db01 hadoop-2.5.0]$ jps
10341 Jps
10319 QuorumPeerMain
[hadoop@db02 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... jpSTARTED
[hadoop@db02 hadoop-2.5.0]$ jps
22296 QuorumPeerMain
22320 Jps
[hadoop@db03 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db03 hadoop-2.5.0]$ jps
17290 QuorumPeerMain
17325 Jps
[hadoop@db04 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db04 hadoop-2.5.0]$ jps
15908 Jps
15877 QuorumPeerMain
[hadoop@db05 hadoop-2.5.0]$ ../zookeeper-3.4.5/bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@db05 hadoop-2.5.0]$ jps
23412 Jps
23379 QuorumPeerMain
hadoop初始化zk:
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs zkfc -formatZK
[zk: localhost:2181(CONNECTED) 3] ls /
[hadoop-ha, zookeeper]
启动hdfs集群:
[hadoop@db01 hadoop-2.5.0]$ sbin/start-dfs.sh
17/03/12 13:19:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [db01 db02]
db01: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db01.out
db02: starting namenode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-namenode-db02.out
db01: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db01.out
db05: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db05.out
db02: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db02.out
db04: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db04.out
db03: starting datanode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-datanode-db03.out
Starting journal nodes [db01 db02 db03]
db02: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db02.out
db01: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db01.out
db03: starting journalnode, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-journalnode-db03.out
17/03/12 13:19:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [db01 db02]
db02: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db02.out
db01: starting zkfc, logging to /usr/local/hadoop-2.5.0/logs/hadoop-hadoop-zkfc-db01.out
[hadoop@db01 hadoop-2.5.0]$ jps
8382 Jps
7931 DataNode
8125 JournalNode
32156 QuorumPeerMain
7816 NameNode
8315 DFSZKFailoverController
测试自动故障转移功能:
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 13:51:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 13:51:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db02 hadoop-2.5.0]$ jps
22296 QuorumPeerMain
22377 NameNode
22458 DataNode
22775 Jps
22691 DFSZKFailoverController
22553 JournalNode
[hadoop@db02 hadoop-2.5.0]$
[hadoop@db02 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:23:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:23:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db02 hadoop-2.5.0]$ kill -9 25121
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:24:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:24:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/12 14:24:51 INFO ipc.Client: Retrying connect to server: db02/192.168.100.232:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
--------------------------------------------------------------------------------
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:28:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:28:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@db01 hadoop-2.5.0]$ jps
16276 Jps
15675 JournalNode
15363 NameNode
15871 DFSZKFailoverController
10319 QuorumPeerMain
15478 DataNode
[hadoop@db01 hadoop-2.5.0]$ kill -9 15363
[hadoop@db01 hadoop-2.5.0]$
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn1
17/03/12 14:28:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/12 14:28:49 INFO ipc.Client: Retrying connect to server: db01/192.168.100.231:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From db01/192.168.100.231 to db01:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[hadoop@db01 hadoop-2.5.0]$ bin/hdfs haadmin -getServiceState nn2
17/03/12 14:28:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
------------------------------------------------------------------------------------------------------
##############################zk ha自动切换可能遇到错误##############################################################
2017-03-12 13:58:54,210 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.net.ConnectException: Call From db01/192.168.100.231 to db02:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1415)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:411)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
at org.apache.hadoop.ipc.Client.call(Client.java:1382)
... 11 more
问题原因:
我的问题找到了 在 HDFS的配置文件中 我的 fencing的选的密钥文件不对,我的是 dsa 不是 rsa的加密类型。改一下就OK 了
##############################zk ha自动切换可能遇到错误##############################################################
hadoop HA架构安装部署(QJM HA)的更多相关文章
- Hadoop分布式HA的安装部署
Hadoop分布式HA的安装部署 前言 单机版的Hadoop环境只有一个namenode,一般namenode出现问题,整个系统也就无法使用,所以高可用主要指的是namenode的高可用,即存在两个n ...
- 第06讲:Flink 集群安装部署和 HA 配置
Flink系列文章 第01讲:Flink 的应用场景和架构模型 第02讲:Flink 入门程序 WordCount 和 SQL 实现 第03讲:Flink 的编程模型与其他框架比较 第04讲:Flin ...
- 2 Hadoop集群安装部署准备
2 Hadoop集群安装部署准备 集群安装前需要考虑的几点硬件选型--CPU.内存.磁盘.网卡等--什么配置?需要多少? 网络规划--1 GB? 10 GB?--网络拓扑? 操作系统选型及基础环境-- ...
- 1.Hadoop集群安装部署
Hadoop集群安装部署 1.介绍 (1)架构模型 (2)使用工具 VMWARE cenos7 Xshell Xftp jdk-8u91-linux-x64.rpm hadoop-2.7.3.tar. ...
- Hadoop2.X HA架构与部署
HDFS-HA原理及配置 1.HDFS-HA架构原理介绍 hadoop2.x之后,Clouera提出了QJM/Qurom Journal Manager,这是一个基于Paxos算法实现的HDFS HA ...
- 新闻实时分析系统-Hadoop2.X HA架构与部署
1.HDFS-HA架构原理介绍 hadoop2.x之后,Clouera提出了QJM/Qurom Journal Manager,这是一个基于Paxos算法实现的HDFS HA方案,它给出了一种较好的解 ...
- 新闻网大数据实时分析可视化系统项目——5、Hadoop2.X HA架构与部署
1.HDFS-HA架构原理介绍 hadoop2.x之后,Clouera提出了QJM/Qurom Journal Manager,这是一个基于Paxos算法实现的HDFS HA方案,它给出了一种较好的解 ...
- Apache Hadoop集群安装(NameNode HA + SPARK + 机架感知)
1.主机规划 序号 主机名 IP地址 角色 1 nn-1 192.168.9.21 NameNode.mr-jobhistory.zookeeper.JournalNode 2 nn-2 ).HA的集 ...
- Apache Hadoop集群安装(NameNode HA + YARN HA + SPARK + 机架感知)
1.主机规划 序号 主机名 IP地址 角色 1 nn-1 192.168.9.21 NameNode.mr-jobhistory.zookeeper.JournalNode 2 nn-2 192.16 ...
随机推荐
- ios开发之--调整UISearchBar的输入框的背景颜色
遍历UISearchBar的子视图,找到输入框坐在的view,添加背景颜色即可. 代码如下: UISearchBar *searchBar = [[UISearchBar alloc] initWit ...
- 删除ORACLE目录OCI.dll文件无法删除 (转)
删除ORACLE目录OCI.dll文件无法删除 今天准备把虚拟机里的10g卸载安装11g来研究一些新特性 卸载没有用自带的UnInstall工具之前看warehouse的讲课视频凭记忆手动卸载了下删除 ...
- passport登录问题:passport.use 方法没有被调用
写passport登录验证时,无论如何passport.use 方法都没有被调用,最后在同事的帮助下,才找到问题: 我是用form提交登陆数据的, input type:"text" ...
- [Windows] Windows 8.1 取消在任务栏显示应用商店的应用
- C++ template —— template metaprogram(九)
metaprogramming含有“对一个程序进行编程”的意思.换句话说,编程系统将会执行我们所写的代码,来生成新的代码,而这些新代码才真正实现了我们所期望的功能.通常而言,metaprogrammi ...
- Lua 正确的尾调用(proper tail call)
Lua支持“尾调用消除(tail-call elimination)”.尾调用(tail call):当一个函数调用是另一个函数的最后一个动作时,该调用才算是一条“尾调用”.例如,下面的代码就是一条“ ...
- cp自动创建层级结构的例子
一个拷贝命令的技巧,不仅拷贝文件,而且拷贝目录结构.记录下来. *拷贝的时候,自动创建参数中源文件的路径:#cp --parents parentdir1/parentdir2/sourcefile ...
- C++中成员初始化列表的使用
C++在类的构造函数中,可以两种方式初始化成员数据(data member). 1,在构造函数的实现中,初始类的成员数据.诸如: class point{private: int x,y;public ...
- 关于丢失或者损坏/etc/fstab文件后的一些探讨
1.模仿,假设不小心删除了/etc/fstab文件:大家都知道,Linux系统启动的时候会读取该文件来挂载分区,如果缺失该文件,系统一般不能正常启动. 2.采用reboot命令或者alt+ctrl+d ...
- PHP中常见问题总结
[1]页面之间无法传递变量 get,post,session在最新的php版本中自动全局变量是关闭的,所以要从上一页面取得提交过来得变量要使用$_GET['foo'],$_POST['foo'],$_ ...