A Summary of Hadoop NameNode Formatting Issues

(continuously updated)

0 Hadoop Cluster Environment

Three RHEL 6.4 machines: 2 NameNodes + 2 ZKFCs, plus 3 JournalNodes + ZooKeeper servers, forming the simplest possible HA cluster.
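
For reference, a plausible daemon layout across the three nodes (the hostnames come from the configs below; putting a DataNode on every node is my assumption, since the post does not pin that down):

hacl-node1.pepstack.com : NameNode (hn1) + ZKFC + JournalNode + ZooKeeper [+ DataNode]
hacl-node2.pepstack.com : NameNode (hn2) + ZKFC + JournalNode + ZooKeeper [+ DataNode]
hacl-node3.pepstack.com : JournalNode + ZooKeeper [+ DataNode]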

1) hdfs-site.xml is configured as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Quorum Journal Manager HA:
  http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
-->
<configuration>
    <!-- Quorum Journal Manager HA -->
    <property>
        <name>dfs.nameservices</name>
        <value>hacl</value>
        <description>the logical name for this new nameservice.</description>
    </property>

    <property>
        <name>dfs.ha.namenodes.hacl</name>
        <value>hn1,hn2</value>
        <description>Configure with a list of comma-separated NameNode IDs.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
        <description>the URI which identifies the group of JNs where the NameNodes will write or read edits.</description>
    </property>

    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/hacl/data/dfs/jn</value>
        <description>the path where the JournalNode daemon will store its local state.</description>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.hacl</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>the Java class that HDFS clients use to contact the Active NameNode.</description>
    </property>

    <!-- Automatic failover adds two new components to an HDFS deployment:
        - a ZooKeeper quorum;
        - the ZKFailoverController process (abbreviated as ZKFC).
        Configuring automatic failover:
    -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
        <description>The sshfence option SSHes to the target node and uses fuser to kill the process
          listening on the service's TCP port. In order for this fencing option to work, it must be
          able to SSH to the target node without providing a passphrase. Thus, one must also configure the
          dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files.
             logon namenode machine:
             cd /var/lib/hadoop-hdfs
             su hdfs
             ssh-keygen -t dsa
        </description>
    </property>
    <!-- Optionally, one may configure a non-standard username or port to perform the SSH.
      One may also configure a timeout, in milliseconds, for the SSH, after which this
      fencing method will be considered to have failed. It may be configured like so:
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    -->

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled.hacl</name>
        <value>true</value>
    </property>

    <!-- Configurations for NameNode: -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/hacl/data/dfs/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
    </property>

    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description>Default block replication.</description>
    </property>

    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
    </property>

    <!-- Configurations for DataNode: -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hacl/data/dfs/dn</value>
        <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
    </property>
</configuration>
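
As the fencing description above says, sshfence only works if the hdfs user can SSH from either NameNode to the other without a passphrase, and fuser must be present on the target node. A minimal check, assuming ssh-copy-id is available (run on each NameNode; the peer hostname comes from the config above):

# su - hdfs
$ ssh-copy-id -i ~/.ssh/id_dsa.pub hdfs@hacl-node2.pepstack.com   ## push the public key to the peer
$ ssh -i ~/.ssh/id_dsa hdfs@hacl-node2.pepstack.com hostname      ## must succeed without a password prompt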

2) core-site.xml is configured as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hacl</value>
    </property>

    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hdfs/data/tmp</value>
        <description>chown -R hdfs:hdfs hadoop_tmp_dir</description>
    </property>

    <!-- Configuring automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
        <description>This lists the host-port pairs running the ZooKeeper service.</description>
    </property>

    <!-- Securing access to ZooKeeper -->

</configuration>
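
Since automatic failover depends on ZooKeeper, it is worth confirming that every server in ha.zookeeper.quorum answers before going any further. A quick check with ZooKeeper's standard "ruok" four-letter command (using nc is my assumption; any ZooKeeper client works as well):

# echo ruok | nc hacl-node1.pepstack.com 2181   ## a healthy server replies "imok"
# echo ruok | nc hacl-node2.pepstack.com 2181
# echo ruok | nc hacl-node3.pepstack.com 2181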

1. The NameNode formatting procedure is as follows:

1) Start all JournalNodes; the JNs on all three nodes must start correctly. Stop all NameNodes:

# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop
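
Before formatting, it pays to confirm that the JournalNode on each of the three nodes really is up and listening on the port named in dfs.namenode.shared.edits.dir:

# jps | grep JournalNode      ## one JournalNode process per node
# netstat -lntp | grep 8485   ## the JN RPC port from the config above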

2) Format the NameNode. hacl-pepstack-com is just the name I gave the cluster and can be ignored. su - hdfs -c "..." runs the format as the hdfs user.

All directories specified in hdfs-site.xml and core-site.xml must be given the correct ownership first:

# chown -R hdfs:hdfs /hacl/data/dfs
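
If the directories do not exist yet, a minimal sketch for creating them on every node first (all paths are taken from the two config files above; your packages may have created some of them already):

# mkdir -p /hacl/data/dfs/nn /hacl/data/dfs/jn /hacl/data/dfs/dn
# mkdir -p /hdfs/data/tmp
# chown -R hdfs:hdfs /hacl/data/dfs /hdfs/data/tmp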

Then format on either NameNode, for example on hn1:

########## hn1
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start   ##### hn1
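
A successful format can be sanity-checked by inspecting the freshly created name directory on hn1 (path from dfs.namenode.name.dir):

# ls /hacl/data/dfs/nn/current                       ## should contain fsimage*, VERSION, seen_txid
# grep clusterID /hacl/data/dfs/nn/current/VERSION   ## should show hacl-pepstack-com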

The freshly formatted hn1 must be started first; then run the following on the other NameNode (hn2):

########## hn2
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start   ##### hn2
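
Once both NameNodes are running, their HA state can be queried with hdfs haadmin (hn1/hn2 are the IDs from dfs.ha.namenodes.hacl). With automatic failover enabled, both may report standby until the ZKFCs are up (see the caveat below):

# su - hdfs -c "hdfs haadmin -getServiceState hn1"
# su - hdfs -c "hdfs haadmin -getServiceState hn2"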

At this point, both NameNodes have been formatted and started.
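
One caveat: because dfs.ha.automatic-failover.enabled is true, the ZKFC state in ZooKeeper must also be initialized and the zkfc daemons started before an Active NameNode is elected. A hedged sketch (the hadoop-hdfs-zkfc service name is my assumption, following the same packaging convention as the other services above):

########## on either namenode, with the ZooKeeper quorum running
# su - hdfs -c "hdfs zkfc -formatZK"
########## then on both hn1 and hn2
# service hadoop-hdfs-zkfc start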
