A Summary of Hadoop NameNode Formatting Issues

(continuously updated)

0 Hadoop Cluster Environment

Three RHEL 6.4 machines: 2 NameNodes + 2 ZKFCs, plus 3 JournalNodes + ZooKeeper servers, forming the simplest possible HA cluster.
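
For reference, a plausible daemon layout across the three nodes (the hostnames come from the configs below; putting a DataNode on every node is my assumption, since the post does not pin that down):

hacl-node1.pepstack.com : NameNode (hn1) + ZKFC + JournalNode + ZooKeeper [+ DataNode]
hacl-node2.pepstack.com : NameNode (hn2) + ZKFC + JournalNode + ZooKeeper [+ DataNode]
hacl-node3.pepstack.com : JournalNode + ZooKeeper [+ DataNode]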

1) hdfs-site.xml is configured as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Quorum Journal Manager HA:
  http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
-->
<configuration>
    <!-- Quorum Journal Manager HA -->
    <property>
        <name>dfs.nameservices</name>
        <value>hacl</value>
        <description>the logical name for this new nameservice.</description>
    </property>

    <property>
        <name>dfs.ha.namenodes.hacl</name>
        <value>hn1,hn2</value>
        <description>Configure with a list of comma-separated NameNode IDs.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.http-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
        <description>the URI which identifies the group of JNs where the NameNodes will write or read edits.</description>
    </property>

    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/hacl/data/dfs/jn</value>
        <description>the path where the JournalNode daemon will store its local state.</description>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.hacl</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>the Java class that HDFS clients use to contact the Active NameNode.</description>
    </property>

    <!-- Automatic failover adds two new components to an HDFS deployment:
        - a ZooKeeper quorum;
        - the ZKFailoverController process (abbreviated as ZKFC).
        Configuring automatic failover:
    -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
        <description>The sshfence option SSHes to the target node and uses fuser to kill the process
          listening on the service's TCP port. In order for this fencing option to work, it must be
          able to SSH to the target node without providing a passphrase. Thus, one must also configure the
          dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files.
             logon namenode machine:
             cd /var/lib/hadoop-hdfs
             su hdfs
             ssh-keygen -t dsa
        </description>
    </property>
    <!-- Optionally, one may configure a non-standard username or port to perform the SSH.
      One may also configure a timeout, in milliseconds, for the SSH, after which this
      fencing method will be considered to have failed. It may be configured like so:
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    -->

    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>dfs.ha.automatic-failover.enabled.hacl</name>
        <value>true</value>
    </property>

    <!-- Configurations for NameNode: -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/hacl/data/dfs/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
    </property>

    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description>Default block replication.</description>
    </property>

    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
    </property>

    <!-- Configurations for DataNode: -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hacl/data/dfs/dn</value>
        <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
    </property>
</configuration>
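
As the fencing description above says, sshfence only works if the hdfs user can SSH from either NameNode to the other without a passphrase, and fuser must be present on the target node. A minimal check, assuming ssh-copy-id is available (run on each NameNode; the peer hostname comes from the config above):

# su - hdfs
$ ssh-copy-id -i ~/.ssh/id_dsa.pub hdfs@hacl-node2.pepstack.com   ## push the public key to the peer
$ ssh -i ~/.ssh/id_dsa hdfs@hacl-node2.pepstack.com hostname      ## must succeed without a password prompt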

2) core-site.xml is configured as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hacl</value>
    </property>

    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hdfs/data/tmp</value>
        <description>chown -R hdfs:hdfs hadoop_tmp_dir</description>
    </property>

    <!-- Configuring automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
        <description>This lists the host-port pairs running the ZooKeeper service.</description>
    </property>

    <!-- Securing access to ZooKeeper -->

</configuration>
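
Since automatic failover depends on ZooKeeper, it is worth confirming that every server in ha.zookeeper.quorum answers before going any further. A quick check with ZooKeeper's standard "ruok" four-letter command (using nc is my assumption; any ZooKeeper client works as well):

# echo ruok | nc hacl-node1.pepstack.com 2181   ## a healthy server replies "imok"
# echo ruok | nc hacl-node2.pepstack.com 2181
# echo ruok | nc hacl-node3.pepstack.com 2181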

1. The NameNode formatting procedure is as follows:

1) Start all JournalNodes; the JNs on all three nodes must start correctly. Stop all NameNodes:

# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop
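
Before formatting, it pays to confirm that the JournalNode on each of the three nodes really is up and listening on the port named in dfs.namenode.shared.edits.dir:

# jps | grep JournalNode      ## one JournalNode process per node
# netstat -lntp | grep 8485   ## the JN RPC port from the config above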

2) Format the NameNode. hacl-pepstack-com is just the name I gave the cluster and can be ignored. su - hdfs -c "..." runs the format as the hdfs user.

All directories specified in hdfs-site.xml and core-site.xml must be given the correct ownership first:

# chown -R hdfs:hdfs /hacl/data/dfs
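
If the directories do not exist yet, a minimal sketch for creating them on every node first (all paths are taken from the two config files above; your packages may have created some of them already):

# mkdir -p /hacl/data/dfs/nn /hacl/data/dfs/jn /hacl/data/dfs/dn
# mkdir -p /hdfs/data/tmp
# chown -R hdfs:hdfs /hacl/data/dfs /hdfs/data/tmp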

Then format on either NameNode, for example on hn1:

########## hn1
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start   ##### hn1
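
A successful format can be sanity-checked by inspecting the freshly created name directory on hn1 (path from dfs.namenode.name.dir):

# ls /hacl/data/dfs/nn/current                       ## should contain fsimage*, VERSION, seen_txid
# grep clusterID /hacl/data/dfs/nn/current/VERSION   ## should show hacl-pepstack-com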

The freshly formatted hn1 must be started first; then run the following on the other NameNode (hn2):

########## hn2
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start   ##### hn2
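
Once both NameNodes are running, their HA state can be queried with hdfs haadmin (hn1/hn2 are the IDs from dfs.ha.namenodes.hacl). With automatic failover enabled, both may report standby until the ZKFCs are up (see the caveat below):

# su - hdfs -c "hdfs haadmin -getServiceState hn1"
# su - hdfs -c "hdfs haadmin -getServiceState hn2"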

At this point, both NameNodes have been formatted and started.
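
One caveat: because dfs.ha.automatic-failover.enabled is true, the ZKFC state in ZooKeeper must also be initialized and the zkfc daemons started before an Active NameNode is elected. A hedged sketch (the hadoop-hdfs-zkfc service name is my assumption, following the same packaging convention as the other services above):

########## on either namenode, with the ZooKeeper quorum running
# su - hdfs -c "hdfs zkfc -formatZK"
########## then on both hn1 and hn2
# service hadoop-hdfs-zkfc start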
