After start-up, the DataNode and NodeManager processes come up normally on the slave nodes, but a few seconds later the DataNode shuts itself down.

Using the error log on slave1 as an example, inspect the error:

  more /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.log
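If the log is long, filtering for warnings and errors (or for the word "clusterID") takes you straight to the failure. A minimal sketch, using the same log path as above:

  # Show only warning/error lines from the DataNode log
  grep -E "WARN|ERROR|FATAL" /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.log | tail -n 40

  # Or jump directly to the clusterID mismatch, if that is the cause
  grep "clusterID" /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.log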

The relevant error entries are:

  -- ::, WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/opt/hadoop-2.9.0/dfs/data/
  java.io.IOException: Incompatible clusterIDs in /opt/hadoop-2.9.0/dfs/data: namenode clusterID = CID-f1195fc7-ca7c-4a2a-b32f-211131a5d699; datanode clusterID = CID-292293a6-9c34-4de7-aecd-d72657a26dd5
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
          at java.lang.Thread.run(Thread.java:)
  -- ::, ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master/192.168.0.120:. Exiting.
  java.io.IOException: All specified directories have failed to load.
          at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:)
          at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:)
          at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:)
          at java.lang.Thread.run(Thread.java:)
  -- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed) service to master/192.168.0.120:
  -- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid f4badff3-7a0b-4db0-bd77-83b370f67eed)
  -- ::, WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
  -- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down DataNode at slave1/192.168.0.121
  ************************************************************/
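The two clusterIDs in the exception come from the VERSION files that HDFS keeps in its storage directories, so the mismatch can be confirmed directly. A small check, assuming the dfs/name and dfs/data locations under /opt/hadoop-2.9.0 used in this setup:

  # On master: the clusterID the namenode is currently using
  grep clusterID /opt/hadoop-2.9.0/dfs/name/current/VERSION

  # On each slave: the clusterID the datanode still carries
  grep clusterID /opt/hadoop-2.9.0/dfs/data/current/VERSION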

Solution

Cause: the namenode was formatted more than once. Each format generates a new namenode clusterID, while the datanodes still hold the old clusterID in their data directories, so they refuse to register (the "Incompatible clusterIDs" exception above).
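The fix below wipes the HDFS directories and reformats, which is the simplest option on a fresh cluster but destroys everything stored in HDFS. If there is data worth keeping, an alternative (not what this post does) is to copy the namenode's clusterID into each datanode's VERSION file and restart only the datanodes; a rough sketch, assuming the directory layout shown above:

  # Run on each slave node; NN_CID is the namenode clusterID taken from the exception above
  NN_CID=CID-f1195fc7-ca7c-4a2a-b32f-211131a5d699
  sed -i "s/^clusterID=.*/clusterID=${NN_CID}/" /opt/hadoop-2.9.0/dfs/data/current/VERSION
  /opt/hadoop-2.9.0/sbin/hadoop-daemon.sh start datanode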

1) On master, run sbin/stop-all.sh to shut down Hadoop:

  cd /opt/hadoop-2.9.0
  sbin/stop-all.sh
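stop-all.sh still works in 2.9.0 but is just a deprecated wrapper (start-all.sh prints the same kind of warning later in this post); the per-subsystem scripts do the same job:

  # Equivalent to stop-all.sh
  sbin/stop-dfs.sh
  sbin/stop-yarn.sh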

2) On master, slave1, slave2, and slave3, run the following commands in turn (a looped version over ssh is sketched after the list):

  cd /opt/hadoop-2.9.0
  rm -r dfs
  rm -r logs
  rm -r tmp
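If the spark user has passwordless ssh from master to every node (a normal prerequisite for this kind of cluster), the cleanup can be driven from master in one loop; a minimal sketch using the hostnames from this post:

  for host in master slave1 slave2 slave3; do
    # -f keeps the loop going even if a directory is already missing (the post uses plain rm -r)
    ssh "$host" "cd /opt/hadoop-2.9.0 && rm -rf dfs logs tmp"
  done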

3) On master, reformat HDFS and restart Hadoop:

  cd /opt/hadoop-2.9.0            # enter the hadoop directory
  bin/hadoop namenode -format     # format the namenode
  sbin/start-all.sh               # start the HDFS and YARN daemons

The output of the format command and then of start-all.sh looks like this:

  [spark@master hadoop-2.9.0]$ cd /opt/hadoop-2.9.0
  [spark@master hadoop-2.9.0]$ bin/hadoop namenode -format
  DEPRECATED: Use of this script to execute hdfs command is deprecated.
  Instead use the hdfs command for it.

  // :: INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG: host = master/192.168.0.120
  STARTUP_MSG: args = [-format]
  STARTUP_MSG: version = 2.9.0
  STARTUP_MSG: classpath = /opt/hadoop-2.9.0/etc/hadoop:/opt/hadoop-2.9.0/share/hadoop/common/lib/nimbus-jose-jwt-3.9.jar:/opt/hadoop-2.9.0/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/opt/hadoop-...
  STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50; compiled by 'arsuresh' on 2017-11-13T23:15Z
  STARTUP_MSG: java = 1.8.0_171
  ************************************************************/
  // :: INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
  // :: INFO namenode.NameNode: createNameNode [-format]
  Formatting using clusterid: CID-d4e2f108-de3c--9eeb-abbbb1024fe8
  // :: INFO namenode.FSEditLog: Edit logging is async:true
  // :: INFO namenode.FSNamesystem: KeyProvider: null
  // :: INFO namenode.FSNamesystem: fsLock is fair: true
  // :: INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
  // :: INFO namenode.FSNamesystem: fsOwner = spark (auth:SIMPLE)
  // :: INFO namenode.FSNamesystem: supergroup = supergroup
  // :: INFO namenode.FSNamesystem: isPermissionEnabled = true
  // :: INFO namenode.FSNamesystem: HA Enabled: false
  // :: INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to . Disabling file IO profiling
  // :: INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=, counted=, effected=
  // :: INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
  // :: INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to :::00.000
  // :: INFO blockmanagement.BlockManager: The block deletion will start around Jun ::
  // :: INFO util.GSet: Computing capacity for map BlocksMap
  // :: INFO util.GSet: VM type = -bit
  // :: INFO util.GSet: 2.0% max memory MB = 17.8 MB
  // :: INFO util.GSet: capacity = ^ = entries
  // :: INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
  // :: WARN conf.Configuration: No unit for dfs.namenode.safemode.extension() assuming MILLISECONDS
  // :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
  // :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes =
  // :: INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension =
  // :: INFO blockmanagement.BlockManager: defaultReplication =
  // :: INFO blockmanagement.BlockManager: maxReplication =
  // :: INFO blockmanagement.BlockManager: minReplication =
  // :: INFO blockmanagement.BlockManager: maxReplicationStreams =
  // :: INFO blockmanagement.BlockManager: replicationRecheckInterval =
  // :: INFO blockmanagement.BlockManager: encryptDataTransfer = false
  // :: INFO blockmanagement.BlockManager: maxNumBlocksToLog =
  // :: INFO namenode.FSNamesystem: Append Enabled: true
  // :: INFO util.GSet: Computing capacity for map INodeMap
  // :: INFO util.GSet: VM type = -bit
  // :: INFO util.GSet: 1.0% max memory MB = 8.9 MB
  // :: INFO util.GSet: capacity = ^ = entries
  // :: INFO namenode.FSDirectory: ACLs enabled? false
  // :: INFO namenode.FSDirectory: XAttrs enabled? true
  // :: INFO namenode.NameNode: Caching file names occurring more than times
  // :: INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
  // :: INFO util.GSet: Computing capacity for map cachedBlocks
  // :: INFO util.GSet: VM type = -bit
  // :: INFO util.GSet: 0.25% max memory MB = 2.2 MB
  // :: INFO util.GSet: capacity = ^ = entries
  // :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets =
  // :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users =
  // :: INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = ,,
  // :: INFO namenode.FSNamesystem: Retry cache on namenode is enabled
  // :: INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is millis
  // :: INFO util.GSet: Computing capacity for map NameNodeRetryCache
  // :: INFO util.GSet: VM type = -bit
  // :: INFO util.GSet: 0.029999999329447746% max memory MB = 273.1 KB
  // :: INFO util.GSet: capacity = ^ = entries
  // :: INFO namenode.FSImage: Allocated new BlockPoolId: BP--192.168.0.120-
  // :: INFO common.Storage: Storage directory /opt/hadoop-2.9.0/dfs/name has been successfully formatted.
  // :: INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hadoop-2.9.0/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
  // :: INFO namenode.FSImageFormatProtobuf: Image file /opt/hadoop-2.9.0/dfs/name/current/fsimage.ckpt_0000000000000000000 of size bytes saved in seconds.
  // :: INFO namenode.NNStorageRetentionManager: Going to retain images with txid >=
  // :: INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at master/192.168.0.120
  ************************************************************/
  [spark@master hadoop-2.9.0]$ sbin/start-all.sh     # start the HDFS and YARN daemons
  This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
  Starting namenodes on [master]
  master: starting namenode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-namenode-master.out
  slave1: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave1.out
  slave3: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave3.out
  slave2: starting datanode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-datanode-slave2.out
  Starting secondary namenodes [master]
  master: starting secondarynamenode, logging to /opt/hadoop-2.9.0/logs/hadoop-spark-secondarynamenode-master.out
  starting yarn daemons
  starting resourcemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-resourcemanager-master.out
  slave2: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave2.out
  slave3: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave3.out
  slave1: starting nodemanager, logging to /opt/hadoop-2.9.0/logs/yarn-spark-nodemanager-slave1.out
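Both deprecation warnings in the transcript above can be followed; the same step with the commands Hadoop 2.x recommends would look like this (a sketch, same paths as above):

  cd /opt/hadoop-2.9.0
  bin/hdfs namenode -format      # replaces "bin/hadoop namenode -format"
  sbin/start-dfs.sh              # replaces the HDFS half of start-all.sh
  sbin/start-yarn.sh             # replaces the YARN half of start-all.sh

The deprecated forms still do the same thing here; the newer ones just avoid the warnings.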

4) After about 30 seconds, check whether the processes on master, slave1, slave2, and slave3 started successfully.

Check whether master started successfully:

  [spark@master hadoop-2.9.0]$ jps
  Jps
  ResourceManager
  NameNode
  SecondaryNameNode
  [spark@master hadoop-2.9.0]$

On slave1, slave2, and slave3, run jps and confirm that both the DataNode and NodeManager processes are running.
Taking slave1 as an example:

  [spark@slave1 hadoop-2.9.0]$ jps
  Jps
  NodeManager
  DataNode
  [spark@slave1 hadoop-2.9.0]$
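jps only shows that the JVMs are alive. To confirm that the datanodes actually registered with the namenode this time (i.e. the clusterID problem is really gone), you can also ask HDFS itself from master; a small check:

  cd /opt/hadoop-2.9.0
  # The report should list all three slaves as live datanodes
  bin/hdfs dfsadmin -report | grep -E "Live datanodes|Name:"

The NameNode web UI (port 50070 by default in Hadoop 2.x) shows the same information.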

Reference: https://blog.csdn.net/magggggic/article/details/52503502
