hadoop集群崩溃,因为tmp下/tmp/hadoop-hadoop/dfs/name文件误删除
hadoop执行start-all后,显示正常启动。
starting namenode, logging to /opt/hadoop-0.20.2-cdh3u0/logs/hadoop-hadoop-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /opt/hadoop-0.20.2-cdh3u0/bin/../logs/hadoop-hadoop-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /opt/hadoop-0.20.2-cdh3u0/bin/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /opt/hadoop-0.20.2-cdh3u0/logs/hadoop-hadoop-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /opt/hadoop-0.20.2-cdh3u0/bin/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
但却不能使用,执行hadoop命令显示
13/07/19 14:23:29 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
13/07/19 14:23:30 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
13/07/19 14:23:31 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
13/07/19 14:23:32 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
13/07/19 14:23:33 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
13/07/19 14:23:34 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
13/07/19 14:23:36 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
13/07/19 14:23:37 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
13/07/19 14:23:38 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
13/07/19 14:23:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
jps发现只有
11885 Jps
11456 DataNode
11586 SecondaryNameNode
说明namenode没有启动,
用ps -aux和ps -e查了相关进程,没有什么能看出来
去看logs里,tail -1000 hadoop-hadoop-datanode-localhost.localdomain.log,内容显示的都是连接不上。
hadoop-hadoop-namenode-localhost.localdomain.log中,
2013-07-19 14:14:18,083 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.localdomain/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2-cdh3u0
STARTUP_MSG: build = -r 81256ad0f2e4ab2bd34b04f53d25a6c23686dd14; compiled by 'hudson' on Fri Mar 25 19:56:23 PDT 2011
************************************************************/
2013-07-19 14:14:18,249 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2013-07-19 14:14:18,252 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2013-07-19 14:14:18,267 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 64-bit
2013-07-19 14:14:18,268 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
2013-07-19 14:14:18,268 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^21 = 2097152 entries
2013-07-19 14:14:18,268 INFO org.apache.hadoop.hdfs.util.GSet: recommended=2097152, actual=2097152
2013-07-19 14:14:18,284 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop
2013-07-19 14:14:18,284 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2013-07-19 14:14:18,284 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2013-07-19 14:14:18,288 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=1000
2013-07-19 14:14:18,288 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2013-07-19 14:14:18,437 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2013-07-19 14:14:18,460 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name does not exist.
2013-07-19 14:14:18,462 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:305)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:347)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:321)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:267)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:461)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1208)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1217)
2013-07-19 14:14:18,463 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-hadoop/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:305)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:347)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:321)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:267)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:461)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1208)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1217)2013-07-19 14:14:18,463 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
原来在tmp下的name文件,被我非法关机后给弄丢了。网上说的hadoop崩溃恢复方法是secondnamenode及namenode分开在不同两台机器运行,可以在集群崩溃时=从secondnamenode恢复数据,但我的不行了,就只能hadoop namenode -format了.
如果你secondnamenode没问题,可以用如下方法恢复
1. 删除 namenode主节点的metadata配置目录
rm -fr /data/hadoop-tmp/hadoop-hadoop/dfs/name
2. 启动secondnamenode
使用start-all.sh命令启动secondnamenode,namenode的启动不了不管
3. 从secondnamenode恢复
使用命令: hadoop namenode -importCheckpoint
hadoop集群崩溃,因为tmp下/tmp/hadoop-hadoop/dfs/name文件误删除的更多相关文章
- hadoop集群篇--从0到1搭建hadoop集群
一.前述 本来有套好好的集群,可是不知道为什么虚拟机镜像文件损坏,结果导致集群不能用.所以不得不重新搭套集群,借此机会顺便再重新搭套吧,顺便提醒一句大家,自己虚拟机的集群一定要及时做好快照,最好装完每 ...
- 运行基准测试hadoop集群中的问题:org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /benchmarks/TestDFSIO/io_data/test_
在master(即:host2)中执行 hadoop jar hadoop-test-1.1.2.jar DFSCIOTest -write -nrFiles 12 -fileSize 10240 - ...
- Eclipse/MyEclipse连接Hadoop集群出现:Unable to ... ... org.apache.hadoop.security.AccessControlExceptiom:Permission denied问题
问题详细如下: 解决办法: <property> <name>dfs.premissions</name> <value>false</value ...
- Linux环境下Hadoop集群搭建
Linux环境下Hadoop集群搭建 前言: 最近来到了武汉大学,在这里开始了我的研究生生涯.昨天通过学长们的耐心培训,了解了Hadoop,Hdfs,Hive,Hbase,MangoDB等等相关的知识 ...
- 关于hadoop集群下Datanode和Namenode无法访问的解决方案
HDFS架构 HDFS也是按照Master和Slave的结构,分namenode,secondarynamenode,datanode这几个角色. Namenode:是maseter节点,是大领导.管 ...
- 【Big Data】HADOOP集群的配置(一)
Hadoop集群的配置(一) 摘要: hadoop集群配置系列文档,是笔者在实验室真机环境实验后整理而得.以便随后工作所需,做以知识整理,另则与博客园朋友分享实验成果,因为笔者在学习初期,也遇到不少问 ...
- Hadoop集群datanode磁盘不均衡的解决方案
一.引言: Hadoop的HDFS集群非常容易出现机器与机器之间磁盘利用率不平衡的情况,比如集群中添加新的数据节点,节点与节点之间磁盘大小不一样等等.当hdfs出现不平衡状况的时候,将引发很多问题,比 ...
- Hadoop集群安装配置教程_Hadoop2.6.0_Ubuntu/CentOS
摘自:http://www.powerxing.com/install-hadoop-cluster/ 本教程讲述如何配置 Hadoop 集群,默认读者已经掌握了 Hadoop 的单机伪分布式配置,否 ...
- hadoop集群环境的搭建
hadoop集群环境的搭建 今天终于把hadoop集群环境给搭建起来了,能够运行单词统计的示例程序了. 集群信息如下: 主机名 Hadoop角色 Hadoop jps命令结果 Hadoop用户 Had ...
随机推荐
- Linux上rpm实战搭建FTP服务器
1.检测是否已安装FTP服务 # rpm -qa|grep vsftpd 2.未安装ftp服务的前提进行使用rpm安装 # yum install vsftpd -y Loaded plugins: ...
- 日历类和日期类转换 并发修改异常 泛型的好处 *各种排序 成员和局部变量 接口和抽象类 多态 new对象内存中的变化
day07 ==和equals的区别? ==用于比较两个数值 或者地址值是否相同. equals 用于比较两个对象的内容是否相同 String,StringBuffer.StringBuilde ...
- C++笔记007:易犯错误模型——类中为什么需要成员函数
先看源码,在VS2010环境下无法编译通过,在VS2013环境下可以编译通过,并且可以运行,只是运行结果并不是我们期待的结果. 最初通过MyCircle类定义对象c1时,为对象分配内存空间,r没有初始 ...
- JSON 封装函数
var eventUtil = { addHandler:function(element,type,handler) { //添加句柄 if(element.addEventListener) { ...
- Linux(十七)动态监控进程
17.1 介绍 top与ps命令很相似.它们都用来显示正在执行的进程.top与ps最大的不同之处,在于top在执行一段时间可以更新正在运行的进程 17.2 语法 top [选项] 常用选项 ...
- Bootstrap3 排版-页面主体
Bootstrap 将全局 font-size 设置为 14px,line-height 设置为 1.428.这些属性直接赋予 元素和所有段落元素.另外,<p> (段落)元素还被设置了等于 ...
- Android音频焦点处理相关的方法
有这么一种场景:你打开qq音乐.优酷客户端.视频播放的时候.这个时候突然来电显示了,此时所有的MediaPlayer相关的服务或者响应都进入"休眠"状态.那么,这个功能是怎么实现的 ...
- ROS新闻 Towards ROS-native drones 无人机支持方案
PX4/Firmware:https://github.com/PX4/Firmware PXFmini An open autopilot daughter-board for the Raspbe ...
- Dynamics CRM The difference between UserId and InitiatingUserId in Plugin
对于这两者的不同,MSDN的解释如下 • IExecutionContext.UserId Property: Gets the global unique identifier of the sys ...
- [nginx]统计文件下载是否完整思路(flask)
有一个需求是统计文件是否被用户完整下载,因为是web应用,用js没有找到实现方案,于是搜索下nginx的实现方案,把简单的探索过程记录下. 实验一 最原始的思路,查看日志,下载了一个文件之后我们看日志 ...