HBase reports Dead Region Servers
Problem description:
Port 16010 (the HMaster web UI) started successfully, but port 16020 (the RegionServer) did not come up.
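A quick way to confirm the symptom (a minimal sketch; hbase2 is the host name taken from the logs below, adjust it to your node) is to check whether the HRegionServer process exists and whether anything is listening on 16020:
# is the RegionServer JVM running on this node?
jps | grep HRegionServer
# is anything listening on the RegionServer RPC port?
netstat -tlnp | grep 16020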
The hbase-root-regionserver-hbase2.log file shows:
2019-08-14 16:45:10,552 WARN [Thread-37] hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/default/tsdb/3f2398c5b49b581c09687c49a739b007/recovered.edits/0000000000006253152-hbase2%2C16020%2C1562822459462.1565198820284.temp could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2121)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:295)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2702)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1413)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy18.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:372)
at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1603)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1388)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:554)
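The underlying HDFS error ("could only be written to 0 of the 1 minReplication nodes ... 1 node(s) are excluded") means the NameNode could not place the new block on any usable DataNode, so it is worth confirming that HDFS itself is healthy before looking at HBase. A minimal sketch (run on the NameNode host):
# how many DataNodes are live, and how much space do they report?
hdfs dfsadmin -report
# check the HBase root directory for missing or corrupt blocks
hdfs fsck /hbase -files -blocks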
2019-08-14 16:45:10,568 ERROR [RS_LOG_REPLAY_OPS-regionserver/hbase2:16020-1-Writer-2] wal.WALSplitter: Got while writing log entry to log
java.io.IOException: File /hbase/default/tsdb/3f2398c5b49b581c09687c49a739b007/recovered.edits/0000000000006253152-hbase2%2C16020%2C1562822459462.1565198820284.temp could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2121)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:295)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2702)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.appendBuffer(WALSplitter.java:1601)
at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1559)
at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1084)
at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1076)
at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1046)
Log on the master (port 16010): (excerpt not included)
Cause: see https://issues.apache.org/jira/browse/HBASE-12426, whose description matches this situation:
Description (quoted from HBASE-12426):
I initially had a set of 5 region servers which had a single table which was pre split into 30 regions and was evenly distributed to all the regions with data. I then went ahead and removed/decommissioned a couple of region servers, so in the end I have 3 region servers. Ran hbase hbck and verified there were 0 inconsistencies. However when the 'status' command is issued from hbase shell it shows a dead region server and the same is displayed in the master UI as well. Failover of the hbase master did not fix the issue. On investigation we could see some WAL entries which were still pointing to the old region server:
/hbase/WALs/myserver,60020,1406745344969-splitting
After removing these orphan entries from HDFS and a master failover, the dead region servers went away. I wonder if this could have caused any replication issues in the cluster.
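Before deleting anything, you can check whether such orphan -splitting directories exist in your cluster (a minimal sketch; the path assumes the default hbase.rootdir of /hbase):
# list WAL directories left behind by an incomplete log split
hdfs dfs -ls /hbase/WALs | grep -- '-splitting'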
Fix: remove the leftover split files under /hbase/WALs and restart HBase. Note that any edits still sitting in these WALs that were never flushed to HFiles will be lost.
# inspect the WAL directory
hdfs dfs -ls -R /hbase/WALs
# remove the leftover entries
hdfs dfs -rm -R /hbase/WALs/*
# restart HBase
stop-hbase.sh
start-hbase.sh
Verification on the master web UI (port 16010): the Dead Region Servers list should now be empty (screenshot omitted).
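The same can be checked from the command line (a sketch; assumes the HBase binaries are on the PATH):
# the RegionServer process and port 16020 should be back
jps | grep HRegionServer
netstat -tlnp | grep 16020
# 'status' in the HBase shell should report 0 dead servers
echo "status" | hbase shell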