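The HBase slave cluster has 8 regionserver machines and had been running stably for more than five months. On August 15 we found that 4 datanode processes in the cluster had died. Investigation showed the cause was an OutOfMemory error (Spark was deployed on those same machines, with -Xmx set to 32g for Spark). We then recovered the slave cluster and backfilled the missing data, which put a fairly heavy write load on it. After a few more days of running, we found that the slave cluster could no longer accept writes.
①. regionserver side
            regionserver symptom 1: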
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region table_version,hour_search_860010-1118000000_2014010418,1403685954922.640fc829f767a4e33e296fc4f4cca4a4. after a delay of 13125
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_hotstatic,860010-0507010000_2014071711_0_entry_00000008749,1406860400351.bcb13556daad6bda72b3c84df5ec912e. after a delay of 10066
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_screen,860010-2288050100_2014030419_0_00000000920,1402321410433.da4ff8fe84325e7da075b0fba1f3c3c9. after a delay of 11767
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_hotstatic,860010-1119060300_2014040422_0_bounce_ratio_00000000867,1402022490696.4fcfd303cff4211de61ff55f77d46317. after a delay of 10256
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_url,860010-0204020100_2014010607_0_8c54e33efae9da957548659c5b96f04e,1403329534827.b1c3733f5a8deade785bd71ee8660268. after a delay of 16628
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_hotstatic,860010-0335010000_2014041011_0_exit_00000000000,1399606854480.b1f83e693e0fdb18e168943d282cb6b0. after a delay of 18889
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_main,860010-2014041100_2014060513,1402472695828.c3cd5c3a1fcc01e0493a8043e376e948. after a delay of 21727
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_screen,,1396924866983.e3f0096984896efa77348dc4f89a9f3c. after a delay of 17782
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_area,860010-2316230100_2014031222_0_pv_00000000005,1395829898129.c426c025521dd8facd291f1a8ba15f13. after a delay of 6147
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_stay,860010-0604100000_2014031918_0_00000000006,1395349588239.e592ebe99f412b565381f6649bbf857f. after a delay of 16294
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_hotstatic,860010-0307010000_2014070100_0_entry_00000001023,1405881888126.055c3c19009c6822e00def0b7431d0d8. after a delay of 20105
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_hotstatic,860010-0506000000_2014072817_0_bounce_ratio_00000047803,1407729791396.22b0d3234c1173859992d231d2f2d427. after a delay of 7105
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_stay,860010-2328010100_2014010616_0_00000000011,1401896532036.547015d92a9021e31bac69909979f4ac. after a delay of 5485
2014-08-21 15:03:31,011 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020.periodicFlusher requesting flush for region hour_flash,860010-0521010000_2014030620_0_00000000007,1407471178069.aa4f5e7e7f8e3dd150666ae1205ebbcf. after a delay of 11484


        regionserver symptom 2:
2014-08-21 10:30:43,384 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=79, maxlogs=32; forcing flush of 1 regions(s): 12663e173854886463edfe8c6495dca0
2014-08-21 10:31:53,456 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=65, maxlogs=32; forcing flush of 9 regions(s): 192e3fcd5afce28ea2abc8bbb895163d, 2149c6216b259083a6743c61ec7f62b1, 214aac4a7f31cfc346889aabdbdbadd3, 2248c5c76b0fd55fe11d428a77330e6b, 2f5d56a3c17fd8e4f6f6f62d0fbcda69, 2ff390bdbb79cb8dc8ba05b4e56c26ea, 398376b87a43d83d84e96169dadb7865, b5431ef4a70fb2a244d83ae3316506f9, f34c16e000e648988bc00692bc6c7cea
2014-08-21 10:33:25,657 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=66, maxlogs=32; forcing flush of 4 regions(s): 192e3fcd5afce28ea2abc8bbb895163d, 2f5d56a3c17fd8e4f6f6f62d0fbcda69, b5431ef4a70fb2a244d83ae3316506f9, f34c16e000e648988bc00692bc6c7cea
2014-08-21 10:33:55,418 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=60, maxlogs=32; forcing flush of 4 regions(s): 352e2b4a2a42438d5ecb735de1c9e9f4, 5d08d2713d809334514be9ec7e2512cb, 981285a02ae3af797b10e621e76eccf8, f9a55c4661a1ee2f16e3c1e6ec978595
2014-08-21 10:35:02,013 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=51, maxlogs=32; forcing flush of 3 regions(s): a6064be87ca7005a4e4ab607501d9f5a, cc84289443f2478105bd8078df2bccd3, f533780eb2913bf8819cecea52bbeb43
2014-08-21 10:39:05,129 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=35, maxlogs=32; forcing flush of 1 regions(s): 5b0d0af8b9b684237373e941238bdfa2
2014-08-21 11:34:41,619 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=33, maxlogs=32; forcing flush of 1 regions(s): 2149c6216b259083a6743c61ec7f62b1
2014-08-21 11:36:53,437 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=33, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16
2014-08-21 11:37:46,667 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=34, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16
2014-08-21 11:38:09,366 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=35, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16
2014-08-21 11:38:57,140 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=35, maxlogs=32; forcing flush of 15 regions(s): 0c223074833c6a3e2835feb5f9640298, 0f461ff6911b932c013e8d5f57d110d9, 2846b752106aa8079f49e784666c17a8, 53e7a57b2028e32e90040071014b13be, 5f2053770878cfc4ae4e1849f3e128b8, 66fd00187ab38d3253fd2b440ea1a082, 6e3c2282edaebdb1bda15d49fe22df6f, 7e45f8f49ff6b697dc36d988f15a1643, a625182cd59e5ae87ead3113b3a89aaa, b77403d41440cda21e92e4d20d1dc4bc, ba2bdc3cdc3a748c5fbc4d19cdda1bbf, bab28f8f990d3aed73a982964f5731f9, e8c5bd8150ee49d0ba13ee77633d1936, f5064874556aca3c45a67463b2ad37d5, f9961ca861361ab0913f6e05571d45b5
2014-08-21 11:40:02,163 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=36, maxlogs=32; forcing flush of 15 regions(s): 0c223074833c6a3e2835feb5f9640298, 0f461ff6911b932c013e8d5f57d110d9, 2846b752106aa8079f49e784666c17a8, 53e7a57b2028e32e90040071014b13be, 5f2053770878cfc4ae4e1849f3e128b8, 66fd00187ab38d3253fd2b440ea1a082, 6e3c2282edaebdb1bda15d49fe22df6f, 7e45f8f49ff6b697dc36d988f15a1643, a625182cd59e5ae87ead3113b3a89aaa, b77403d41440cda21e92e4d20d1dc4bc, ba2bdc3cdc3a748c5fbc4d19cdda1bbf, bab28f8f990d3aed73a982964f5731f9, e8c5bd8150ee49d0ba13ee77633d1936, f5064874556aca3c45a67463b2ad37d5, f9961ca861361ab0913f6e05571d45b5
2014-08-21 11:40:47,301 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=37, maxlogs=32; forcing flush of 14 regions(s): 0c223074833c6a3e2835feb5f9640298, 0f461ff6911b932c013e8d5f57d110d9, 2846b752106aa8079f49e784666c17a8, 53e7a57b2028e32e90040071014b13be, 5f2053770878cfc4ae4e1849f3e128b8, 66fd00187ab38d3253fd2b440ea1a082, 6e3c2282edaebdb1bda15d49fe22df6f, a625182cd59e5ae87ead3113b3a89aaa, b77403d41440cda21e92e4d20d1dc4bc, ba2bdc3cdc3a748c5fbc4d19cdda1bbf, bab28f8f990d3aed73a982964f5731f9, e8c5bd8150ee49d0ba13ee77633d1936, f5064874556aca3c45a67463b2ad37d5, f9961ca861361ab0913f6e05571d45b5
2014-08-21 11:41:23,446 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=37, maxlogs=32; forcing flush of 17 regions(s): 12663e173854886463edfe8c6495dca0, 25bc0f41f28710d047c7e3775f388e39, 2f5d56a3c17fd8e4f6f6f62d0fbcda69, 3619ffc85d19102863eafe36e6d3acf8, 3b4f4f57abec73084a22bd7392247d86, 42e4757fce922723831d29326540b177, 6c53f4fb301af91f54f0d1590a7c856f, a2e173875e2287bd9ac74b9cdd289fde, c02ca04051d2684b3138662803892dd3, cd6158fa98bf85d39118e450c454e93a, d75e31ed4e06b867652a70160cd90c71, e024920c26c08afe5004f5ae51f63d35, f34c16e000e648988bc00692bc6c7cea, f378e07ac843beb2becc57e79af0362a, f49dba00bbb0c359935146ffa52bdc70, f9a55c4661a1ee2f16e3c1e6ec978595, ff82c095987dc2f6becc66cd777c7970
2014-08-21 11:42:02,502 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=38, maxlogs=32; forcing flush of 17 regions(s): 12663e173854886463edfe8c6495dca0, 25bc0f41f28710d047c7e3775f388e39, 2f5d56a3c17fd8e4f6f6f62d0fbcda69, 3619ffc85d19102863eafe36e6d3acf8, 3b4f4f57abec73084a22bd7392247d86, 42e4757fce922723831d29326540b177, 6c53f4fb301af91f54f0d1590a7c856f, a2e173875e2287bd9ac74b9cdd289fde, c02ca04051d2684b3138662803892dd3, cd6158fa98bf85d39118e450c454e93a, d75e31ed4e06b867652a70160cd90c71, e024920c26c08afe5004f5ae51f63d35, f34c16e000e648988bc00692bc6c7cea, f378e07ac843beb2becc57e79af0362a, f49dba00bbb0c359935146ffa52bdc70, f9a55c4661a1ee2f16e3c1e6ec978595, ff82c095987dc2f6becc66cd777c7970
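
The "Too many hlogs" lines above are easier to read once aggregated. The following is a minimal sketch, assuming the log format is exactly as shown in the excerpt; the names `HLOG_RE` and `summarize_hlog_pressure` are hypothetical helpers, not part of HBase. It reports the peak WAL count and how many times each encoded region name was force-flushed, which helps spot regions that keep holding old WAL files.

```python
import re
from collections import Counter

# Matches the HLog message format shown in the excerpt above (an assumption:
# other HBase versions may word this message differently).
HLOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r"Too many hlogs: logs=(?P<logs>\d+), maxlogs=(?P<maxlogs>\d+); "
    r"forcing flush of \d+ regions\(s\): (?P<regions>.+)$"
)


def summarize_hlog_pressure(lines):
    """Return (peak WAL count, Counter of forced flushes per encoded region)."""
    peak = 0
    flushes = Counter()
    for line in lines:
        m = HLOG_RE.match(line.strip())
        if not m:
            continue  # skip unrelated log lines
        peak = max(peak, int(m.group("logs")))
        for region in m.group("regions").split(","):
            flushes[region.strip()] += 1
    return peak, flushes


# A few lines copied from the excerpt above.
sample = [
    "2014-08-21 10:30:43,384 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=79, maxlogs=32; forcing flush of 1 regions(s): 12663e173854886463edfe8c6495dca0",
    "2014-08-21 11:36:53,437 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=33, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16",
    "2014-08-21 11:37:46,667 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=34, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16",
    "2014-08-21 11:38:09,366 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=35, maxlogs=32; forcing flush of 1 regions(s): eec50ffaa2639f7c0fbd7ac727c16f16",
]

peak, flushes = summarize_hlog_pressure(sample)
print("peak WAL count:", peak)
for region, count in flushes.most_common():
    print(region, count)
```

A region that shows up in many consecutive forced-flush requests, as eec50ffaa2639f7c0fbd7ac727c16f16 does here, is a candidate for a flush that is requested but never completes.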
        
        regionserver symptom 3 (this one has already been fixed by configuring the same dfs.socket.timeout=900000 on both the HDFS side and the HBase side):

2014-08-23 11:19:17,598 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block blk_-6884116396095947381_111959717java.net.SocketTimeoutException: 66000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:53194 remote=/10.130.136.114:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readLong(DataInputStream.java:416)
        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3127)

2014-08-23 11:19:17,599 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-4289533060700867612_111959745 bad datanode[0] 10.130.136.114:50010
2014-08-23 11:19:17,599 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-6884116396095947381_111959717 bad datanode[0] 10.130.136.114:50010
2014-08-23 11:19:17,599 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-4289533060700867612_111959745 in pipeline 10.130.136.114:50010, 10.130.136.115:50010: bad datanode 10.130.136.114:50010
2014-08-23 11:19:17,599 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-6884116396095947381_111959717 in pipeline 10.130.136.114:50010, 10.130.136.115:50010: bad datanode 10.130.136.114:50010
2014-08-23 11:22:27,624 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=681.33 MB, free=3.32 GB, max=3.99 GB, blocks=10035, accesses=44791415, hits=40264747, hitRatio=89.89%, , cachingAccesses=40274782, cachingHits=40264747, cachingHitsRatio=99.97%, , evictions=0, evicted=0, evictedPerRun=NaN

②. datanode side
        At the same time, many exceptions appeared in the HDFS datanode logs:
        datanode exception 1:
        
        java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:50010 remote=/10.130.136.114:59516]
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:50010 remote=/10.130.136.114:59524]
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:50010 remote=/10.130.136.114:59520]
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:50010 remote=/10.130.136.114:59524]
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.114:50010 remote=/10.130.136.114:59520]
2014-08-23 21:26:25,292 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-3011273698174656346_113017023 received exception org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_-3011273698174656346_113017023 is valid, and cannot be written to.
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_-3011273698174656346_113017023 is valid, and cannot be written to.

   datanode exception 2:
2014-08-23 23:06:56,413 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.130.136.114:50010 java.io.IOException: Bad connect ack with firstBadLink as 10.130.136.119:50010
2014-08-23 23:06:56,895 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.130.136.114:50010 java.io.IOException: Bad connect ack with firstBadLink as 10.130.136.119:50010
2014-08-23 23:06:57,399 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.130.136.114:50010 java.io.IOException: Bad connect ack with firstBadLink as 10.130.136.119:50010
2014-08-23 23:06:57,548 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.130.136.114:50010 java.io.IOException: Bad connect ack with firstBadLink as 10.130.136.119:50010
2014-08-23 23:06:57,935 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream 10.130.136.114:50010 java.io.IOException: Bad connect ack with firstBadLink as 10.130.136.119:50010


  datanode exception 3:
2014-08-24 22:15:21,714 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
java.io.IOException: Error in deleting blocks.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:1967)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1181)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:1143)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:980)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527)
        at java.lang.Thread.run(Thread.java:724)

  datanode exception 4:
2014-08-24 16:45:35,855 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_2324951138767077684_113876340 received exception org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_2324951138767077684_113876340 is valid, and cannot be written to.
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_2324951138767077684_113876340 is valid, and cannot be written to.
2014-08-24 16:45:42,861 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_2305069720503912789_113876452 received exception org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_2305069720503912789_113876452 is valid, and cannot be written to.
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_2305069720503912789_113876452 is valid, and cannot be written to.
2014-08-24 16:45:43,713 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-318311590422520941_113876153 received exception org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_-318311590422520941_113876153 is valid, and cannot be written to.

java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.116:34363]  (Note: after setting dfs.datanode.socket.write.timeout=1800000, the datanode instead threw "1800000 millis timeout while waiting for channel to be ready for write")

java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.118:55147]
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.118:55147]
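
To see which connections are timing out, the SocketTimeoutException lines can be grouped by operation, timeout value, and whether the peer is on the same host as the datanode. This is a rough sketch, assuming the message format shown above; `TIMEOUT_RE` and `summarize_timeouts` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Matches the SocketTimeoutException message format shown in the logs above.
TIMEOUT_RE = re.compile(
    r"SocketTimeoutException: (?P<millis>\d+) millis timeout while waiting "
    r"for channel to be ready for (?P<op>read|write)\. ch : "
    r"java\.nio\.channels\.SocketChannel\[connected "
    r"local=/(?P<local>[\d.]+):(?P<lport>\d+) "
    r"remote=/(?P<remote>[\d.]+):(?P<rport>\d+)\]"
)


def summarize_timeouts(lines):
    """Count timeouts keyed by (operation, timeout ms, local host, peer on same host)."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if not m:
            continue  # not a socket timeout line
        same_host = m.group("local") == m.group("remote")
        key = (m.group("op"), int(m.group("millis")), m.group("local"), same_host)
        counts[key] += 1
    return counts


# Lines copied from the datanode log excerpt above.
sample = [
    "java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.116:34363]",
    "java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.118:55147]",
    "java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.130.136.118:50010 remote=/10.130.136.118:55147]",
]

timeouts = summarize_timeouts(sample)
for key, count in timeouts.items():
    print(key, count)
```

When most entries have the peer on the same host, the stalled reader is likely a process co-located with the datanode (for example the regionserver on that machine), though that is only a hint and not a diagnosis.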


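③. namenode side
   A large volume of logs like the following appeared on the namenode (logs at INFO level and above now exceed 400 GB per day, whereas the log volume used to be very small):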
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-707612696772368160 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_8944996150588918994_62583982 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_8944996150588918994 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_962585261283706817_105572114 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_962585261283706817 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-1886285939257877420_33867512 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-1886285939257877420 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-405662021725661377_23563134 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-405662021725661377 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-6831374360596453862_49890202 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-6831374360596453862 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-1458260851950313618_92180801 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-1458260851950313618 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_2754038012732967699_52183933 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_2754038012732967699 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-1651824977329564981_102396163 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-1651824977329564981 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-8075220412997159517_101639855 on 10.130.136.116:50010 size 496 does not belong to any file. 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-8075220412997159517 to 10.130.136.116:50010 
2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_2245696672665686485_98393215 on 10.130.136.116:50010 size 496 does not belong to any file.
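
With hundreds of gigabytes of these lines per day, it helps to reduce them to a per-datanode tally of orphaned blocks. The sketch below assumes the processReport message format shown above; `ORPHAN_RE` and `count_orphan_blocks` are hypothetical names, and for real log volumes the function would be fed a file iterator rather than a list.

```python
import re
from collections import Counter

# Matches the "does not belong to any file" message format shown above.
ORPHAN_RE = re.compile(
    r"NameSystem\.processReport: block (?P<block>blk_-?\d+)_\d+ "
    r"on (?P<datanode>\S+) size (?P<size>\d+) does not belong to any file"
)


def count_orphan_blocks(lines):
    """Return (orphan block count per datanode, orphan bytes per datanode)."""
    blocks = Counter()
    size_bytes = Counter()
    for line in lines:
        m = ORPHAN_RE.search(line)
        if not m:
            continue  # e.g. the addToInvalidates lines
        blocks[m.group("datanode")] += 1
        size_bytes[m.group("datanode")] += int(m.group("size"))
    return blocks, size_bytes


# Lines copied from the namenode log excerpt above.
sample = [
    "2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-707612696772368160 to 10.130.136.116:50010 ",
    "2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_8944996150588918994_62583982 on 10.130.136.116:50010 size 496 does not belong to any file. ",
    "2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_962585261283706817_105572114 on 10.130.136.116:50010 size 496 does not belong to any file. ",
    "2014-08-25 11:30:01,418 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-1886285939257877420_33867512 on 10.130.136.116:50010 size 496 does not belong to any file. ",
]

blocks, size_bytes = count_orphan_blocks(sample)
for datanode, count in blocks.most_common():
    print(datanode, count, size_bytes[datanode])
```

Comparing these tallies across datanodes shows whether the orphaned blocks are concentrated on the nodes that crashed or spread across the whole cluster.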
