From: http://centoshowtos.org/hadoop/fix-corrupt-blocks-on-hdfs/

How do I know if my Hadoop HDFS filesystem has corrupt blocks, and how do I fix them?

The easiest way to determine this is to run an fsck on the filesystem. If you have set up your Hadoop environment variables, you should be able to use a path of /; if not, use a full URI such as hdfs://ip.or.hostname:8020/ (use your NameNode's RPC port here, commonly 8020 or 9000, not the 50070 web UI port).

  hdfs fsck /

or

  hdfs fsck hdfs://ip.or.hostname:8020/
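If you want to script this check, the exit status can help: as far as I'm aware, hdfs fsck exits with 0 only when the filesystem is healthy, but verify that on your version before relying on it. A minimal sketch:

  # Run fsck, keep the full report, and alert on a non-zero exit status
  hdfs fsck / > /tmp/fsck.out 2>&1
  if [ $? -ne 0 ]; then
    echo "HDFS reports problems; see /tmp/fsck.out"
  fi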

If the end of your output looks something like this, you have corrupt blocks on your fs.

  .............................Status: CORRUPT
  Total size: 3453345169348 B (Total open files size: 664 B)
  Total dirs: 15233
  Total files: 14029
  Total symlinks: 0 (Files currently being written: 8)
  Total blocks (validated): 40961 (avg. block size 84308126 B) (Total open file blocks (not validated): 8)
  ********************************
  CORRUPT FILES: 2
  MISSING BLOCKS: 2
  MISSING SIZE: 15731297 B
  CORRUPT BLOCKS: 2
  ********************************
  Corrupt blocks: 2
  Number of data-nodes: 12
  Number of racks: 2
  FSCK ended at Fri Mar 27 XX:03:21 UTC 201X in XXX milliseconds

  The filesystem under path '/' is CORRUPT
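As a quick cross-check, the NameNode's cluster report counts problem blocks too. A minimal sketch (the exact field names vary a little between Hadoop versions):

  # Pull the block health counters out of the cluster report
  hdfs dfsadmin -report | grep -iE 'missing|corrupt|under replicated'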

How do I know which files have blocks that are corrupt?

The output of the fsck above will be very verbose, but it mentions which blocks are corrupt. We can grep the fsck output so that we aren't "reading through a firehose".

  hdfs fsck / | egrep -v '^\.+$' | grep -v replica | grep -v Replica

or

  hdfs fsck hdfs://ip.or.host:8020/ | egrep -v '^\.+$' | grep -v replica | grep -v Replica

This lists the affected files without the long runs of dots, and also filters out the lines about under-replicated blocks (which aren't necessarily a problem). The output should include something like this for each affected file.

  /path/to/filename.fileextension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305

  /path/to/filename.fileextension: MISSING 1 blocks of total size 15620361 B
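Depending on your Hadoop version, fsck can also list the corrupt files directly with its -list-corruptfileblocks option, which spares you the grepping:

  hdfs fsck / -list-corruptfileblocks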

The next step is to determine the importance of the file: can it simply be removed and copied back into place, or does it contain sensitive data that needs to be regenerated?

If it's easy enough just to replace the file, that's the route I would take.

Remove the corrupted file from your Hadoop cluster

This command will move the corrupted file to the trash.

  hdfs dfs -rm /path/to/filename.fileextension
  hdfs dfs -rm hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension

Or you can skip the trash and permanently delete the file (which is probably what you want to do).

  hdfs dfs -rm -skipTrash /path/to/filename.fileextension
  hdfs dfs -rm -skipTrash hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension
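fsck itself can also act on corrupted files: -move relocates them to /lost+found on HDFS, and -delete removes them outright. Both operate on every corrupt file fsck finds under the given path, so review a plain fsck report first. A sketch, with /path/to/check standing in for the directory you want to clean up:

  # Move corrupted files to /lost+found, or delete them outright
  hdfs fsck /path/to/check -move
  hdfs fsck /path/to/check -delete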

How would I repair a corrupted file if it was not easy to replace?

This may or may not be possible, but the first step is to gather information on the file's location and blocks.

  hdfs fsck /path/to/filename.fileextension -locations -blocks -files
  hdfs fsck hdfs://ip.or.hostname.of.namenode:8020/path/to/filename.fileextension -locations -blocks -files

From this data, you can track down the node where the corruption is. On those nodes, look through the logs to determine what the issue is: a replaced disk, I/O errors on the server, and so on. If you can recover on that machine and bring the partition holding the blocks back online, the datanode will report the blocks back to Hadoop and the file will be healthy again. If that isn't possible, you will unfortunately have to find another way to regenerate the file.
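If you want to dig in on a suspect datanode, you can look for the physical block file itself. A hypothetical sketch, reusing the block ID from the fsck output above; /data/dfs/dn stands in for whatever your dfs.datanode.data.dir points at:

  # Look for the block file and its .meta checksum file on this datanode's disks
  find /data/dfs/dn -name 'blk_1073904305*' 2>/dev/null

If the block and its .meta file turn out to be intact on a volume you can bring back online, the datanode will re-report the block and the file becomes healthy again, as described above.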
