A Summary of Common HDFS Operations Commands for Big Data

I. Viewing the hdfs subcommands

[root@master ~]# hdfs

Usage: hdfs [--config confdir] COMMAND

       where COMMAND is one of:

  dfs                  run a filesystem command on the file systems supported in Hadoop.

  namenode -format     format the DFS filesystem

  secondarynamenode    run the DFS secondary namenode

  namenode             run the DFS namenode

  journalnode          run the DFS journalnode

  zkfc                 run the ZK Failover Controller daemon

  datanode             run a DFS datanode

  dfsadmin             run a DFS admin client

  haadmin              run a DFS HA admin client

  fsck                 run a DFS filesystem checking utility

  balancer             run a cluster balancing utility

  jmxget               get JMX exported values from NameNode or DataNode.

  mover                run a utility to move block replicas across

                       storage types

  oiv                  apply the offline fsimage viewer to an fsimage

  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage

  oev                  apply the offline edits viewer to an edits file

  fetchdt              fetch a delegation token from the NameNode

  getconf              get config values from configuration

  groups               get the groups which users belong to

  snapshotDiff         diff two snapshots of a directory or diff the

                       current directory contents with a snapshot

  lsSnapshottableDir   list all snapshottable dirs owned by the current user

                                                Use -help to see options

  portmap              run a portmap service

  nfs3                 run an NFS version 3 gateway

  cacheadmin           configure the HDFS cache

  crypto               configure HDFS encryption zones

  storagepolicies      get all the existing block storage policies

  version              print the version

Most commands print help when invoked w/o parameters.

II. Options for hdfs dfs

[root@master ~]# hdfs dfs

Usage: hadoop fs [generic options]

        [-appendToFile <localsrc> ... <dst>]

        [-cat [-ignoreCrc] <src> ...]

        [-checksum <src> ...]

        [-chgrp [-R] GROUP PATH...]

        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]

        [-chown [-R] [OWNER][:[GROUP]] PATH...]

        [-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]

        [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]

        [-count [-q] [-h] <path> ...]

        [-cp [-f] [-p | -p[topax]] <src> ... <dst>]

        [-createSnapshot <snapshotDir> [<snapshotName>]]

        [-deleteSnapshot <snapshotDir> <snapshotName>]

        [-df [-h] [<path> ...]]

        [-du [-s] [-h] <path> ...]

        [-expunge]

        [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]

        [-getfacl [-R] <path>]

        [-getfattr [-R] {-n name | -d} [-e en] <path>]

        [-getmerge [-nl] <src> <localdst>]

        [-help [cmd ...]]

        [-ls [-d] [-h] [-R] [<path> ...]]

        [-mkdir [-p] <path> ...]

        [-moveFromLocal <localsrc> ... <dst>]

        [-moveToLocal <src> <localdst>]

        [-mv <src> ... <dst>]

        [-put [-f] [-p] [-l] <localsrc> ... <dst>]

        [-renameSnapshot <snapshotDir> <oldName> <newName>]

        [-rm [-f] [-r|-R] [-skipTrash] <src> ...]

        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]

        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]

        [-setfattr {-n name [-v value] | -x name} <path>]

        [-setrep [-R] [-w] <rep> <path> ...]

        [-stat [format] <path> ...]

        [-tail [-f] <file>]

        [-test -[defsz] <path>]

        [-text [-ignoreCrc] <src> ...]

        [-touchz <path> ...]

        [-usage [cmd ...]]

Generic options supported are

-conf <configuration file>     specify an application configuration file

-D <property=value>            use value for given property

-fs <local|namenode:port>      specify a namenode

-jt <local|resourcemanager:port>    specify a ResourceManager

-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster

-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.

-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is

bin/hadoop command [genericOptions] [commandOptions]

Some other commands

Note: only the commands I used while learning are recorded here.

1. Append the contents of a local file to a file in HDFS

hdfs dfs -appendToFile testc.sh /top.sh

2. View the contents of a Hadoop SequenceFile

[root@master ~]# hdfs dfs -text /sparktest.squence

11 aa

22 bb

11 cc

3. Check available space with df

[root@master ~]# hdfs dfs -df -h
Filesystem Size Used Available Use%
hdfs://master:9000 64.5 G 812.7 M 49.4 G 1%

4. Lower the replication factor (the default is 3 replicas)

[root@master ~]# hdfs dfs -setrep -w 2 /sparktest.txt

5. Check used space with du

[root@master ~]# hdfs dfs -du -s -h /hbase

240.3 K  /hbase

III. Using hdfs getconf

[root@master ~]# hdfs getconf

hdfs getconf is utility for getting configuration information from the config file.

hadoop getconf

        [-namenodes]                    gets list of namenodes in the cluster.

        [-secondaryNameNodes]                   gets list of secondary namenodes in the cluster.

        [-backupNodes]                  gets list of backup nodes in the cluster.

        [-includeFile]                  gets the include file path that defines the datanodes that can join the cluster.

        [-excludeFile]                  gets the exclude file path that defines the datanodes that need to decommissioned.

        [-nnRpcAddresses]                       gets the namenode rpc addresses

        [-confKey [key]]                        gets a specific key from the configuration

1. Get the NameNode hostnames

[root@master ~]# hdfs getconf -namenodes

master

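2. Get the minimum HDFS block size (the default is 1 MB, i.e. 1048576 bytes; a changed value must be a multiple of 512, because HDFS checksums data in 512-byte chunks during transfer)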

[root@master ~]# hdfs getconf -confKey dfs.namenode.fs-limits.min-block-size

1048576

3. Look up the NameNode RPC addresses

[root@master ~]# hdfs getconf -nnRpcAddresses

master:9000
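
The 512-byte rule from item 2 can be checked before a block size is applied. The following is a minimal sketch, not part of Hadoop: the helper name is_valid_block_size is hypothetical, and the two constants are hard-coded assumptions that mirror the defaults mentioned above. On a real cluster they would be read with hdfs getconf -confKey instead.

```shell
#!/bin/sh
# Assumed defaults; on a real cluster read them with, for example:
#   hdfs getconf -confKey dfs.namenode.fs-limits.min-block-size
#   hdfs getconf -confKey dfs.bytes-per-checksum
MIN_BLOCK_SIZE=1048576
BYTES_PER_CHECKSUM=512

# is_valid_block_size SIZE (hypothetical helper): succeeds when SIZE is at
# least the minimum block size and a multiple of the checksum chunk size.
is_valid_block_size() {
    size=$1
    [ "$size" -ge "$MIN_BLOCK_SIZE" ] && [ $((size % BYTES_PER_CHECKSUM)) -eq 0 ]
}

# Try a few candidate sizes (in bytes).
for s in 134217728 1048576 1048577 524288; do
    if is_valid_block_size "$s"; then
        echo "$s ok"
    else
        echo "$s rejected"
    fi
done
```

Here 1048577 is rejected because it is not a multiple of 512, and 524288 because it is below the 1 MB minimum.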

IV. Using hdfs dfsadmin

[root@master ~]# hdfs dfsadmin

Usage: hdfs dfsadmin

Note: Administrative commands can only be run as the HDFS superuser.

        [-report [-live] [-dead] [-decommissioning]]

        [-safemode <enter | leave | get | wait>]

        [-saveNamespace]

        [-rollEdits]

        [-restoreFailedStorage true|false|check]

        [-refreshNodes]

        [-setQuota <quota> <dirname>...<dirname>]

        [-clrQuota <dirname>...<dirname>]

        [-setSpaceQuota <quota> <dirname>...<dirname>]

        [-clrSpaceQuota <dirname>...<dirname>]

        [-finalizeUpgrade]

        [-rollingUpgrade [<query|prepare|finalize>]]

        [-refreshServiceAcl]

        [-refreshUserToGroupsMappings]

        [-refreshSuperUserGroupsConfiguration]

        [-refreshCallQueue]

        [-refresh <host:ipc_port> <key> [arg1..argn]

        [-reconfig <datanode|...> <host:ipc_port> <start|status>]

        [-printTopology]

        [-refreshNamenodes datanode_host:ipc_port]

        [-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]

        [-setBalancerBandwidth <bandwidth in bytes per second>]

        [-fetchImage <local directory>]

        [-allowSnapshot <snapshotDir>]

        [-disallowSnapshot <snapshotDir>]

        [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]

        [-getDatanodeInfo <datanode_host:ipc_port>]

        [-metasave filename]

        [-setStoragePolicy path policyName]

        [-getStoragePolicy path]

        [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]

        [-help [cmd]]

Generic options supported are

-conf <configuration file>     specify an application configuration file

-D <property=value>            use value for given property

-fs <local|namenode:port>      specify a namenode

-jt <local|resourcemanager:port>    specify a ResourceManager

-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster

-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.

-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is

bin/hadoop command [genericOptions] [commandOptions]

1. View help for a specific command

[root@master ~]# hdfs dfsadmin -help safemode

-safemode <enter|leave|get|wait>:  Safe mode maintenance command.

                Safe mode is a Namenode state in which it

                        1.  does not accept changes to the name space (read-only)

                        2.  does not replicate or delete blocks.

                Safe mode is entered automatically at Namenode startup, and

                leaves safe mode automatically when the configured minimum

                percentage of blocks satisfies the minimum replication

                condition.  Safe mode can also be entered manually, but then

                it can only be turned off manually as well.

2. Check the current safe mode state

[root@master ~]# hdfs dfsadmin -safemode get

Safe mode is OFF

3. Enter safe mode

[root@master ~]# hdfs dfsadmin -safemode enter

4. Leave safe mode

[root@master ~]# hdfs dfsadmin -safemode leave

5. Wait for safe mode to end (the command blocks until the NameNode has left safe mode)

[root@master ~]# hdfs dfsadmin -safemode wait
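
Scripts that write to HDFS often check the safe mode state first. The sketch below shows one way to do that by matching the text printed by -safemode get. The helper name safemode_is_off is hypothetical, and the demonstration feeds it the sample output from item 2 rather than calling a live cluster.

```shell
#!/bin/sh
# safemode_is_off (hypothetical helper): reads the output of
# `hdfs dfsadmin -safemode get` on stdin and succeeds only if it reports OFF.
safemode_is_off() {
    grep -q '^Safe mode is OFF'
}

# Offline demonstration with the sample output shown above.
# On a real cluster the pipeline would be:
#   hdfs dfsadmin -safemode get | safemode_is_off
if echo 'Safe mode is OFF' | safemode_is_off; then
    echo 'NameNode accepts writes'
else
    echo 'NameNode is read-only'
fi
```

When the goal is simply to block until the NameNode is writable, hdfs dfsadmin -safemode wait already does that without any parsing.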

6. Check the state of the HDFS cluster

[root@master ~]# hdfs dfsadmin -report

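Configured Capacity: 69209960448 (64.46 GB)     # total capacity configured for HDFS in this cluster

Present Capacity: 53855645696 (50.16 GB)        # capacity currently available to HDFS (configured capacity minus non-DFS usage)

DFS Remaining: 53003517952 (49.36 GB)           # HDFS capacity still free

DFS Used: 852127744 (812.65 MB)                 # space used by HDFS, in bytes

DFS Used%: 1.58%                                # the same figure as a percentage of present capacity

Under replicated blocks: 156                    # blocks with fewer replicas than their target

Blocks with corrupt replicas: 0                 # blocks that have a corrupt replica

Missing blocks: 0                               # blocks that are missing

-------------------------------------------------

Live datanodes (3):                             # number of DataNodes that are alive and available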

Name: 192.168.200.102:50010 (slave02)

Hostname: slave02

Decommission Status : Normal                    # state of this DataNode (Normal means in service)

Configured Capacity: 23069986816 (21.49 GB)     # configured and used capacity of this DataNode

DFS Used: 284041216 (270.88 MB)

Non DFS Used: 3754188800 (3.50 GB)

DFS Remaining: 19031756800 (17.72 GB)

DFS Used%: 1.23%

DFS Remaining%: 82.50%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)                             # cache usage statistics

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Mon Aug 12 10:30:19 CST 2019

Name: 192.168.200.100:50010 (master)

Hostname: master

Decommission Status : Normal

Configured Capacity: 23069986816 (21.49 GB)

DFS Used: 284045312 (270.89 MB)

Non DFS Used: 7988813824 (7.44 GB)

DFS Remaining: 14797127680 (13.78 GB)

DFS Used%: 1.23%

DFS Remaining%: 64.14%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Mon Aug 12 10:30:18 CST 2019

Name: 192.168.200.101:50010 (slave01)

Hostname: slave01

Decommission Status : Normal

Configured Capacity: 23069986816 (21.49 GB)

DFS Used: 284041216 (270.88 MB)

Non DFS Used: 3611312128 (3.36 GB)

DFS Remaining: 19174633472 (17.86 GB)

DFS Used%: 1.23%

DFS Remaining%: 83.12%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Mon Aug 12 10:30:19 CST 2019
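
The cluster-level figures in the report can be cross-checked against the per-node ones. The sketch below uses only numbers copied from the output above; the function names present_capacity and used_percent are hypothetical. It shows that Present Capacity equals Configured Capacity minus the summed Non DFS Used values, and that DFS Used% is DFS Used divided by Present Capacity.

```shell
#!/bin/sh
# Figures copied from the -report output above (all in bytes).
CONFIGURED=69209960448
NON_DFS_USED="3754188800 7988813824 3611312128"   # slave02, master, slave01

# present_capacity (hypothetical helper):
# Configured Capacity minus the sum of the per-node "Non DFS Used" values.
present_capacity() {
    total=0
    for n in $NON_DFS_USED; do
        total=$((total + n))
    done
    echo $((CONFIGURED - total))
}

# used_percent USED PRESENT (hypothetical helper):
# DFS Used as a percentage of Present Capacity, two decimal places.
used_percent() {
    awk -v u="$1" -v p="$2" 'BEGIN { printf "%.2f\n", u / p * 100 }'
}

echo "Present Capacity: $(present_capacity)"
echo "DFS Used%: $(used_percent 852127744 53855645696)%"
```

Both results agree with the report: 53855645696 bytes and 1.58%.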

7. Get the HA state of a NameNode

hdfs haadmin -getServiceState master

V. Using hdfs fsck

1. View HDFS filesystem information

[root@master hadoop]# hdfs fsck /
 .........................................
 ........................................


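 Total size:    279242984 B                                         # total size of the files under the checked path

 Total dirs:    342                                                 # number of directories under the checked path

 Total files:   460                                                 # number of files under the checked path

 Total symlinks:                0                                   # number of symbolic links under the checked path

 Total blocks (validated):      434 (avg. block size 643417 B)      # number of blocks validated under the checked path

 Minimally replicated blocks:   434 (100.0 %)                       # blocks that meet the minimum replication requirement

 Over-replicated blocks:        0 (0.0 %)                           # blocks with more replicas than their target

 Under-replicated blocks:       156 (35.944702 %)                   # blocks with fewer replicas than their target

 Mis-replicated blocks:         0 (0.0 %)                           # blocks whose replicas violate the block placement policy

 Default replication factor:    3                                   # default number of replicas (the original plus two copies)

 Average block replication:     3.0                                 # average number of replicas per block

 Corrupt blocks:                0                                   # corrupt blocks; a non-zero value means the cluster has unrecoverable blocks, i.e. data loss

 Missing replicas:              1092 (45.614037 %)                  # number of missing replicas


 Number of data-nodes:          3                                   # number of DataNodes

 Number of racks:               1                                   # number of racks

FSCK ended at Mon Aug 12 10:44:46 CST 2019 in 217 milliseconds

The filesystem under path '/' is HEALTHY                            # result of the check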
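
The percentages in the fsck summary can be reproduced from the counts next to them. The sketch below is an illustration built on the sample figures above; the helper name pct is hypothetical. It treats the number of expected replicas as existing replicas (total blocks times average replication) plus missing replicas, which is an assumption that happens to match the reported 45.614037 %.

```shell
#!/bin/sh
# pct PART WHOLE (hypothetical helper): PART as a percentage of WHOLE,
# printed with four decimal places.
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.4f\n", a / b * 100 }'
}

# Counts copied from the fsck summary above.
TOTAL_BLOCKS=434
UNDER_REPLICATED=156
MISSING_REPLICAS=1092
AVG_REPLICATION=3        # "Average block replication: 3.0"

# Assumption: expected replicas = existing replicas + missing replicas.
EXISTING=$((TOTAL_BLOCKS * AVG_REPLICATION))
EXPECTED=$((EXISTING + MISSING_REPLICAS))

echo "under-replicated blocks: $(pct "$UNDER_REPLICATED" "$TOTAL_BLOCKS") %"
echo "missing replicas:        $(pct "$MISSING_REPLICAS" "$EXPECTED") %"
echo "missing replicas per under-replicated block: $((MISSING_REPLICAS / UNDER_REPLICATED))"
```

The last line prints 7, which would be consistent with those 156 blocks requesting 10 replicas on a cluster that has only three DataNodes. That is an inference from the numbers, not something the fsck output states.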

2. Show HDFS block information with fsck

[root@master hadoop]# hdfs fsck / -files -blocks
.............................................................
..................................................

Status: HEALTHY

 Total size:    279242984 B

 Total dirs:    342

 Total files:   460

 Total symlinks:                0

 Total blocks (validated):      434 (avg. block size 643417 B)

 Minimally replicated blocks:   434 (100.0 %)

 Over-replicated blocks:        0 (0.0 %)

 Under-replicated blocks:       156 (35.944702 %)

 Mis-replicated blocks:         0 (0.0 %)

 Default replication factor:    3

 Average block replication:     3.0

 Corrupt blocks:                0

 Missing replicas:              1092 (45.614037 %)

 Number of data-nodes:          3

 Number of racks:               1

FSCK ended at Mon Aug 12 10:54:27 CST 2019 in 415 milliseconds

The filesystem under path '/' is HEALTHY

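VI. Snapshots

A snapshot backs up a file or directory quickly. It creates no new files and stores only differences. Snapshots are disabled by default, so the feature has to be enabled on a directory before it can be used. The "hdfs dfsadmin" command enables or disables snapshots.

1. Enable snapshots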

[root@master hadoop]# hdfs dfsadmin -allowSnapshot /sparkTest2

Allowing snaphot on /sparkTest2 succeeded

2. Disable snapshots

[root@master hadoop]# hdfs dfsadmin -disallowSnapshot /sparkTest2

Disallowing snaphot on /sparkTest2 succeeded

3. Create a snapshot

[root@master hadoop]# hdfs dfs -createSnapshot  /sparkTest2 sparkTest2Snapshot

Created snapshot /sparkTest2/.snapshot/sparkTest2Snapshot

4. Rename a snapshot

[root@master hadoop]# hdfs dfs -renameSnapshot /sparkTest2 sparkTest2Snapshot  newSnapshot

5. Delete a snapshot

[root@master hadoop]# hdfs dfs -deleteSnapshot /sparkTest2 newSnapshot

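6. Points to note about snapshots

1. Creating a snapshot produces a hidden .snapshot directory under the target directory. It contains a subdirectory named after the snapshot, which holds the data as it was at the moment the snapshot was taken.

2. A snapshot does not produce new files

    "Does not produce new files" means the data is not fully cloned; the snapshot points to the same stored data (the same storage IDs) as the original files.

3. Modifying the source files does not affect the snapshot

    When the source data files are modified, the data kept in the snapshot is unaffected; it remains the data as it was when the snapshot was created.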
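
Because the snapshot keeps the data as it was, a file can be copied back out of the .snapshot directory after the source has changed. The sketch below strings the commands from this section together. The helpers run, snapshot_backup and snapshot_restore and the file name data.txt are hypothetical, and by default the script only prints the commands (DRY_RUN=1) so that it can be read and tried without a cluster.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_backup DIR NAME (hypothetical helper):
# enable snapshots on DIR, then take a snapshot called NAME.
snapshot_backup() {
    run hdfs dfsadmin -allowSnapshot "$1"
    run hdfs dfs -createSnapshot "$1" "$2"
}

# snapshot_restore DIR NAME FILE (hypothetical helper):
# copy FILE from the snapshot back over the current version in DIR.
snapshot_restore() {
    run hdfs dfs -cp -f "$1/.snapshot/$2/$3" "$1/$3"
}

# data.txt is a made-up file name used only for illustration.
snapshot_backup /sparkTest2 sparkTest2Snapshot
snapshot_restore /sparkTest2 sparkTest2Snapshot data.txt
```

Setting DRY_RUN=0 would execute the commands for real, which requires HDFS superuser rights for the -allowSnapshot step.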
