HDFS Snapshots
Overview
HDFS Snapshots are read-only point-in-time copies of the file system. Snapshots can be taken on a subtree of the file system or the entire file system. Some common use cases of snapshots are data backup, protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient:
- Snapshot creation is instantaneous: the cost is O(1) excluding the inode lookup time.
- Additional memory is used only when modifications are made relative to a snapshot: memory usage is O(M), where M is the number of modified files/directories.
- Blocks in datanodes are not copied: the snapshot files record the block list and the file size. There is no data copying.
- Snapshots do not adversely affect regular HDFS operations: modifications are recorded in reverse chronological order so that the current data can be accessed directly. The snapshot data is computed by subtracting the modifications from the current data.
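The bookkeeping described in the list above can be illustrated with a small toy model. This is only a sketch of the idea (current data kept directly accessible, snapshot views computed by undoing recorded modifications); the class `ToyNamespace` and its methods are hypothetical and do not reflect the actual NameNode data structures.

```python
class ToyNamespace:
    """Toy model: current state plus one undo record per snapshot (hypothetical)."""

    _ABSENT = object()  # marks "path did not exist when the snapshot was taken"

    def __init__(self):
        self.current = {}    # path -> content, always directly accessible
        self.snapshots = []  # (name, undo record) pairs in creation order

    def create_snapshot(self, name):
        # Nothing is copied; only an empty undo record is appended.
        self.snapshots.append((name, {}))

    def write(self, path, content):
        self._remember(path)
        self.current[path] = content

    def delete(self, path):
        self._remember(path)
        self.current.pop(path, None)

    def _remember(self, path):
        # Only the newest snapshot stores the old value, and only once per path,
        # so extra memory grows with the number of modified paths.
        if self.snapshots:
            undo = self.snapshots[-1][1]
            if path not in undo:
                undo[path] = self.current.get(path, self._ABSENT)

    def read_snapshot(self, name):
        # Start from the current data and undo modifications, newest first.
        view = dict(self.current)
        for snap_name, undo in reversed(self.snapshots):
            for path, old in undo.items():
                if old is self._ABSENT:
                    view.pop(path, None)
                else:
                    view[path] = old
            if snap_name == name:
                return view
        raise KeyError(name)
```

Reads of the live namespace never touch the undo records, which is why regular operations are unaffected in this model; only snapshot reads pay for the subtraction.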
Snapshottable Directories
Snapshots can be taken on any directory once the directory has been set as snapshottable. A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. There is no limit on the number of snapshottable directories. Administrators may set any directory to be snapshottable. If there are snapshots in a snapshottable directory, the directory can be neither deleted nor renamed before all the snapshots are deleted.
Nested snapshottable directories are currently not allowed. In other words, a directory cannot be set to snapshottable if one of its ancestors/descendants is a snapshottable directory.
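The nesting rule can be expressed as a simple ancestor/descendant check. The following is a minimal sketch, assuming paths are plain absolute strings; `can_set_snapshottable` is a hypothetical helper, not part of HDFS.

```python
from pathlib import PurePosixPath


def can_set_snapshottable(candidate, snapshottable_dirs):
    """Return False if an ancestor or descendant is already snapshottable."""
    cand = PurePosixPath(candidate)
    for existing in map(PurePosixPath, snapshottable_dirs):
        if existing == cand:
            continue  # the directory itself is not a nesting conflict
        if existing in cand.parents or cand in existing.parents:
            return False
    return True
```

For example, with `/a` already snapshottable, `/a/b` is rejected while the sibling `/ab` is accepted.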
Snapshot Paths
For a snapshottable directory, the path component ".snapshot" is used for accessing its snapshots. Suppose /foo is a snapshottable directory, /foo/bar is a file/directory in /foo, and /foo has a snapshot s0. Then, the path
/foo/.snapshot/s0/bar
refers to the snapshot copy of /foo/bar. The usual API and CLI can work with the ".snapshot" paths. The following are some examples.
- Listing all the snapshots under a snapshottable directory:
hdfs dfs -ls /foo/.snapshot
- Listing the files in snapshot s0:
hdfs dfs -ls /foo/.snapshot/s0
- Copying a file from snapshot s0:
hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp
Note that this example uses the preserve option to preserve timestamps, ownership, permission, ACLs and XAttrs.
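The mapping from a live path to its snapshot copy follows directly from the ".snapshot" convention. As a rough sketch (the helper name `snapshot_path` is an assumption for illustration only):

```python
from pathlib import PurePosixPath


def snapshot_path(snapshottable_dir, snapshot_name, live_path):
    """Build the ".snapshot" path for live_path inside the named snapshot."""
    root = PurePosixPath(snapshottable_dir)
    # relative_to raises ValueError if live_path is not under the directory.
    relative = PurePosixPath(live_path).relative_to(root)
    return str(root / ".snapshot" / snapshot_name / relative)
```

With the values from this section, `snapshot_path("/foo", "s0", "/foo/bar")` yields `/foo/.snapshot/s0/bar`, which can then be passed to the usual commands.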
Upgrading to a version of HDFS with snapshots
The HDFS snapshot feature introduces a new reserved path name used to interact with snapshots: .snapshot. When upgrading from an older version of HDFS, existing paths named .snapshot need to first be renamed or deleted to avoid conflicting with the reserved path. See the upgrade section in the HDFS user guide for more information.
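Before upgrading, it can help to scan an existing path listing for the reserved name. The sketch below assumes the listing is already available as a list of path strings (for example, collected from a recursive listing); `conflicting_paths` is a hypothetical helper, not an HDFS tool.

```python
from pathlib import PurePosixPath

RESERVED = ".snapshot"  # the reserved path component named in this section


def conflicting_paths(paths):
    """Return the paths that contain a component named exactly ".snapshot"."""
    return [p for p in paths if RESERVED in PurePosixPath(p).parts]
```

Only exact component matches are reported, so a name such as `.snapshots` is not flagged.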
Snapshot Operations
Administrator Operations
The operations described in this section require superuser privilege.
Allow Snapshots
Allow snapshots of a directory to be created. If the operation completes successfully, the directory becomes snapshottable.
- Command:
hdfs dfsadmin -allowSnapshot <path>
- Arguments:
path | The path of the snapshottable directory.
See also the corresponding Java API void allowSnapshot(Path path) in HdfsAdmin.
Disallow Snapshots
Disallow snapshots of a directory from being created. All snapshots of the directory must be deleted before disallowing snapshots.
- Command:
hdfs dfsadmin -disallowSnapshot <path>
- Arguments:
path | The path of the snapshottable directory.
See also the corresponding Java API void disallowSnapshot(Path path) in HdfsAdmin.
User Operations
This section describes user operations. Note that the HDFS superuser can perform all of these operations without satisfying the permission requirements of the individual operations.
Create Snapshots
Create a snapshot of a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -createSnapshot <path> [<snapshotName>]
- Arguments:
path | The path of the snapshottable directory.
snapshotName | The snapshot name, which is an optional argument. When it is omitted, a default name is generated using a timestamp with the format "'s'yyyyMMdd-HHmmss.SSS", e.g. "s20130412-151029.033".
See also the corresponding Java API Path createSnapshot(Path path) and Path createSnapshot(Path path, String snapshotName) in FileSystem. The snapshot path is returned in these methods.
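The default name format "'s'yyyyMMdd-HHmmss.SSS" is a literal "s" followed by the date, time, and three-digit milliseconds. A small sketch that reproduces the same shape in Python (the function `default_snapshot_name` is illustrative only; HDFS generates the name itself):

```python
from datetime import datetime


def default_snapshot_name(when):
    """Format a datetime like the default snapshot name pattern."""
    millis = when.microsecond // 1000  # SSS: milliseconds, zero-padded to 3 digits
    return when.strftime("s%Y%m%d-%H%M%S") + f".{millis:03d}"
```

For the timestamp 2013-04-12 15:10:29.033 this produces `s20130412-151029.033`, matching the example name above.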
Delete Snapshots
Delete a snapshot from a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -deleteSnapshot <path> <snapshotName>
- Arguments:
path | The path of the snapshottable directory.
snapshotName | The snapshot name.
See also the corresponding Java API void deleteSnapshot(Path path, String snapshotName) in FileSystem.
Rename Snapshots
Rename a snapshot. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -renameSnapshot <path> <oldName> <newName>
- Arguments:
path | The path of the snapshottable directory.
oldName | The old snapshot name.
newName | The new snapshot name.
See also the corresponding Java API void renameSnapshot(Path path, String oldName, String newName) in FileSystem.
Get Snapshottable Directory Listing
Get all the snapshottable directories where the current user has permission to take snapshots.
- Command:
hdfs lsSnapshottableDir
- Arguments: none
See also the corresponding Java API SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing() in DistributedFileSystem.
Get Snapshots Difference Report
Get the differences between two snapshots. This operation requires read access privilege for all files/directories in both snapshots.
- Command:
hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
- Arguments:
path | The path of the snapshottable directory.
fromSnapshot | The name of the starting snapshot.
toSnapshot | The name of the ending snapshot.
- Results:
+ | The file/directory has been created.
- | The file/directory has been deleted.
M | The file/directory has been modified.
R | The file/directory has been renamed.
A RENAME entry indicates that a file/directory has been renamed but is still under the same snapshottable directory. A file/directory is reported as deleted if it was renamed to a location outside the snapshottable directory. A file/directory renamed from outside the snapshottable directory is reported as newly created.
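The three rename cases can be summarised as a small decision function. This is a sketch of the rule as stated above, not of the actual diff computation; `rename_entry` and `_inside` are hypothetical names.

```python
from pathlib import PurePosixPath


def _inside(root, path):
    # True when path is a strict descendant of root.
    return PurePosixPath(root) in PurePosixPath(path).parents


def rename_entry(snapshottable_dir, old_path, new_path):
    """Return the report symbol for a rename, or None if it is not reported."""
    was_inside = _inside(snapshottable_dir, old_path)
    is_inside = _inside(snapshottable_dir, new_path)
    if was_inside and is_inside:
        return "R"  # renamed within the snapshottable directory
    if was_inside:
        return "-"  # moved out: reported as deleted
    if is_inside:
        return "+"  # moved in: reported as newly created
    return None     # entirely outside the snapshottable directory
```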
The snapshot difference report does not guarantee the same operation sequence. For example, if we rename the directory "/foo" to "/foo2", and then append new data to the file "/foo2/bar", the difference report will be:
R. /foo -> /foo2
M. /foo/bar
That is, changes to files/directories under a renamed directory are reported using the original path before the rename ("/foo/bar" in the above example).
See also the corresponding Java API SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot) in DistributedFileSystem.
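Because entries under a renamed directory use the pre-rename path, a consumer of the report may want to translate them to the current location. The helper below is a minimal sketch under the assumption that the rename pairs have already been extracted from the R entries and that each path is affected by at most one of them; `current_path` is a hypothetical name.

```python
def current_path(reported_path, renames):
    """Map a reported (pre-rename) path to its post-rename location.

    renames: list of (old_dir, new_dir) pairs taken from R entries.
    """
    for old, new in renames:
        old = old.rstrip("/")
        if reported_path == old or reported_path.startswith(old + "/"):
            return new.rstrip("/") + reported_path[len(old):]
    return reported_path
```

With the report above, `current_path("/foo/bar", [("/foo", "/foo2")])` gives `/foo2/bar`, the file that was actually appended to.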