Full restore in Greenplum with gpdbrestore

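The gpdbrestore command is a wrapper around gp_restore that offers more flexible options, for example restoring from the files produced by automated gpcrondump backups. Restoring with gpdbrestore requires: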

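1. Backup files generated by a gpcrondump run exist.
2. The GPDB system is running.
3. The GPDB system being restored has the same number of instances as the GPDB system that was backed up with gp_dump.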
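
The first prerequisite can be checked before a restore is started. What follows is a minimal sketch, not part of Greenplum: `has_dump_files` is a hypothetical helper, and it assumes that dump file names contain the 14-digit timestamp key, as the file names in the restore log later in this post do.

```shell
# Hypothetical helper: succeed if the given dump directory holds at
# least one regular file whose name contains the timestamp key.
has_dump_files() {
  dir="$1"
  ts="$2"
  [ -d "$dir" ] || return 1
  for f in "$dir"/*"$ts"*; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -f "$f" ]; then
      return 0
    fi
  done
  return 1
}

# The other two prerequisites need a live system. One possible check
# (not run here) counts the primary segments, which also proves that
# the database accepts connections:
#   psql -d postgres -Atc "SELECT count(*) FROM gp_segment_configuration WHERE content >= 0 AND role = 'p'"
```

The segment count printed by the commented query can be compared by hand with the count from the system the backup was taken on.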

Commonly used gpdbrestore options

-a (do not prompt) 

 Do not prompt the user for confirmation.

-b <YYYYMMDD> 

 Looks for dump files in the segment data directories on the Greenplum
 Database array of hosts in db_dumps/<YYYYMMDD>.

-d <master_data_directory>

 Optional. The master host data directory. If not specified, the value
 set for $MASTER_DATA_DIRECTORY will be used. 

-e (drop target database before restore) 

 Drops the target database before doing the restore and then recreates
 it. 

-G [include|only]

 Restores global objects such as roles and tablespaces if the global
 object dump file db_dumps/<date>/gp_global_1_1_<timestamp> is found in
 the master data directory.

 Specify either "-G only" to only restore the global objects dump file
 or "-G include" to restore global objects along with a normal restore.
 Defaults to "include" if neither argument is provided.

-l <logfile_directory>

 The directory to write the log file. Defaults to ~/gpAdminLogs. 

-m (restore metadata only)

 Performs a restore of database metadata (schema and table definitions, SET
 statements, and so forth) without restoring data.  If the --restore-stats or
 -G options are provided as well, statistics or globals will also be restored.

 The --noplan and --noanalyze options are not supported in conjunction with
 this option, as they affect the restoration of data and no data is restored.

--prefix <prefix_string> 

 If you specified the gpcrondump option --prefix <prefix_string> to create
 the backup, you must specify this option with the <prefix_string> when
 restoring the backup. 

 If you created a full backup of a set of tables with gpcrondump and
 specified a prefix, you can use gpcrondump with the options
 --list-filter-tables and --prefix <prefix_string> to list the tables
 that were included or excluded for the backup. 

--restore-stats [include|only]

 Restores optimizer statistics if the statistics dump file
 db_dumps/<date>/gp_statistics_1_1_<timestamp> is found in the master data
 directory. Setting this option automatically skips the final analyze step,
 so it is not necessary to also set the --noanalyze flag in conjunction with
 this one.

-t <timestamp_key>

 The 14 digit timestamp key that uniquely identifies a backup set of data
 to restore. It is of the form YYYYMMDDHHMMSS. Looks for dump files
 matching this timestamp key in the segment data directories db_dumps
 directory on the Greenplum Database array of hosts. 
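
 The relationship between the -t key and the dated directory described under -b can be sketched as follows. This is an illustration only: `dump_subdir_for_key` is a hypothetical helper, not a Greenplum utility.

```shell
# Hypothetical helper: validate a 14-digit YYYYMMDDHHMMSS key and print
# the db_dumps/<YYYYMMDD> subdirectory that it implies.
dump_subdir_for_key() {
  ts="$1"
  case "$ts" in
    ""|*[!0-9]*)
      echo "timestamp key must contain digits only: $ts" >&2
      return 1
      ;;
  esac
  if [ "${#ts}" -ne 14 ]; then
    echo "timestamp key must be 14 digits long: $ts" >&2
    return 1
  fi
  # The first eight digits are the date part of the key.
  echo "db_dumps/$(printf '%s' "$ts" | cut -c1-8)"
}
```

 Calling `dump_subdir_for_key 20160713160238` prints `db_dumps/20160713`; a key that is too short or contains non-digits is rejected with a message on stderr.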

-T <schema>.<table_name>

 Table names to restore, specify multiple times for multiple tables. The
 named table(s) must exist in the backup set of the database being restored.
 Existing tables are not automatically truncated before data is restored
 from backup. If your intention is to replace existing data in the table
 from backup, truncate the table prior to running gpdbrestore -T. 

-S <schema>

 Schema names to restore, specify multiple times for multiple schemas.
 Existing tables are not automatically truncated before data is restored
 from backup. If your intention is to replace existing data in the table
 from backup, truncate the table prior to running gpdbrestore -S. 

--truncate

 Truncate table data before restoring data to the table from the backup.
 This option is supported only when restoring a set of tables with the
 option -T or --table-file.
 This option is not supported with the -e option.
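
 A table-level restore repeats -T once per table and may add --truncate. The sketch below only prints a candidate command line for review and runs nothing; `table_restore_cmd` is a hypothetical helper, and options such as --prefix or -u still have to be added by hand when the backup was made with them.

```shell
# Hypothetical helper: print a gpdbrestore command line that truncates
# and restores the listed tables from the backup set <timestamp_key>.
table_restore_cmd() {
  ts="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "need at least one <schema>.<table_name>" >&2
    return 1
  fi
  # -e is deliberately left out: it is not supported with --truncate.
  cmd="gpdbrestore -a -t $ts --truncate"
  for tbl in "$@"; do
    cmd="$cmd -T $tbl"
  done
  echo "$cmd"
}
```

 For example, `table_restore_cmd 20160713160238 public.lottu01 public.lottu02` prints `gpdbrestore -a -t 20160713160238 --truncate -T public.lottu01 -T public.lottu02`.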

-u <backup_directory> 

 Specifies the absolute path to the directory containing the db_dumps
 directory on each host. If not specified, defaults to the data directory
 of each instance to be backed up. Specify this option if you specified a
 backup directory with the gpcrondump option -u when creating a backup
 set. 

 If <backup_directory> is not writable, backup operation report status
 files are written to segment data directories. You can specify a
 different location where report status files are written with the
 --report-status-dir option.

A first restore, using the backup taken with the command from the previous post

1. Drop a table to simulate data loss

[gpadmin@mdw ~]$ psql lottu gpadmin
psql ()
Type "help" for help.

lottu=# \dt
                    List of relations
 Schema |        Name        | Type  |  Owner  | Storage
--------+--------------------+-------+---------+---------
 public | gpcrondump_history | table | gpadmin | heap
 public | lottu01            | table | gpadmin | heap
 public | lottu02            | table | gpadmin | heap
( rows)

lottu=# drop table lottu01;
DROP TABLE
lottu=# \q
[gpadmin@mdw ~]$ psql lottu gpadmin
psql ()
Type "help" for help.

lottu=# select * from lottu01;
ERROR:  relation "lottu01" does not exist
LINE : select * from lottu01

2. Run the restore

gpdbrestore -a -e --prefix lottu -u /home/gpadmin/backup --restore-stats include --report-status-dir /home/gpadmin/backup -t 20160713160238

[gpadmin
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Starting gpdbrestore with args: -a -e --prefix lottu -u /home/gpadmin/backup --restore-stats include --report-status-dir /home/gpadmin/backup -t
:::: gpdbrestore:mdw:gpadmin-[INFO]:-------------------------------------------
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Greenplum database restore parameters
:::: gpdbrestore:mdw:gpadmin-[INFO]:-------------------------------------------
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Restore type               = Full Database
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Database to be restored    = lottu
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Drop and re-create db      = On
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Restore method             = Restore specific timestamp
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Restore timestamp          =
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Restore compressed dump    = On
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Restore global objects     = Off
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Array fault tolerance      = f
:::: gpdbrestore:mdw:gpadmin-[INFO]:-------------------------------------------
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Dropping Database lottu
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Dropped Database lottu
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Invoking sql file: /home/gpadmin/backup/db_dumps//lottu_gp_cdatabase_1_1_20160713160238
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Creating gp_toolkit schema for database "lottu"
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Adding --prefix
:::: gpdbrestore:mdw:gpadmin-[INFO]:-gp_restore commandline: gp_restore -i -h mdw -p  -U gpadmin --gp-i --prefix=lottu_ --gp-k= --gp-l=p --gp-d=/home/gpadmin/backup/db_dumps/ --gp-r=/home/gpadmin/backup --status=/home/gpadmin/backup --gp-c -d "lottu":
:::: gpdbrestore:mdw:gpadmin-[WARNING]:-gpdbrestore finished but ERRORS were found, please check the restore report file for details
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Updating AO/CO statistics on master
:::: gpdbrestore:mdw:gpadmin-[INFO]:-No AO/CO tables restored, skipping statistics update...
:::: gpdbrestore:mdw:gpadmin-[INFO]:-Commencing restore of statistics

Note the [WARNING] in this output. Ignore it for now; it is discussed below.

3. Verify that the data was restored

[gpadmin@mdw ~]$ psql lottu gpadmin
psql ()
Type "help" for help.

lottu=# select * from lottu01;
 id |  name
----+---------
   | lottu1
   | lottu3
   | lottu5
   | lottu7
   | lottu9
   | lottu2
   | lottu4
   | lottu6
   | lottu8
  | lottu10
( rows)

lottu=# \dt
              List of relations
 Schema |  Name   | Type  |  Owner  | Storage
--------+---------+-------+---------+---------
 public | lottu01 | table | gpadmin | heap
 public | lottu02 | table | gpadmin | heap
( rows)

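Step 2 dropped and recreated the database. Verification shows that the table data matches what was there before, so the data was restored correctly. The table gpcrondump_history is gone, however, and step 2 reported "[WARNING]:-gpdbrestore finished but ERRORS were found, please check the restore report file for details".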

The restore log shows where the error occurred: "ERROR: constraint "lottu01_pkey" does not exist".
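
One way to locate such lines is to search the files in the directory that was passed to --report-status-dir. This is a minimal sketch under stated assumptions: `list_restore_errors` is a hypothetical helper, and it assumes the report files are plain text and sit directly in that directory.

```shell
# Hypothetical helper: print every line containing ERROR from the
# regular files directly under the given directory, as file:line:text.
list_restore_errors() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 1
  fi
  for f in "$dir"/*; do
    # Skip subdirectories such as db_dumps.
    if [ -f "$f" ]; then
      # The extra /dev/null argument makes grep print the file name.
      grep -n "ERROR" "$f" /dev/null || true
    fi
  done
  return 0
}
```

With the restore command used above, the call would be `list_restore_errors /home/gpadmin/backup`.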

lottu=# \d lottu01
           Table "public.lottu01"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 id     | integer               | not null
 name   ) |
Indexes:
    "lottu01_pkey" PRIMARY KEY, btree (id)
Distributed by: (id)

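After the restore, lottu01 does have its primary key constraint. Is the error a false alarm? It looks like a bug. If a constraint really were missing after a restore, you could run the *_post_data file manually.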
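
Running the *_post_data file by hand could look like the sketch below. The restore log above reports "Restore compressed dump = On", so the file may be gzip-compressed. `post_data_reader` is a hypothetical helper, and `POST_DATA_FILE` is a placeholder for whichever *_post_data file the backup actually produced.

```shell
# Hypothetical helper: print the command that reads a dump file,
# depending on whether its name ends in .gz.
post_data_reader() {
  case "$1" in
    *.gz) echo "gunzip -c" ;;
    *)    echo "cat" ;;
  esac
}

# Usage (not run here; POST_DATA_FILE is a placeholder path):
#   $(post_data_reader "$POST_DATA_FILE") "$POST_DATA_FILE" | psql -d lottu
```

Reviewing the file's contents before piping it into psql is prudent, since statements for objects that already exist will fail.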

Important notes

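1. The gpdbrestore -e option runs DROP DATABASE and then CREATE DATABASE before the restore. If the target environment does not have the database, leave -e out, otherwise the restore reports an error. Do not use -e for table-level restores either.
2. If gpcrondump was run with the -C option, the restore runs DROP TABLE before it recreates each table.
3. If gpcrondump was run without -C and you want to clear existing data before the restore, use the gpdbrestore --truncate option. --truncate works only in table-level restore mode, that is, together with -T or --table-file.
4. Greenplum does not allow template databases to be dropped, so restoring a template database with -e fails. A workaround is to modify the gpcrondump code to special-case template databases, for example cleaning them with DROP SCHEMA and skipping the DROP DATABASE and CREATE DATABASE errors.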

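In summary, the options to pass to gpdbrestore depend on the options that were chosen for the gpcrondump backup. This approach can also be used for database migration, but only between databases configured with the same architecture.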

Reference: https://yq.aliyun.com/articles/30331?spm=5176.8067842.tagmain.48.etfAn9
