[Preface]

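Segment detection and failover mechanism
The GP Master probes each Segment's Primary first; if the Primary is unreachable, it then probes the Mirror. A Primary/Mirror pair can be in one of four states:
1. Primary up, Mirror up. The probe of the Primary succeeds, so the GP Master returns immediately and moves on to the next Segment.
2. Primary up, Mirror down. The probe of the Primary succeeds, and the status returned by the Primary shows that the Mirror is down (the Primary detects the Mirror failure itself and switches to ChangeTracking mode). The GP Master updates its metadata and moves on to the next Segment.
3. Primary down, Mirror up. The probe of the Primary fails, so the GP Master probes the Mirror and finds it alive. The GP Master updates its metadata, has the Mirror take over as Primary (failover), and moves on to the next Segment.
4. Primary down, Mirror down. The probe of the Primary fails, so the GP Master probes the Mirror, which is also down. It retries until the maximum retry count is reached, then ends the probe of this Segment without updating the Master metadata and moves on to the next Segment.
Cases 2-4 above require running gprecoverseg to recover the segment.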
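
The four cases above reduce to a small decision table. The following is a minimal Python sketch of that table, not Greenplum's actual fault-detection code. The names `Action`, `decide`, and `needs_gprecoverseg` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Action(Enum):
    """What the master does after probing one Primary/Mirror pair (illustrative)."""
    NONE = "no change"
    MARK_MIRROR_DOWN = "update master metadata: mirror is down"
    FAILOVER = "update master metadata and promote the mirror"
    GIVE_UP = "retries exhausted, master metadata left unchanged"


def decide(primary_up: bool, mirror_up: bool) -> Action:
    """Map the probe results for one segment to the master's reaction."""
    if primary_up:
        # The mirror's state is learned from the primary's reply,
        # so the mirror itself is not probed in this branch.
        return Action.NONE if mirror_up else Action.MARK_MIRROR_DOWN
    # Primary unreachable: the master falls back to probing the mirror.
    return Action.FAILOVER if mirror_up else Action.GIVE_UP


def needs_gprecoverseg(action: Action) -> bool:
    """Cases 2-4 in the list above call for gprecoverseg."""
    return action is not Action.NONE
```

Only the first case (both instances up) leaves nothing to repair; every other outcome ends with a segment that has to be recovered later.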

Segment instances that are marked as failed are simply skipped at startup.

  [gpadmin@mdw ~]$ gpstart
  :::: gpstart:mdw:gpadmin-[INFO]:-Starting gpstart with args:
  :::: gpstart:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
  :::: gpstart:mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.8.1 build 1'
  :::: gpstart:mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '
  :::: gpstart:mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
  :::: gpstart:mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
  :::: gpstart:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gpstart:mdw:gpadmin-[INFO]:-Setting new master era
  :::: gpstart:mdw:gpadmin-[INFO]:-Master Started...
  :::: gpstart:mdw:gpadmin-[INFO]:-Shutting down master
  :::: gpstart:mdw:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on sdw2 directory /home/gpadmin/gpdata/gpdatam/gpseg0 <<<<<
  :::: gpstart:mdw:gpadmin-[INFO]:---------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:-Master instance parameters
  :::: gpstart:mdw:gpadmin-[INFO]:---------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:-Database = template1
  :::: gpstart:mdw:gpadmin-[INFO]:-Master Port =
  :::: gpstart:mdw:gpadmin-[INFO]:-Master directory = /home/gpadmin/gpdata/pgmaster/gpseg-
  :::: gpstart:mdw:gpadmin-[INFO]:-Timeout = seconds
  :::: gpstart:mdw:gpadmin-[INFO]:-Master standby = Off
  :::: gpstart:mdw:gpadmin-[INFO]:---------------------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:-Segment instances that will be started
  :::: gpstart:mdw:gpadmin-[INFO]:---------------------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:- Host Datadir Port Role
  :::: gpstart:mdw:gpadmin-[INFO]:- sdw1 /home/gpadmin/gpdata/gpdatap/gpseg0 Primary
  :::: gpstart:mdw:gpadmin-[INFO]:- sdw2 /home/gpadmin/gpdata/gpdatap/gpseg1 Primary
  :::: gpstart:mdw:gpadmin-[INFO]:- sdw1 /home/gpadmin/gpdata/gpdatam/gpseg1 Mirror

  Continue with Greenplum instance startup Yy|Nn (default=N):
  > y
  :::: gpstart:mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
  ...........
  :::: gpstart:mdw:gpadmin-[INFO]:-Process results...
  :::: gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:- Successful segment starts =
  :::: gpstart:mdw:gpadmin-[INFO]:- Failed segment starts =
  :::: gpstart:mdw:gpadmin-[WARNING]:-Skipped segment starts (segments are marked down in configuration) = <<<<<<<<
  :::: gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
  :::: gpstart:mdw:gpadmin-[INFO]:-
  :::: gpstart:mdw:gpadmin-[INFO]:-Successfully started of segment instances, skipped other segments
  :::: gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
  :::: gpstart:mdw:gpadmin-[WARNING]:-****************************************************************************
  :::: gpstart:mdw:gpadmin-[WARNING]:-There are segment(s) marked down in the database
  :::: gpstart:mdw:gpadmin-[WARNING]:-To recover from this current state, review usage of the gprecoverseg
  :::: gpstart:mdw:gpadmin-[WARNING]:-management utility which will recover failed segment instance databases.
  :::: gpstart:mdw:gpadmin-[WARNING]:-****************************************************************************
  :::: gpstart:mdw:gpadmin-[INFO]:-Starting Master
  :::: gpstart:mdw:gpadmin-[INFO]:-Command pg_ctl reports Master mdw instance active
  :::: gpstart:mdw:gpadmin-[INFO]:-No standby master configured. skipping...
  :::: gpstart:mdw:gpadmin-[WARNING]:-Number of segments
  :::: gpstart:mdw:gpadmin-[INFO]:-Check status of database with gpstate utility

Check the startup status of the database's mirror instances

  [gpadmin@mdw ~]$ gpstate -m
  :::: gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -m
  :::: gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.8.1 build 1'
  :::: gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56'
  :::: gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
  :::: gpstate:mdw:gpadmin-[INFO]:--Current GPDB mirror list and status
  :::: gpstate:mdw:gpadmin-[INFO]:--Type = Spread
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
  :::: gpstate:mdw:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
  :::: gpstate:mdw:gpadmin-[WARNING]:-sdw2 /home/gpadmin/gpdata/gpdatam/gpseg0 Failed <<<<<<<<
  :::: gpstate:mdw:gpadmin-[INFO]:- sdw1 /home/gpadmin/gpdata/gpdatam/gpseg1 Passive Synchronized
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
  :::: gpstate:mdw:gpadmin-[WARNING]:- segment(s) configured as mirror(s) have failed

The failed mirror is easy to spot: "[WARNING]:-sdw2 /home/gpadmin/gpdata/gpdatam/gpseg0 50000 Failed"
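
When a cluster has many segments, scanning this output by eye gets tedious. Below is a rough Python sketch that pulls failed mirror rows out of captured `gpstate -m` text. It assumes the line layout shown in the transcript above (a `-[LEVEL]:-` prefix, then host, data directory, and status fields); the function name `failed_mirrors` and the `SAMPLE` constant are hypothetical, and the real output format may differ between Greenplum versions.

```python
import re
from typing import List, Tuple

# Captures the message that follows the "-[LEVEL]:-" prefix of a log line.
LINE_RE = re.compile(r"-\[(?P<level>[A-Z]+)\]:-\s*(?P<msg>.*)$")

# Three lines copied from the transcript above (timestamps and ports are elided there).
SAMPLE = """\
:::: gpstate:mdw:gpadmin-[WARNING]:-sdw2 /home/gpadmin/gpdata/gpdatam/gpseg0 Failed <<<<<<<<
:::: gpstate:mdw:gpadmin-[INFO]:- sdw1 /home/gpadmin/gpdata/gpdatam/gpseg1 Passive Synchronized
:::: gpstate:mdw:gpadmin-[WARNING]:- segment(s) configured as mirror(s) have failed
"""


def failed_mirrors(gpstate_output: str) -> List[Tuple[str, str]]:
    """Return (host, datadir) for every mirror row whose status is Failed."""
    failed = []
    for line in gpstate_output.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        fields = match.group("msg").split()
        # Assumption: a mirror row is "host /absolute/datadir [port] status ...".
        if len(fields) >= 3 and fields[1].startswith("/") and "Failed" in fields[2:]:
            failed.append((fields[0], fields[1]))
    return failed
```

Calling `failed_mirrors(SAMPLE)` yields the single failed mirror on sdw2; the summary warning line is ignored because its second field is not a path.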

How do we recover this mirror segment? A failed primary segment is recovered in exactly the same way.

1. First, generate a recovery configuration file: gprecoverseg -o ./recov

  [gpadmin@mdw ~]$ gprecoverseg -o ./recov
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -o ./recov
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.8.1 build 1'
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56'
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Checking if segments are ready
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Configuration file output to ./recov successfully.

2. Inspect the recovery configuration file to see which segments need to be recovered

  [gpadmin@mdw ~]$ cat recov
  filespaceOrder=fastdisk
  sdw2::/home/gpadmin/gpdata/gpdatam/gpseg0
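
To review such a file programmatically before running the recovery, something like the following Python sketch could be used. It assumes each segment line has the form `host:port:datadir`, optionally followed by a space and a separate recovery target, which matches the sample above where the port field is empty. The names `FailedSegment` and `parse_recovery_config` are hypothetical, and skipping `#` lines is an assumption rather than something the sample shows.

```python
from typing import List, NamedTuple, Optional


class FailedSegment(NamedTuple):
    host: str
    port: Optional[int]  # None when the port field is empty, as in the sample
    datadir: str


def parse_recovery_config(text: str) -> List[FailedSegment]:
    """Extract the failed-segment entries from a gprecoverseg -o style file."""
    segments = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments (assumed), and the filespace ordering header.
        if not line or line.startswith("#") or line.startswith("filespaceOrder="):
            continue
        # Only the failed-segment part (before any space) is parsed here.
        failed_part = line.split()[0]
        host, port, datadir = failed_part.split(":", 2)
        segments.append(FailedSegment(host, int(port) if port else None, datadir))
    return segments
```

Fed the two lines shown above, the parser returns one entry for sdw2 with the data directory /home/gpadmin/gpdata/gpdatam/gpseg0.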

3. Run the recovery with this configuration file: gprecoverseg -i ./recov

  [gpadmin@mdw ~]$ gprecoverseg -i ./recov
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -i ./recov
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.8.1 build 1'
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56'
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Checking if segments are ready
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
  :::: gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Recovery from configuration -i option supplied
  :::: gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Recovery of
  :::: gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Synchronization mode = Incremental
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance host = sdw2
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance address = sdw2
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance directory = /home/gpadmin/gpdata/gpdatam/gpseg0
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance port =
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance replication port =
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance fastdisk directory = /data/gpdata/seg1/pg_mir_cdr/gpseg0
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance host = sdw1
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance address = sdw1
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance directory = /home/gpadmin/gpdata/gpdatap/gpseg0
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance port =
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance replication port =
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance fastdisk directory = /data/gpdata/seg1/pg_pri_cdr/gpseg0
  :::: gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target = in-place
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Done updating primaries
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
  :::: gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************

4. Check the recovery status

  [gpadmin@mdw ~]$ gpstate -m
  :::: gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -m
  :::: gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.8.1 build 1'
  :::: gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56'
  :::: gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
  :::: gpstate:mdw:gpadmin-[INFO]:--Current GPDB mirror list and status
  :::: gpstate:mdw:gpadmin-[INFO]:--Type = Spread
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------
  :::: gpstate:mdw:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
  :::: gpstate:mdw:gpadmin-[INFO]:- sdw2 /home/gpadmin/gpdata/gpdatam/gpseg0 Passive Resynchronizing
  :::: gpstate:mdw:gpadmin-[INFO]:- sdw1 /home/gpadmin/gpdata/gpdatam/gpseg1 Passive Synchronized
  :::: gpstate:mdw:gpadmin-[INFO]:--------------------------------------------------------------

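5. With the previous step done, the primary/mirror pair is recovered. One more step remains, and it is optional.
Decide whether to swap the primary and mirror roles back: after a failover, the instances are running in roles opposite to their preferred roles.
To swap them back, use the following command; it stops the database while it does so.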

  gprecoverseg -r
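
Whether this optional step is needed can be read from the segment catalog. The Python sketch below assumes rows taken from the `gp_segment_configuration` catalog table with its `content`, `role`, `preferred_role`, `mode`, and `status` columns, and it assumes that a rebalance should wait until every segment is up and synchronized. The function name `rebalance_status`, its return strings, and the sample rows are hypothetical.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def rebalance_status(rows: List[Row]) -> str:
    """Classify the cluster as 'not ready', 'rebalance needed', or 'balanced'."""
    # The master has content = -1; only real segments matter here.
    segments = [r for r in rows if int(r["content"]) >= 0]
    # status 'u' means up, mode 's' means synchronized.
    if any(r["status"] != "u" or r["mode"] != "s" for r in segments):
        return "not ready"
    # A segment is out of place when its current role differs from its preferred one.
    if any(r["role"] != r["preferred_role"] for r in segments):
        return "rebalance needed"
    return "balanced"


# Invented rows: content 0 has failed over and has since been recovered.
SAMPLE_ROWS: List[Row] = [
    {"content": -1, "role": "p", "preferred_role": "p", "mode": "s", "status": "u"},
    {"content": 0, "role": "m", "preferred_role": "p", "mode": "s", "status": "u"},
    {"content": 0, "role": "p", "preferred_role": "m", "mode": "s", "status": "u"},
]
```

With these sample rows the function reports that a rebalance is needed; if any row were still resynchronizing, it would report "not ready" instead.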

[Summary]

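gprecoverseg is the utility for repairing Segments. It is simple to use; its few main options are:
 -i : The main option. Specifies a configuration file that describes which Segments to repair and where to place them after the repair.
 -F : Optional. When given, gprecoverseg deletes the instances listed in the "-i" file, or those marked "d", and copies a complete fresh copy from the surviving Mirror to the target location.
 -r : When FTS detects a failed Primary and fails over to the Mirror, the Mirror acting as Primary does not switch back right after gprecoverseg repairs the segment. Some hosts then carry too many active Segments, which becomes a performance bottleneck. The Segments therefore need to be returned to their original roles; this is called re-balance.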
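
The options above combine into a short command sequence. The Python sketch below only builds the argument lists and does not execute anything. It uses just the flags covered in this article (`-o`, `-i`, `-F`, `-r`); the function name `recovery_commands` and its parameters are hypothetical.

```python
from typing import List


def recovery_commands(config_path: str, full: bool = False,
                      rebalance: bool = False) -> List[List[str]]:
    """Return the gprecoverseg invocations for one recovery run, in order."""
    # Step 1: write the list of failed segments to a configuration file.
    commands = [["gprecoverseg", "-o", config_path]]
    # Step 3: recover from that file, as a full copy when requested.
    recover = ["gprecoverseg", "-i", config_path]
    if full:
        recover.append("-F")
    commands.append(recover)
    # Step 5 (optional): return segments to their original roles.
    # This should only be run once resynchronization has finished.
    if rebalance:
        commands.append(["gprecoverseg", "-r"])
    return commands
```

Keeping the commands as plain lists makes it easy to review them, or to log them, before handing each one to a process runner.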
