KingbaseES R6 Cluster: Manually Configuring a repmgr Witness

Using a witness server:

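A witness server is an ordinary KingbaseES instance that is not part of the streaming replication cluster. Its purpose, in a failover situation, is to provide evidence that the primary server itself is unavailable, rather than, for example, that the network between different physical locations has split. The typical use case is a two-node streaming replication setup in which the primary and standby servers are in different locations (data centers). With a witness server created in the same location (data center) as the primary, a standby that loses the primary can decide whether it can promote itself without risking a "split-brain" situation: if it can see neither the witness nor the primary, there is probably a network-level outage and it should not promote itself; if it can see the witness but not the primary, this shows that there is no network outage and the primary itself is unavailable, so it can promote itself.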
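The reasoning above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not repmgrd's actual election code: `decide_promotion` and its `yes`/`no` arguments are hypothetical names introduced only for this example.

```bash
#!/usr/bin/env bash
# Simplified illustration only; repmgrd's real election logic is more involved.
# decide_promotion is a hypothetical helper, not a repmgr command.
# Usage: decide_promotion <primary_visible: yes|no> <witness_visible: yes|no>
decide_promotion() {
    local primary_visible="$1" witness_visible="$2"

    if [ "$primary_visible" = "yes" ]; then
        # Primary is reachable: no failover is needed.
        echo "stay-standby"
    elif [ "$witness_visible" = "yes" ]; then
        # Witness reachable, primary not: the primary itself is down.
        echo "promote"
    else
        # Neither reachable: likely a network outage on the standby's side.
        echo "do-not-promote"
    fi
}

decide_promotion no yes   # prints: promote
decide_promotion no no    # prints: do-not-promote
```

The point of the third branch is that a standby cut off from both the primary and the witness cannot tell a dead primary from its own isolation, so staying a standby is the safe choice.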

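For more complex replication schemes, such as those spanning multiple data centers, location-based failover is preferable; it ensures that only nodes in the same location as the primary can be promoted to primary.

To create a witness server, set up an ordinary KingbaseES instance on a server in the same physical location as the cluster's primary server. Do not create the witness on the same physical host as the primary; otherwise, if the primary fails because of a hardware problem, the witness fails with it.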

Database version:

test=# select version();

                                                       version

----------------------------------------------------------------------------------------------------------------------

 KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

(1 row)

Original repmgr cluster topology:

[kingbase@node2 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

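I. Create the witness server

=Note: the witness server should be a standalone host; it must not share a host with the primary or a standby. The witness does not take part in streaming replication with the other hosts, so it is an independent primary instance whose database system ID must differ from that of the other databases. Create it with its own initdb; it must not be produced by clone or copy.=

1) Initialize the instance (on node2)

=Copy the software installation files from another cluster node to the witness node, then initialize a new instance.=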

 [kingbase@node2 bin]$ ./initdb -D /home/kingbase/cluster/R6HA/KHA/kingbase/data -E utf8 -U system -W

......

Configure the repmgr extension:

Start the database service:

[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6HA/KHA/kingbase/data start

......

server started

2) Create the repmgr metadata database and schema

[kingbase@node2 bin]$ ./ksql -U system test

ksql (V8.0)

Type "help" for help.

# Create the esrep user

test=# create user esrep with superuser;

CREATE ROLE

test=# alter user esrep with password 'Kingbaseha110';

ALTER ROLE

# Create the esrep database

test=# create database esrep owner esrep;

CREATE DATABASE

test=# \c esrep esrep

You are now connected to database "esrep" as user "esrep".

esrep=# \d

              List of relations

 Schema |        Name         | Type | Owner

--------+---------------------+------+--------

 public | sys_stat_statements | view | system

(1 row)

# Create the repmgr schema

esrep=# create schema repmgr;

CREATE SCHEMA

II. Add the witness to the repmgr cluster

1) Configure the repmgr.conf file

[kingbase@node2 etc]$ cat repmgr.conf

on_bmj=off

node_id=2

node_name=node249

promote_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf'

follow_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby follow  -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'

conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2'

log_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log'

data_directory='/home/kingbase/cluster/R6HA/KHA/kingbase/data'

sys_bindir='/home/kingbase/cluster/R6HA/KHA/kingbase/bin'

ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'

reconnect_attempts=2

reconnect_interval=3

failover='automatic'

recovery='automatic'

monitoring_history='no'

trusted_servers='192.168.7.1'

virtual_ip='192.168.7.240/24'

net_device='enp0s3'

ipaddr_path='/sbin'

arping_path='/sbin'

synchronous='quorum'

repmgrd_pid_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgrd.pid'

ping_path='/usr/bin'

#priority=0
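Before registering, it can help to confirm that the witness's `node_id`, `node_name` and `conninfo` are what you expect and do not collide with an existing node. The helper below is a minimal sketch: `conf_get` is a hypothetical function written for this example (it is not a repmgr tool), and it assumes simple `key=value` lines with optional single quotes, as in the listing above.

```bash
#!/usr/bin/env bash
# conf_get is a hypothetical helper, not part of repmgr.
# It prints the value of an uncommented "key=value" line from a
# repmgr.conf-style file, stripping one pair of surrounding single quotes.
# Usage: conf_get <key> <file>
conf_get() {
    local key="$1" file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" \
        | tail -n 1 \
        | sed "s/^'\(.*\)'\$/\1/"
}

# Example (path taken from the listing above):
#   conf_get node_id  /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf
#   conf_get conninfo /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf
```

Commented-out settings such as `#priority=0` are ignored because the pattern only matches lines that begin with the key itself.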

2) Register the witness with the repmgr cluster

[kingbase@node2 bin]$ ./repmgr witness register -h 192.168.7.248

# -h points to the IP address of the primary node

INFO: connecting to witness node "node249" (ID: 2)

INFO: connecting to primary node

NOTICE: attempting to install extension "repmgr"

NOTICE: "repmgr" extension successfully installed

INFO: witness registration complete

NOTICE: witness node "node249" (ID: 2) successfully registered

[kingbase@node2 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

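3) Inspect the witness's metadata database

=After the witness is registered with the repmgr cluster, the repmgr metadata objects are created automatically in the repmgr schema of the esrep database.=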

 [kingbase@node2 bin]$ ./ksql -U esrep esrep

ksql (V8.0)

Type "help" for help.

esrep=# \d repmgr.*

                                 Table "repmgr.events"

     Column      |           Type           | Collation | Nullable |      Default

-----------------+--------------------------+-----------+----------+-------------------

 node_id         | integer                  |           | not null |

 event           | text                     |           | not null |

 successful      | boolean                  |           | not null | true

 event_timestamp | timestamp with time zone |           | not null | CURRENT_TIMESTAMP

 details         | text                     |           |          | 

               Index "repmgr.idx_monitoring_history_time"

      Column       |           Type           | Key? |    Definition

-------------------+--------------------------+------+-------------------

 last_monitor_time | timestamp with time zone | yes  | last_monitor_time

 standby_node_id   | integer                  | yes  | standby_node_id

btree, for table "repmgr.monitoring_history"

                           Table "repmgr.monitoring_history"

          Column           |           Type           | Collation | Nullable | Default

---------------------------+--------------------------+-----------+----------+---------

 primary_node_id           | integer                  |           | not null |

 standby_node_id           | integer                  |           | not null |

 last_monitor_time         | timestamp with time zone |           | not null |

 last_apply_time           | timestamp with time zone |           |          |

 last_wal_primary_location | pg_lsn                   |           | not null |

 last_wal_standby_location | pg_lsn                   |           |          |

 replication_lag           | bigint                   |           | not null |

 apply_lag                 | bigint                   |           | not null |

Indexes:

    "idx_monitoring_history_time" btree (last_monitor_time, standby_node_id)

                                  Table "repmgr.nodes"

      Column      |            Type            | Collation | Nullable |     Default

------------------+----------------------------+-----------+----------+-----------------

 node_id          | integer                    |           | not null |

 upstream_node_id | integer                    |           |          |

 active           | boolean                    |           | not null | true

 node_name        | text                       |           | not null |

 type             | text                       |           | not null |

 location         | text                       |           | not null | 'default'::text

 priority         | integer                    |           | not null | 100

 conninfo         | text                       |           | not null |

 repluser         | character varying(63 char) |           | not null |

 slot_name        | text                       |           |          |

 config_file      | text                       |           | not null |

Indexes:

    "nodes_pkey" PRIMARY KEY, btree (node_id)

Check constraints:

    "nodes_type_check" CHECK (type = ANY (ARRAY['primary'::text, 'standby'::text, 'witness'::text, 'bdr'::text]))

Foreign-key constraints:

    "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE

Referenced by:

    TABLE "repmgr.nodes" CONSTRAINT "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE

       Index "repmgr.nodes_pkey"

 Column  |  Type   | Key? | Definition

---------+---------+------+------------

 node_id | integer | yes  | node_id

primary key, btree, for table "repmgr.nodes"

                           View "repmgr.replication_status"

          Column           |           Type           | Collation | Nullable | Default

---------------------------+--------------------------+-----------+----------+---------

 primary_node_id           | integer                  |           |          |

 standby_node_id           | integer                  |           |          |

 standby_name              | text                     |           |          |

 node_type                 | text                     |           |          |

 active                    | boolean                  |           |          |

 last_monitor_time         | timestamp with time zone |           |          |

 last_wal_primary_location | pg_lsn                   |           |          |

 last_wal_standby_location | pg_lsn                   |           |          |

 replication_lag           | text                     |           |          |

 replication_time_lag      | interval                 |           |          |

 apply_lag                 | text                     |           |          |

 communication_time_lag    | interval                 |           |          | 

                   View "repmgr.show_nodes"

       Column       |  Type   | Collation | Nullable | Default

--------------------+---------+-----------+----------+---------

 node_id            | integer |           |          |

 node_name          | text    |           |          |

 active             | boolean |           |          |

 upstream_node_id   | integer |           |          |

 upstream_node_name | text    |           |          |

 type               | text    |           |          |

 priority           | integer |           |          |

 conninfo           | text    |           |          | 

            Table "repmgr.voting_term"

 Column |  Type   | Collation | Nullable | Default

--------+---------+-----------+----------+---------

 term   | integer |           | not null |

Indexes:

    "voting_term_restrict" UNIQUE, btree ((true))

Rules:

    voting_term_delete AS

    ON DELETE TO repmgr.voting_term DO INSTEAD NOTHING

 Index "repmgr.voting_term_restrict"

 Column |  Type   | Key? | Definition

--------+---------+------+------------

 bool   | boolean | yes  | (true)

unique, btree, for table "repmgr.voting_term"

III. Troubleshooting witness registration

=As shown below, the other nodes report the witness status as "? unreachable".=

[kingbase@node1 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status        | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+---------------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | primary | * running     |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 2  | node249  | witness | ? unreachable | node248  | default  | 0        | ?        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | standby |   running     | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running     | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

WARNING: following issues were detected

  - unable to connect to node "node249" (ID: 2)

1) Test the ksql connection to the witness node (connection fails)

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U esrep esrep

ksql: error: could not connect to server: could not connect to server: No route to host

       Is the server running on host "192.168.7.249" and accepting

       TCP/IP connections on port 54321?

# Ping the node

[kingbase@node1 bin]$ ping 192.168.7.249

PING 192.168.7.249 (192.168.7.249) 56(84) bytes of data.

64 bytes from 192.168.7.249: icmp_seq=1 ttl=64 time=0.513 ms

64 bytes from 192.168.7.249: icmp_seq=2 ttl=64 time=0.390 ms

64 bytes from 192.168.7.249: icmp_seq=3 ttl=64 time=0.478 ms

^C

--- 192.168.7.249 ping statistics ---

3 packets transmitted, 3 received, 0% packet loss, time 2001ms

rtt min/avg/max/mdev = 0.390/0.460/0.513/0.054 ms
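A successful ping only shows that ICMP gets through; the database port can still be blocked. The sketch below probes the TCP port directly. It assumes bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command; `port_open` is a hypothetical helper name used only here.

```bash
#!/usr/bin/env bash
# port_open is a hypothetical helper: it returns 0 if a TCP connection to
# host:port can be opened within 3 seconds, and non-zero otherwise.
# Usage: port_open <host> <port>
port_open() {
    local host="$1" port="$2"
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
    # timeout bounds the attempt in case packets are silently dropped.
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example for the witness in this case:
#   if port_open 192.168.7.249 54321; then echo "port reachable"; else echo "port blocked or closed"; fi
```

If ping works but this check fails, a firewall on the target host is a likely suspect, which is what the next step examines.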

2) Check the firewall configuration on the witness server

[root@node2 shell]# iptables -L

Chain INPUT (policy ACCEPT)

target     prot opt source               destination

ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain

ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain

ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps

ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:bootps

ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED

ACCEPT     all  --  anywhere             anywhere

INPUT_direct  all  --  anywhere             anywhere

INPUT_ZONES_SOURCE  all  --  anywhere             anywhere

INPUT_ZONES  all  --  anywhere             anywhere

ACCEPT     icmp --  anywhere             anywhere

REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)

target     prot opt source               destination

ACCEPT     all  --  anywhere             bogon/24             ctstate RELATED,ESTABLISHED

ACCEPT     all  --  192.168.122.0/24     anywhere

ACCEPT     all  --  anywhere             anywhere

REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED

ACCEPT     all  --  anywhere             anywhere

FORWARD_direct  all  --  anywhere             anywhere

FORWARD_IN_ZONES_SOURCE  all  --  anywhere             anywhere

FORWARD_IN_ZONES  all  --  anywhere             anywhere

FORWARD_OUT_ZONES_SOURCE  all  --  anywhere             anywhere

FORWARD_OUT_ZONES  all  --  anywhere             anywhere

ACCEPT     icmp --  anywhere             anywhere

REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

......

=== The output above shows that the firewall is enabled on the witness server node ===

3) Flush the firewall rules on the witness host

[root@node2 shell]# iptables -F
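Note that `iptables -F` removes every rule in the filter table and is not persistent, so the rules can come back when the firewall service restarts. The chain names in the listing above (`INPUT_ZONES`, `INPUT_direct`) suggest the host is running firewalld; if that assumption holds, a narrower alternative is to open only the database port. The sketch below is a dry run by default: `open_db_port` and the `RUN` variable are hypothetical names for this example, while the `firewall-cmd` options are standard firewalld ones.

```bash
#!/usr/bin/env bash
# open_db_port is a hypothetical helper. By default it only prints the
# firewalld commands; set RUN=sudo (or run as root with RUN=env) to execute them.
# Usage: open_db_port <port>
open_db_port() {
    local port="$1"
    local run="${RUN:-echo}"
    # Permanently allow inbound TCP traffic on the given port ...
    $run firewall-cmd --permanent --add-port="${port}/tcp"
    # ... and reload so the permanent rule becomes active.
    $run firewall-cmd --reload
}

open_db_port 54321
# prints:
#   firewall-cmd --permanent --add-port=54321/tcp
#   firewall-cmd --reload
```

To apply the change for real on the witness host, something like `RUN=sudo open_db_port 54321` would run the same two commands instead of printing them.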

4) Test the database connection to the witness host

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U system test

ksql (V8.0)

Type "help" for help.

5) Check the cluster node status

[kingbase@node1 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

IV. After a cluster failover

1) Check the cluster node status

[kingbase@node2 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | primary | * running |          | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 2  | node249  | witness | * running | node248  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running | node248  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

2) After the primary/standby switchover, re-register the witness against the new primary

[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243

INFO: connecting to witness node "node249" (ID: 2)

INFO: connecting to primary node

INFO: "repmgr" extension is already installed

INFO: witness registration complete

NOTICE: witness node "node249" (ID: 2) successfully registered

[kingbase@node2 bin]$ ./repmgr cluster show

 ID | Name     | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string

----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------

 1  | node248  | standby |   running | node243  | default  | 100      | 18       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 2  | node249  | witness | * running | node243  | default  | 0        | 1        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 3  | node243  | primary | * running |          | default  | 100      | 19       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

 5  | node243B | standby |   running | node243  | default  | 100      | 18       | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
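Rather than typing the new primary's address by hand, the host can be extracted from `repmgr cluster show` output. The following is a minimal sketch, assuming the column layout shown in this article (Role in the third column, Status in the fourth, connection string last); `primary_host` is a hypothetical helper, not a repmgr subcommand.

```bash
#!/usr/bin/env bash
# primary_host is a hypothetical helper. It reads "repmgr cluster show"
# output on stdin and prints the host= value of the first row whose
# Role is "primary" and whose Status contains "running".
primary_host() {
    awk -F'|' '
        $3 ~ /primary/ && $4 ~ /running/ {
            # The connection string is the last column; pull out host=<value>.
            if (match($NF, /host=[^ ]+/)) {
                print substr($NF, RSTART + 5, RLENGTH - 5)
                exit
            }
        }'
}

# Example, fed with output taken from a database node:
#   ./repmgr witness register --force -h "$(primary_host < cluster_show.txt)"
```

In this case the witness's own `cluster show` in step 1 still listed node248 as primary, so the output passed to the function should come from one of the database nodes rather than from the witness.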

hamgr.log on the new primary:

 [2021-03-01 12:49:05] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2"

[2021-03-01 12:49:05] [DETAIL] PQping() returned "PQPING_REJECT"

[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)

[2021-03-01 12:49:05] [INFO] sleeping 3 seconds until next reconnection attempt

[2021-03-01 12:49:08] [INFO] checking state of node 1, 1 of 2 attempts

[2021-03-01 12:49:08] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"

[2021-03-01 12:49:08] [DETAIL] PQping() returned "PQPING_REJECT"

[2021-03-01 12:49:08] [INFO] sleeping 3 seconds until next reconnection attempt

[2021-03-01 12:49:11] [INFO] checking state of node 1, 2 of 2 attempts

[2021-03-01 12:49:11] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"

[2021-03-01 12:49:11] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"

[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts

[2021-03-01 12:49:11] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds

[2021-03-01 12:49:12] [WARNING] wal receiver not running

[2021-03-01 12:49:12] [NOTICE] WAL receiver disconnected on all sibling nodes

[2021-03-01 12:49:12] [INFO] WAL receiver disconnected on all 2 sibling nodes

[2021-03-01 12:49:12] [INFO] 2 active sibling nodes registered

[2021-03-01 12:49:12] [INFO] primary and this node have the same location ("default")

[2021-03-01 12:49:12] [INFO] local node's last receive lsn: 5/640000A0

[2021-03-01 12:49:12] [INFO] checking state of sibling node "node249" (ID: 2)

[2021-03-01 12:49:12] [INFO] node "node249" (ID: 2) reports its upstream is node 1, last seen 7 second(s) ago

[2021-03-01 12:49:12] [INFO] node 2 last saw primary node 7 second(s) ago

[2021-03-01 12:49:12] [INFO] checking state of sibling node "node243B" (ID: 5)

[2021-03-01 12:49:12] [WARNING] repmgrd not running on node "node243B" (ID: 5), skipping

[2021-03-01 12:49:12] [INFO] visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds

[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)

[2021-03-01 12:49:12] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms

[2021-03-01 12:49:12] [NOTICE] this node is the winner, will now promote itself and inform other nodes

[2021-03-01 12:49:12] [INFO] try to ping the trusted_servers "192.168.7.1" before execute promote_command

[2021-03-01 12:49:14] [NOTICE] PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.

--- 192.168.7.1 ping statistics ---

2 packets transmitted, 2 received, 0% packet loss, time 1001ms

rtt min/avg/max/mdev = 2.450/2.460/2.471/0.050 ms

[2021-03-01 12:49:14] [NOTICE] successfully ping one or more of the trusted_servers "192.168.7.1"

[2021-03-01 12:49:14] [NOTICE] try to stop old primary db (host: "192.168.7.248")

ERROR: connection to database failed

DETAIL:

could not connect to server: Connection refused

        Is the server running on host "192.168.7.248" and accepting

        TCP/IP connections on port 54321?

DETAIL: attempted to connect using:

  user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr

[2021-03-01 12:49:16] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.

--- 192.168.7.240 ping statistics ---

2 packets transmitted, 2 received, 0% packet loss, time 1002ms

rtt min/avg/max/mdev = 0.357/0.365/0.374/0.020 ms

[2021-03-01 12:49:16] [WARNING] the virtual ip is already on other host, try to release it on old primary node (host: "192.168.7.248")

[2021-03-01 12:49:16] [INFO] SSH connection to host "192.168.7.248" succeeded, ready to release vip on it

[2021-03-01 12:49:17] [NOTICE] old primary node (host: "192.168.7.248") release the virtual ip 192.168.7.240/24 success

[2021-03-01 12:49:17] [NOTICE] will acquire the virtual ip again

[2021-03-01 12:49:18] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.

--- 192.168.7.240 ping statistics ---

2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999ms

[2021-03-01 12:49:18] [WARNING] ping host"192.168.7.240" failed

[2021-03-01 12:49:18] [DETAIL] average RTT value is not greater than zero

[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success

[2021-03-01 12:49:19] [INFO] promote_command is:

  "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr  standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf"

WARNING: 2 sibling nodes found, but option "--siblings-follow" not specified

DETAIL: these nodes will remain attached to the current primary:

  node249 (node ID: 2, witness server)

  node243B (node ID: 5)

NOTICE: promoting standby to primary

DETAIL: promoting server "node243" (ID: 3) using sys_promote()

NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete

INFO: SET synchronous TO "async" on primary host

NOTICE: STANDBY PROMOTE successful

DETAIL: server "node243" (ID: 3) was successfully promoted to primary
