Using a witness server:

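A witness server is an ordinary KingbaseES instance that is not part of the streaming replication cluster. Its purpose is to provide evidence, when a failover situation arises, that the primary server itself is unavailable rather than, for example, that the network between different physical locations has split. The typical use case is a two-node streaming replication setup in which the primary and the standby are in different locations (data centers). With a witness server created in the same location (data center) as the primary, the standby can decide, when the primary becomes unavailable, whether it can promote itself without risking a "split-brain" situation. If it can see neither the witness nor the primary, there is probably a network-level outage and it should not promote itself. If it can see the witness but not the primary, this shows that there is no network outage and that the primary itself is unavailable, so it can promote itself to primary.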
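The decision rule described above can be written down as a small truth table. The following is a minimal sketch in Python, not repmgr's actual implementation; the names `Decision` and `standby_decision` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Decision(Enum):
    NO_ACTION = "primary is reachable, nothing to do"
    PROMOTE = "primary is down, promote this standby"
    WAIT = "probable network outage, do not promote"


def standby_decision(sees_primary: bool, sees_witness: bool) -> Decision:
    """Apply the witness rule from the standby's point of view."""
    if sees_primary:
        # The primary answers, so no failover is needed.
        return Decision.NO_ACTION
    if sees_witness:
        # The witness sits next to the primary and is still reachable,
        # so the network is fine and the primary itself is unavailable.
        return Decision.PROMOTE
    # Neither the primary nor the witness is visible: the standby is
    # probably cut off, and promoting would risk split-brain.
    return Decision.WAIT
```

For example, `standby_decision(False, True)` yields `Decision.PROMOTE`, while `standby_decision(False, False)` yields `Decision.WAIT`.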

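For more complex replication setups, such as those spanning multiple data centers, it is preferable to use location-based failover, which ensures that only nodes in the same location as the primary can be promoted to primary.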

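To create a witness server, set up an ordinary KingbaseES instance on a server in the same physical location as the cluster's primary server. Do not create the witness on the same physical host as the primary; otherwise, if the primary fails because of a hardware problem, the witness fails along with it.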

Database version:

test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)

Original repmgr cluster layout:

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

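I. Create the witness server

=Note: the witness server should be a separate host and must not share a host with the primary or a standby. There is no streaming replication between the witness and the other hosts, so the witness is an independent primary instance. Its database system ID must differ from that of the other databases, which means a separate instance has to be created with initdb; it cannot be produced by clone or copy.=

1) Initialize the instance (on node2)

=Copy the software installation files from another cluster node to the witness node, then initialize a new instance.=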

 [kingbase@node2 bin]$ ./initdb -D /home/kingbase/cluster/R6HA/KHA/kingbase/data -E utf8 -U system -W
......

Configure the repmgr extension:

Start the database service:

[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6HA/KHA/kingbase/data start

......

server started

2) Create the repmgr metadata database and schema

[kingbase@node2 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.

# Create the esrep user
test=# create user esrep with superuser;
CREATE ROLE
test=# alter user esrep with password 'Kingbaseha110';
ALTER ROLE
# Create the esrep database
test=# create database esrep owner esrep;
CREATE DATABASE
test=# \c esrep esrep
You are now connected to database "esrep" as user "esrep".
esrep=# \d
List of relations
Schema | Name | Type | Owner
--------+---------------------+------+--------
public | sys_stat_statements | view | system
(1 row)
# Create the repmgr schema
esrep=# create schema repmgr;
CREATE SCHEMA

II. Add the witness to the repmgr cluster

1) Configure the repmgr.conf file

[kingbase@node2 etc]$ cat repmgr.conf
on_bmj=off
node_id=2
node_name=node249
promote_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf'
follow_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby follow -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'
conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2'
log_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log'
data_directory='/home/kingbase/cluster/R6HA/KHA/kingbase/data'
sys_bindir='/home/kingbase/cluster/R6HA/KHA/kingbase/bin'
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
reconnect_attempts=2
reconnect_interval=3
failover='automatic'
recovery='automatic'
monitoring_history='no'
trusted_servers='192.168.7.1'
virtual_ip='192.168.7.240/24'
net_device='enp0s3'
ipaddr_path='/sbin'
arping_path='/sbin'
synchronous='quorum'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgrd.pid'
ping_path='/usr/bin'
#priority=0
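Before registering the witness, it can help to confirm that repmgr.conf carries the intended `node_id`, `node_name` and `conninfo`. The following Python sketch reads the simple `key=value` layout shown above and splits the `conninfo` string into its keywords. The function names are hypothetical, and the parser is deliberately simplified: it assumes one setting per line and uses `shlex` for quoting, which is close to but not identical to libpq's own conninfo rules.

```python
import shlex


def parse_repmgr_conf(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")  # split on the first '=' only
        if not sep:
            continue
        value = value.strip()
        # Drop one pair of surrounding single quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == "'":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


def parse_conninfo(conninfo):
    """Split a 'keyword=value keyword=value ...' string into a dict."""
    params = {}
    for token in shlex.split(conninfo):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError("not a keyword=value pair: %r" % token)
        params[key] = value
    return params
```

With the file above, `parse_conninfo(parse_repmgr_conf(text)["conninfo"])["host"]` should give the witness address, which must be reachable from every other cluster node.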

2) Register the witness with the repmgr cluster

[kingbase@node2 bin]$ ./repmgr witness register -h 192.168.7.248
# -h points to the IP address of the primary node
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

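3) Inspect the repmgr metadata on the witness

=After the witness is registered with the repmgr cluster, the repmgr metadata objects are created automatically in the repmgr schema of the esrep database.=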

 [kingbase@node2 bin]$ ./ksql -U esrep esrep
ksql (V8.0)
Type "help" for help.

esrep=# \d repmgr.*
Table "repmgr.events"
Column | Type | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+-------------------
node_id | integer | | not null |
event | text | | not null |
successful | boolean | | not null | true
event_timestamp | timestamp with time zone | | not null | CURRENT_TIMESTAMP
details | text | | |

Index "repmgr.idx_monitoring_history_time"
Column | Type | Key? | Definition
-------------------+--------------------------+------+-------------------
last_monitor_time | timestamp with time zone | yes | last_monitor_time
standby_node_id | integer | yes | standby_node_id
btree, for table "repmgr.monitoring_history"

Table "repmgr.monitoring_history"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | not null |
standby_node_id | integer | | not null |
last_monitor_time | timestamp with time zone | | not null |
last_apply_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | not null |
last_wal_standby_location | pg_lsn | | |
replication_lag | bigint | | not null |
apply_lag | bigint | | not null |
Indexes:
"idx_monitoring_history_time" btree (last_monitor_time, standby_node_id)

Table "repmgr.nodes"
Column | Type | Collation | Nullable | Default
------------------+----------------------------+-----------+----------+-----------------
node_id | integer | | not null |
upstream_node_id | integer | | |
active | boolean | | not null | true
node_name | text | | not null |
type | text | | not null |
location | text | | not null | 'default'::text
priority | integer | | not null | 100
conninfo | text | | not null |
repluser | character varying(63 char) | | not null |
slot_name | text | | |
config_file | text | | not null |
Indexes:
"nodes_pkey" PRIMARY KEY, btree (node_id)
Check constraints:
"nodes_type_check" CHECK (type = ANY (ARRAY['primary'::text, 'standby'::text, 'witness'::text, 'bdr'::text]))
Foreign-key constraints:
"nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
Referenced by:
TABLE "repmgr.nodes" CONSTRAINT "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE

Index "repmgr.nodes_pkey"
Column | Type | Key? | Definition
---------+---------+------+------------
node_id | integer | yes | node_id
primary key, btree, for table "repmgr.nodes"

View "repmgr.replication_status"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | |
standby_node_id | integer | | |
standby_name | text | | |
node_type | text | | |
active | boolean | | |
last_monitor_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | |
last_wal_standby_location | pg_lsn | | |
replication_lag | text | | |
replication_time_lag | interval | | |
apply_lag | text | | |
communication_time_lag | interval | | |

View "repmgr.show_nodes"
Column | Type | Collation | Nullable | Default
--------------------+---------+-----------+----------+---------
node_id | integer | | |
node_name | text | | |
active | boolean | | |
upstream_node_id | integer | | |
upstream_node_name | text | | |
type | text | | |
priority | integer | | |
conninfo | text | | |

Table "repmgr.voting_term"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
term | integer | | not null |
Indexes:
"voting_term_restrict" UNIQUE, btree ((true))
Rules:
voting_term_delete AS
ON DELETE TO repmgr.voting_term DO INSTEAD NOTHING

Index "repmgr.voting_term_restrict"
Column | Type | Key? | Definition
--------+---------+------+------------
bool | boolean | yes | (true)
unique, btree, for table "repmgr.voting_term"

III. Troubleshooting witness node registration

=As shown below, the other nodes report the witness as "? unreachable".=

[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+---------------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | ? unreachable | node248 | default | 0 | ? | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

WARNING: following issues were detected
- unable to connect to node "node249" (ID: 2)

1) Test the ksql connection to the witness node (the connection fails)

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U esrep esrep
ksql: error: could not connect to server: could not connect to server: No route to host
Is the server running on host "192.168.7.249" and accepting
TCP/IP connections on port 54321?
# Ping the node
[kingbase@node1 bin]$ ping 192.168.7.249
PING 192.168.7.249 (192.168.7.249) 56(84) bytes of data.
64 bytes from 192.168.7.249: icmp_seq=1 ttl=64 time=0.513 ms
64 bytes from 192.168.7.249: icmp_seq=2 ttl=64 time=0.390 ms
64 bytes from 192.168.7.249: icmp_seq=3 ttl=64 time=0.478 ms
^C
--- 192.168.7.249 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.390/0.460/0.513/0.054 ms
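The two checks above show a typical pattern: ICMP ping succeeds, yet the TCP connection to the database port fails with "No route to host", which usually points to packet filtering rather than a stopped database. The TCP half of that check can be reproduced without ksql. The sketch below uses only Python's standard `socket` module; the function name `tcp_port_open` is hypothetical, and the host and port in the usage note are the values from this article.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "No route to host".
        return False
```

Running `tcp_port_open("192.168.7.249", 54321)` from node1 would be expected to return `False` while the firewall rejects the traffic, even though ping succeeds.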

2) Check the firewall configuration on the witness server

[root@node2 shell]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:domain
ACCEPT tcp -- anywhere anywhere tcp dpt:domain
ACCEPT udp -- anywhere anywhere udp dpt:bootps
ACCEPT tcp -- anywhere anywhere tcp dpt:bootps
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
INPUT_direct all -- anywhere anywhere
INPUT_ZONES_SOURCE all -- anywhere anywhere
INPUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere bogon/24 ctstate RELATED,ESTABLISHED
ACCEPT all -- 192.168.122.0/24 anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
FORWARD_direct all -- anywhere anywhere
FORWARD_IN_ZONES_SOURCE all -- anywhere anywhere
FORWARD_IN_ZONES all -- anywhere anywhere
FORWARD_OUT_ZONES_SOURCE all -- anywhere anywhere
FORWARD_OUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
......

=== The output above shows that the firewall is enabled on the witness server node.===

3) Flush the firewall rules on the witness host

[root@node2 shell]# iptables -F

4) Test the database connection to the witness host

[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U system test
ksql (V8.0)
Type "help" for help.

5) Check the cluster node status

[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

IV. After a cluster failover

1) Check the cluster node status

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2

2) After the primary/standby switchover, re-register the witness so that it connects to the new primary

[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered

[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | node243 | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node243 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | primary | * running | | default | 100 | 19 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node243 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
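Whether the witness still points at the old primary can be read from the Upstream column of `repmgr cluster show`. The following Python sketch parses that table as it is printed in this article (nine columns separated by `|`) and lists the witnesses whose upstream is not the running primary. The function names are hypothetical, and the parser assumes this exact column layout, which may differ in other repmgr versions.

```python
def parse_cluster_show(output):
    """Turn the rows of 'repmgr cluster show' into a list of dicts."""
    nodes = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Data rows have nine columns and start with a numeric node ID;
        # the header, the separator line and warnings are skipped.
        if len(fields) != 9 or not fields[0].isdigit():
            continue
        nodes.append({
            "id": int(fields[0]),
            "name": fields[1],
            "role": fields[2],
            "status": fields[3],
            "upstream": fields[4],
        })
    return nodes


def witnesses_needing_reregistration(nodes):
    """Names of witness nodes whose upstream is not the running primary."""
    primaries = [n["name"] for n in nodes
                 if n["role"] == "primary" and n["status"].endswith("running")]
    if len(primaries) != 1:
        raise ValueError("expected exactly one running primary, found %d"
                         % len(primaries))
    return [n["name"] for n in nodes
            if n["role"] == "witness" and n["upstream"] != primaries[0]]
```

A witness reported by `witnesses_needing_reregistration` is a candidate for the `repmgr witness register --force -h <new primary>` step used in this article.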

hamgr.log on the new primary:

 [2021-03-01 12:49:05] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2"
[2021-03-01 12:49:05] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:49:05] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:08] [INFO] checking state of node 1, 1 of 2 attempts
[2021-03-01 12:49:08] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:08] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:08] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:11] [INFO] checking state of node 1, 2 of 2 attempts
[2021-03-01 12:49:11] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:11] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts
[2021-03-01 12:49:11] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:49:12] [WARNING] wal receiver not running
[2021-03-01 12:49:12] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:49:12] [INFO] WAL receiver disconnected on all 2 sibling nodes
[2021-03-01 12:49:12] [INFO] 2 active sibling nodes registered
[2021-03-01 12:49:12] [INFO] primary and this node have the same location ("default")
[2021-03-01 12:49:12] [INFO] local node's last receive lsn: 5/640000A0
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node249" (ID: 2)
[2021-03-01 12:49:12] [INFO] node "node249" (ID: 2) reports its upstream is node 1, last seen 7 second(s) ago
[2021-03-01 12:49:12] [INFO] node 2 last saw primary node 7 second(s) ago
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node243B" (ID: 5)
[2021-03-01 12:49:12] [WARNING] repmgrd not running on node "node243B" (ID: 5), skipping
[2021-03-01 12:49:12] [INFO] visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds
[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)
[2021-03-01 12:49:12] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:49:12] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2021-03-01 12:49:12] [INFO] try to ping the trusted_servers "192.168.7.1" before execute promote_command
[2021-03-01 12:49:14] [NOTICE] PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.
--- 192.168.7.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.450/2.460/2.471/0.050 ms
[2021-03-01 12:49:14] [NOTICE] successfully ping one or more of the trusted_servers "192.168.7.1"
[2021-03-01 12:49:14] [NOTICE] try to stop old primary db (host: "192.168.7.248")
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "192.168.7.248" and accepting
TCP/IP connections on port 54321?
DETAIL: attempted to connect using:
user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr
[2021-03-01 12:49:16] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.357/0.365/0.374/0.020 ms
[2021-03-01 12:49:16] [WARNING] the virtual ip is already on other host, try to release it on old primary node (host: "192.168.7.248")
[2021-03-01 12:49:16] [INFO] SSH connection to host "192.168.7.248" succeeded, ready to release vip on it
[2021-03-01 12:49:17] [NOTICE] old primary node (host: "192.168.7.248") release the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:17] [NOTICE] will acquire the virtual ip again
[2021-03-01 12:49:18] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999ms
[2021-03-01 12:49:18] [WARNING] ping host"192.168.7.240" failed
[2021-03-01 12:49:18] [DETAIL] average RTT value is not greater than zero
[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:19] [INFO] promote_command is:
"/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf"
WARNING: 2 sibling nodes found, but option "--siblings-follow" not specified
DETAIL: these nodes will remain attached to the current primary:
node249 (node ID: 2, witness server)
node243B (node ID: 5)
NOTICE: promoting standby to primary
DETAIL: promoting server "node243" (ID: 3) using sys_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: SET synchronous TO "async" on primary host
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node243" (ID: 3) was successfully promoted to primary
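The log line "visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds" summarizes the two conditions checked before node243 promoted itself. The Python sketch below models those conditions as they can be inferred from the log messages alone. It is an assumption for illustration, not code taken from repmgr; the function name, the strict-majority rule and the `window_secs` parameter are all hypothetical.

```python
def may_promote(visible_nodes, total_nodes, last_seen_primary_secs, window_secs):
    """Illustrative pre-promotion check inferred from the log above.

    last_seen_primary_secs holds, for each sibling that answered, how many
    seconds ago it last saw the primary.
    """
    # Assumed rule 1: the candidate must see more than half of the nodes,
    # otherwise it may be the isolated side of a network split.
    if visible_nodes <= total_nodes / 2:
        return False
    # Assumed rule 2: if any sibling saw the primary within the window,
    # the primary is probably still alive and only this link is broken.
    if any(secs < window_secs for secs in last_seen_primary_secs):
        return False
    return True
```

With the values from the log, where the witness node249 last saw the primary 7 seconds ago and the window is 4 seconds, `may_promote(3, 3, [7], 4)` returns `True`.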
