KingbaseES R6 集群repmgr witness 手工配置案例
使用见证服务器:
见证服务器是一个正常的KingbaseES实例,不是流复制群集的一部分; 其目的是,如果发生故障转移情况,则提供证明它是主服务器本身不可用的证据,而不是例如在不同物理位置之间的网络分裂。见证服务器的典型用例是双节点流复制设置,其中主要和备用服务器位于不同的位置(数据中心)。通过在与主服务器相同的位置(数据中心)中创建见证服务器,如果主服务器变得不可用,则备用服务器可以决定是否可以在不“脑裂”情况的情况下提升为主:如果它无法看到见证人或主服务器,它可能存在网络级中断,它不应该提升为主。如果它可以看到见证人但不能看到主节点,这证明没有网络中断且主本身不可用,因此它可以提升自己为主。
对于更复杂的复制方案,例如使用多个数据中心,最好使用基于位置的故障转移,这可确保只有与主服务器位于同一位置的节点才能成为主节点。
要创建见证服务器,请在与群集的主服务器位于同一物理位置的服务器上设置普通的PostgreSQL实例。不应该在与主服务器同一个物理主机创建见证服务器,否则如果主服务器由于硬件问题失败,见证服务器会失效。
数据库版本:
test=# select version();
version
----------------------------------------------------------------------------------------------------------------------
KingbaseES V008R006C003B0010 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(1 row)
repmgr cluster原架构:
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
一、创建witness服务器
=注意:witness服务器,应该是一个独立的主机节点不能和主库或备库在同一个主机上,并且witness和其他主机之间不构成流复制,所以witness是一个独立的primary实例,其数据库systemID,不应该和其他数据库一致,需要单独initdb一个实例,不能是通过clone或copy生成数据库。=
1)初始化实例(node2节点)
=将cluster其他节点的软件安装文件,拷贝到witness节点,然后重新初始一个实例=
[kingbase@node2 bin]$ ./initdb -D /home/kingbase/cluster/R6HA/KHA/kingbase/data -E utf8 -U system -W
......
配置repmgr extension:
启动数据库服务:
[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6HA/KHA/kingbase/data start
......
server started
2)创建repmgr元数据库和schema
[kingbase@node2 bin]$ ./ksql -U system test
ksql (V8.0)
Type "help" for help.
# 创建esrep用户
test=# create user esrp with superuser;
CREATE ROLE
test=# alter user esrep with password 'Kingbaseha110';
ALTER ROLE
#创建esrep数据库
test=# create database esrep owner esrep;
CREATE DATABASE
test=# \c esrep esrep
You are now connected to database "esrep" as user "esrep".
esrep=# \d
List of relations
Schema | Name | Type | Owner
--------+---------------------+------+--------
public | sys_stat_statements | view | system
(1 row)
# 创建repmgr schema
esrep=# create schema repmgr;
CREATE SCHEMA
二、将witness加入repmgr cluster
1)配置repmgr.conf文件
[kingbase@node2 etc]$ cat repmgr.conf
on_bmj=off
node_id=2
node_name=node249
promote_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf'
follow_command='/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby follow -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf -W --upstream-node-id=%n'
conninfo='host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2'
log_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgr.log'
data_directory='/home/kingbase/cluster/R6HA/KHA/kingbase/data'
sys_bindir='/home/kingbase/cluster/R6HA/KHA/kingbase/bin'
ssh_options='-q -o ConnectTimeout=10 -o StrictHostKeyChecking=no -o ServerAliveInterval=2 -o ServerAliveCountMax=5 -p 22'
reconnect_attempts=2
reconnect_interval=3
failover='automatic'
recovery='automatic'
monitoring_history='no'
trusted_servers='192.168.7.1'
virtual_ip='192.168.7.240/24'
net_device='enp0s3'
ipaddr_path='/sbin'
arping_path='/sbin'
synchronous='quorum'
repmgrd_pid_file='/home/kingbase/cluster/R6HA/KHA/kingbase/hamgrd.pid'
ping_path='/usr/bin'
#priority=0
2)注册witness到repmgr cluster
[kingbase@node2 bin]$ ./repmgr witness register -h 192.168.7.248
# -h 指向主库节点ip
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3)查看witness元数据库数据信息
=witness注册到repmgr cluster后,自动在esrep数据库的repmgr schema下创建repmgr元数据对象=
[kingbase@node2 bin]$ ./ksql -U esrep esrep
ksql (V8.0)
Type "help" for help.
esrep=# \d repmgr.*
Table "repmgr.events"
Column | Type | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+-------------------
node_id | integer | | not null |
event | text | | not null |
successful | boolean | | not null | true
event_timestamp | timestamp with time zone | | not null | CURRENT_TIMESTAMP
details | text | | |
Index "repmgr.idx_monitoring_history_time"
Column | Type | Key? | Definition
-------------------+--------------------------+------+-------------------
last_monitor_time | timestamp with time zone | yes | last_monitor_time
standby_node_id | integer | yes | standby_node_id
btree, for table "repmgr.monitoring_history"
Table "repmgr.monitoring_history"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | not null |
standby_node_id | integer | | not null |
last_monitor_time | timestamp with time zone | | not null |
last_apply_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | not null |
last_wal_standby_location | pg_lsn | | |
replication_lag | bigint | | not null |
apply_lag | bigint | | not null |
Indexes:
"idx_monitoring_history_time" btree (last_monitor_time, standby_node_id)
Table "repmgr.nodes"
Column | Type | Collation | Nullable | Default
------------------+----------------------------+-----------+----------+-----------------
node_id | integer | | not null |
upstream_node_id | integer | | |
active | boolean | | not null | true
node_name | text | | not null |
type | text | | not null |
location | text | | not null | 'default'::text
priority | integer | | not null | 100
conninfo | text | | not null |
repluser | character varying(63 char) | | not null |
slot_name | text | | |
config_file | text | | not null |
Indexes:
"nodes_pkey" PRIMARY KEY, btree (node_id)
Check constraints:
"nodes_type_check" CHECK (type = ANY (ARRAY['primary'::text, 'standby'::text, 'witness'::text, 'bdr'::text]))
Foreign-key constraints:
"nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
Referenced by:
TABLE "repmgr.nodes" CONSTRAINT "nodes_upstream_node_id_fkey" FOREIGN KEY (upstream_node_id) REFERENCES repmgr.nodes(node_id) DEFERRABLE
Index "repmgr.nodes_pkey"
Column | Type | Key? | Definition
---------+---------+------+------------
node_id | integer | yes | node_id
primary key, btree, for table "repmgr.nodes"
View "repmgr.replication_status"
Column | Type | Collation | Nullable | Default
---------------------------+--------------------------+-----------+----------+---------
primary_node_id | integer | | |
standby_node_id | integer | | |
standby_name | text | | |
node_type | text | | |
active | boolean | | |
last_monitor_time | timestamp with time zone | | |
last_wal_primary_location | pg_lsn | | |
last_wal_standby_location | pg_lsn | | |
replication_lag | text | | |
replication_time_lag | interval | | |
apply_lag | text | | |
communication_time_lag | interval | | |
View "repmgr.show_nodes"
Column | Type | Collation | Nullable | Default
--------------------+---------+-----------+----------+---------
node_id | integer | | |
node_name | text | | |
active | boolean | | |
upstream_node_id | integer | | |
upstream_node_name | text | | |
type | text | | |
priority | integer | | |
conninfo | text | | |
Table "repmgr.voting_term"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
term | integer | | not null |
Indexes:
"voting_term_restrict" UNIQUE, btree ((true))
Rules:
voting_term_delete AS
ON DELETE TO repmgr.voting_term DO INSTEAD NOTHING
Index "repmgr.voting_term_restrict"
Column | Type | Key? | Definition
--------+---------+------+------------
bool | boolean | yes | (true)
unique, btree, for table "repmgr.voting_term"
三、witness节点注册故障分析
=如下所示,witness在其他节点的状态为“? unreachable ”。=
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+---------------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | ? unreachable | node248 | default | 0 | ? | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
WARNING: following issues were detected
- unable to connect to node "node249" (ID: 2)
1)测试ksql到witness节点的连接(连接失败)
[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U esrep esrep
ksql: error: could not connect to server: could not connect to server: No route to host
Is the server running on host "192.168.7.249" and accepting
TCP/IP connections on port 54321?
# 节点ping
[kingbase@node1 bin]$ ping 192.168.7.249
PING 192.168.7.249 (192.168.7.249) 56(84) bytes of data.
64 bytes from 192.168.7.249: icmp_seq=1 ttl=64 time=0.513 ms
64 bytes from 192.168.7.249: icmp_seq=2 ttl=64 time=0.390 ms
64 bytes from 192.168.7.249: icmp_seq=3 ttl=64 time=0.478 ms
^C
--- 192.168.7.249 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 0.390/0.460/0.513/0.054 ms
2)查看witness服务器防火墙配置
[root@node2 shell]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:domain
ACCEPT tcp -- anywhere anywhere tcp dpt:domain
ACCEPT udp -- anywhere anywhere udp dpt:bootps
ACCEPT tcp -- anywhere anywhere tcp dpt:bootps
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
INPUT_direct all -- anywhere anywhere
INPUT_ZONES_SOURCE all -- anywhere anywhere
INPUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere bogon/24 ctstate RELATED,ESTABLISHED
ACCEPT all -- 192.168.122.0/24 anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
FORWARD_direct all -- anywhere anywhere
FORWARD_IN_ZONES_SOURCE all -- anywhere anywhere
FORWARD_IN_ZONES all -- anywhere anywhere
FORWARD_OUT_ZONES_SOURCE all -- anywhere anywhere
FORWARD_OUT_ZONES all -- anywhere anywhere
ACCEPT icmp -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
......
=== 有以上可知,witness服务器节点防火墙被启动===
3)清理witness主机防火墙规则
[root@node2 shell]# iptables -F
4)测试witness主机数据库连接
[kingbase@node1 bin]$ ./ksql -h 192.168.7.249 -U system test
ksql (V8.0)
Type "help" for help.
5)查看集群节点状态
[kingbase@node1 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
四、集群failover 切换后
1)查看集群节点状态
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | primary | * running | | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node248 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node248 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2)集群主备切换后,witness重新注册连接新的主库
[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | node243 | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node243 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | primary | * running | | default | 100 | 19 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node243 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2[kingbase@node2 bin]$ ./repmgr witness register --force -h 192.168.7.243
INFO: connecting to witness node "node249" (ID: 2)
INFO: connecting to primary node
INFO: "repmgr" extension is already installed
INFO: witness registration complete
NOTICE: witness node "node249" (ID: 2) successfully registered
[kingbase@node2 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+----------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node248 | standby | running | node243 | default | 100 | 18 | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
2 | node249 | witness | * running | node243 | default | 0 | 1 | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
3 | node243 | primary | * running | | default | 100 | 19 | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
5 | node243B | standby | running | node243 | default | 100 | 18 | host=192.168.7.243 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2
新主库hamgr.log日志:
[2021-03-01 12:49:05] [WARNING] unable to ping "host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2"
[2021-03-01 12:49:05] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:05] [WARNING] unable to connect to upstream node "node248" (ID: 1)
[2021-03-01 12:49:05] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:08] [INFO] checking state of node 1, 1 of 2 attempts
[2021-03-01 12:49:08] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:08] [DETAIL] PQping() returned "PQPING_REJECT"
[2021-03-01 12:49:08] [INFO] sleeping 3 seconds until next reconnection attempt
[2021-03-01 12:49:11] [INFO] checking state of node 1, 2 of 2 attempts
[2021-03-01 12:49:11] [WARNING] unable to ping "user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr"
[2021-03-01 12:49:11] [DETAIL] PQping() returned "PQPING_NO_RESPONSE"
[2021-03-01 12:49:11] [WARNING] unable to reconnect to node 1 after 2 attempts
[2021-03-01 12:49:11] [NOTICE] setting "wal_retrieve_retry_interval" to 86405000 milliseconds
[2021-03-01 12:49:12] [WARNING] wal receiver not running
[2021-03-01 12:49:12] [NOTICE] WAL receiver disconnected on all sibling nodes
[2021-03-01 12:49:12] [INFO] WAL receiver disconnected on all 2 sibling nodes
[2021-03-01 12:49:12] [INFO] 2 active sibling nodes registered
[2021-03-01 12:49:12] [INFO] primary and this node have the same location ("default")
[2021-03-01 12:49:12] [INFO] local node's last receive lsn: 5/640000A0
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node249" (ID: 2)
[2021-03-01 12:49:12] [INFO] node "node249" (ID: 2) reports its upstream is node 1, last seen 7 second(s) ago
[2021-03-01 12:49:12] [INFO] node 2 last saw primary node 7 second(s) ago
[2021-03-01 12:49:12] [INFO] checking state of sibling node "node243B" (ID: 5)
[2021-03-01 12:49:12] [WARNING] repmgrd not running on node "node243B" (ID: 5), skipping
[2021-03-01 12:49:12] [INFO] visible nodes: 3; total nodes: 3; no nodes have seen the primary within the last 4 seconds
[2021-03-01 12:49:12] [NOTICE] promotion candidate is "node243" (ID: 3)
[2021-03-01 12:49:12] [NOTICE] setting "wal_retrieve_retry_interval" to 5000 ms
[2021-03-01 12:49:12] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2021-03-01 12:49:12] [INFO] try to ping the trusted_servers "192.168.7.1" before execute promote_command
[2021-03-01 12:49:14] [NOTICE] PING 192.168.7.1 (192.168.7.1) 56(84) bytes of data.
--- 192.168.7.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.450/2.460/2.471/0.050 ms
A
[2021-03-01 12:49:14] [NOTICE] successfully ping one or more of the trusted_servers "192.168.7.1"
[2021-03-01 12:49:14] [NOTICE] try to stop old primary db (host: "192.168.7.248")
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "192.168.7.248" and accepting
TCP/IP connections on port 54321?
DETAIL: attempted to connect using:
user=esrep connect_timeout=10 dbname=esrep host=192.168.7.248 port=54321 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=2 fallback_application_name=repmgr
[2021-03-01 12:49:16] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.357/0.365/0.374/0.020 ms
[2021-03-01 12:49:16] [WARNING] the virtual ip is already on other host, try to release it on old primary node (host: "192.168.7.248")
[2021-03-01 12:49:16] [INFO] SSH connection to host "192.168.7.248" succeeded, ready to release vip on it
[2021-03-01 12:49:17] [NOTICE] old primary node (host: "192.168.7.248") release the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:17] [NOTICE] will acquire the virtual ip again
[2021-03-01 12:49:18] [NOTICE] PING 192.168.7.240 (192.168.7.240) 56(84) bytes of data.
--- 192.168.7.240 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999ms
[2021-03-01 12:49:18] [WARNING] ping host"192.168.7.240" failed
[2021-03-01 12:49:18] [DETAIL] average RTT value is not greater than zero
[2021-03-01 12:49:19] [NOTICE] new primary node (ID: 3) acquire the virtual ip 192.168.7.240/24 success
[2021-03-01 12:49:19] [INFO] promote_command is:
"/home/kingbase/cluster/R6HA/KHA/kingbase/bin/repmgr standby promote -f /home/kingbase/cluster/R6HA/KHA/kingbase/etc/repmgr.conf"
WARNING: 2 sibling nodes found, but option "--siblings-follow" not specified
DETAIL: these nodes will remain attached to the current primary:
node249 (node ID: 2, witness server)
node243B (node ID: 5)
NOTICE: promoting standby to primary
DETAIL: promoting server "node243" (ID: 3) using sys_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: SET synchronous TO "async" on primary host
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node243" (ID: 3) was successfully promoted to primary
KingbaseES R6 集群repmgr witness 手工配置案例的更多相关文章
- KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(一)
KingbaseES R6集群repmgr.conf参数'recovery'测试案例(一) 案例说明: 在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库 ...
- KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(三)
案例三:测试'recovery = manual' 1.查看集群节点状态信息: [kingbase@node1 bin]$ ./repmgr cluster show ID | Name | Role ...
- KingbaseES R6 集群repmgr.conf参数'recovery'测试案例(二)
案例二:测试'recovery = automatic' 1.查看集群节点状态信息: [kingbase@node1 bin]$ ./repmgr cluster show ID | Name | R ...
- KingbaseES R6集群归档备份故障分析解决案例
案例说明: 在使用ps工具查看主库进程,发现主库'archiver'进程失败,检查sys_log日志可以发现归档失败的信息.通过sys_log日志提取归档语句手工执行归档操作,提示"当前数据 ...
- KingbaseES R6 集群主库网卡down测试案例
数据库版本: test=# select version(); version ------------------------------------------------------------ ...
- KingbaseES R6 集群“双主”故障解决案例
实际工作中,可能会碰到集群脑裂的情况,在脑裂时,会出现双 primary情况.这时,需要用户介入,人工判断哪个节点的数据最新,减少数据丢失. 一.测试环境信息 操作系统: [kingbase@node ...
- KingbaseES R6 集群 recovery 参数对切换的影响
案例说明:在KingbaseES R6集群中,主库节点出现宕机(如重启或关机),会产生主备切换,但是当主库节点系统恢复正常后,如何对原主库节点进行处理,保证集群数据的一致性和安全,可以通过对repmg ...
- KingbaseES R6 集群修改data目录
案例说明: 本案例是在部署完成KingbaseES R6集群后,由于业务的需求,集群需要修改data(数据存储)目录的测试.本案例分两种修改方式,第一种是离线修改data目录,即关闭整个集群后,修改数 ...
- KingbaseES R6 集群修改物理IP和VIP案例
在用户的实际环境里,可能有时需要修改主机的IP,这就涉及到集群的配置修改.以下以例子的方式,介绍下KingbaseES R6集群如何修改IP. 一.案例测试环境 操作系统: [KINGBASE@nod ...
随机推荐
- 自建批量更改标准BO数据程序
by zyi
- ArrayList分析2 :Itr、ListIterator以及SubList中的坑
ArrayList分析2 : Itr.ListIterator以及SubList中的坑 转载请注明出处:https://www.cnblogs.com/funnyzpc/p/16409137.html ...
- Codeforces Round #783 (Div. 2)
A. Direction Change 题意 从(1,1)点出发到(n,m),每次可以向上下左右四个方向移动,但是不能与上次移动方向相同 最少要移动多少不,如果不能到达输出 -1 思路 假设n< ...
- 关于使用netstat -lantup查看的SSHD 6010端口解释
关于使用netstat -lantup查看的SSHD 6010端口解释: 1.使用netstat -lantup查看当前系统开启的服务端口 tcp6 0 0 ::1:6010 ...
- JDBC:Statement问题
1.Statement问题 2.解决办法:通过PreparedStatement代替 实践: package com.dgd.test; import java.io.FileInputStrea ...
- 全网最新的nacos 2.1.0集群多节点部署教程
原文链接:全网最新的nacos 2.1.0集群多节点部署教程-语雀 基本信息 进度整理中 版本 2.1.0 版本发布日期 2022-04-29 git revision number b5845313 ...
- APISpace 空号检测API接口 免费好用
空号检测也称空号在线过滤,在线筛号,号码在线清洗.空号检测平台借助第五代大数据空号检测系统,为用户提供高精准的空号检测.号码过滤.号码筛选.号码清洗等众多号码检测功能,让用户快速准确的检测出活跃号.空 ...
- 构建 API 的7个建议【翻译】
迄今为止,越来越多的企业依靠API来为客户提供服务,以确保竞争的优势和业务可见性.出现这个情况的原因是微服务和无服务器架构正变得越来越普遍,API作为其中的关键节点,继承和承载了更多业务. 在这个前提 ...
- 2022-07-11 第六组 润土 JavaScript01学习笔记
1.JS的数据类型: 数字 字符串 布尔型 空(null) unefined(未定义) 2.定义变量 var let(不可重复) const(常量不可更改) 3.复杂的数据类型: 数组:一个变量对应多 ...
- GitHub中Fork来的仓库如何进行双向更新
一.做点贡献 想对别人的某个仓库"做点贡献"怎么办? 1. Fork该仓库 首先Fork该仓库,本文以git-learn这个仓库为例 这样自己的账号下就会出现这样一个仓库 2. C ...