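KingbaseES R6: Building a Cluster with Scripts (Case Study)
Case description:
KingbaseES V8R6 is usually deployed quickly with the graphical tool, but some production servers run without a graphical environment, so the KingbaseES V8R6 cluster has to be deployed manually from the command line. This document records the command-line deployment steps used in such a production environment.
1) This case was carried out on general-purpose machines; for special-purpose machines it can serve as a reference.
2) The KingbaseES R6 cluster edition software package must be installed first.
3) This case is mainly intended for situations where the system cannot provide a graphical deployment, or where the graphical deployment fails.
4) On general-purpose machines, almost all operations are performed by the kingbase user.
5) Before deploying the R6 cluster with the one-click script, prepare the system environment: SSH trust relationships, firewall, SELinux configuration, process resource limits, user creation, IP allocation, and so on.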
1. System environment
1.1 Cluster architecture
1.2 Database version
KingbaseES_V008R006C003B0062_Aarch64
1.3 CPU architecture (Kunpeng 920)
[root@ECOLABAPP37 ~]# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
NUMA node(s): 8
Vendor ID: HiSilicon
Model: 0
Model name: Kunpeng-920
Stepping: 0x1
CPU max MHz: 3000.0000
CPU min MHz: 200.0000
......
1.4 Memory
[root@ECOLABAPP37 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:         522103       18400      501575          63        2127      501458
Swap:         65535           0       65535
1.5 Network interface
nm-bond: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 10.248.52.* netmask 255.255.240.0 broadcast 10.248.63.255
inet6 fe80::1728:3b0b:9694:6c2c prefixlen 64 scopeid 0x20<link>
ether 00:07:45:c2:d1:20 txqueuelen 1000 (Ethernet)
RX packets 83667032 bytes 5305257118 (4.9 GiB)
RX errors 0 dropped 16629 overruns 0 frame 0
TX packets 513509 bytes 44561399 (42.4 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
1.6 Kernel
[root@ECOLABAPP37 ~]# uname -a
Linux ECOLABAPP37 4.19.90-2003.4.0.0036.oe1.aarch64 #1 SMP Mon Mar 23 19:06:43 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
2. Configure the system environment (all nodes)
2.1 Create the kingbase user
[root@ECOLABAPP37 ~]# id kingbase
uid=1002(kingbase) gid=1002(kingbase) groups=1002(kingbase)
2.2 Stop and disable the host firewall
[root@ECOLABAPP37 Scripts]# systemctl stop firewalld
[root@ECOLABAPP37 Scripts]# systemctl disable firewalld
[root@ECOLABAPP38 ~]# systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
……
2.3 Configure SELinux
[kingbase@node3 ~]$ cat /etc/sysconfig/selinux |grep -v ^#|grep -v ^$
SELINUXTYPE=targeted
SELINUX=disabled
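The checks in sections 2.2 and 2.3 can be scripted so that they are easy to repeat on every node. The following is a minimal sketch and not part of the KingbaseES tooling: `selinux_mode` is a hypothetical helper that reads the `SELINUX=` value from a config file (defaulting to the path used above), and the firewall state is queried with `systemctl is-active`.

```shell
#!/bin/bash
# Hypothetical helper: print the SELINUX= mode found in a config file.
selinux_mode() {
    local conf="${1:-/etc/sysconfig/selinux}"
    # Comment lines start with '#', so they never match; SELINUXTYPE= does not
    # match either, because '=' must follow SELINUX directly.
    awk -F= '/^[[:space:]]*SELINUX=/ { v = $2 }
             END { gsub(/[[:space:]]/, "", v); print v }' "$conf"
}

# Run the checks only when called with "check", so sourcing has no side effects.
if [ "${1:-}" = "check" ]; then
    echo "selinux:   $(selinux_mode)"
    # "inactive" is the expected state once firewalld has been stopped.
    echo "firewalld: $(systemctl is-active firewalld 2>/dev/null)"
fi
```

Saved under a name of your choice and run with the argument `check` on each node, the script should report `disabled` and `inactive` when the node is configured as described above.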
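3. Build the cluster with the scripts
3.1 Prepare the deployment environment
As the kingbase user, create a directory under the home directory:
[kingbase@ECOLABAPP37 ~]$ mkdir R6_install
Place the deployment scripts, the configuration file, and the database license.dat file in this directory.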
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx
1) View and edit the cluster configuration file (adjust it to your environment)
[kingbase@node3 ~]$ cat install.conf |grep -v ^#|grep -v ^$
on_bmj=0
all_ip=(10.248.52.165 10.248.52.166)
install_dir="/home/kingbase/cluster"
zip_package="/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip"
license_file=(license.dat)
db_user="system" # the user name of database
db_password="123456" # the password of database
db_port="54321" # the port of database, defaults is 54321
db_mode="oracle" # database mode: pg, oracle
db_auth="scram-sha-256" # database authority: scram-sha-256, md5, default is scram-sha-256
trusted_servers="10.248.48.1"
virtual_ip="10.248.52.174/20"
net_device=(nm-bond nm-bond)
ipaddr_path="/sbin"
arping_path="/usr/sbin"
ping_path="/bin"
super_user="root"
execute_user="kingbase"
reconnect_attempts="6" # the number of retries in the event of an error
reconnect_interval="10" # retry interval
recovery="manual" # the way of cluster recovery: automatic/manual
ssh_port="22" # the port of ssh, default is 22
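Before running the deployment, it can help to sanity-check the edited file. The sketch below assumes that install.conf is plain bash syntax, as the excerpt above suggests, and re-creates two of the consistency rules that the deployment script reports in its `[CONFIG_CHECK]` messages (the number of `net_device` and `license_file` entries must equal the number of `all_ip` entries, or be 1). `check_install_conf` is a hypothetical helper, not the vendor's implementation. Note that sourcing a file executes it, so only use this on a config file you trust.

```shell
#!/bin/bash
# Hypothetical pre-flight check for install.conf (not the vendor's own logic).
check_install_conf() {
    local conf="$1"
    (
        # Source in a subshell so the settings do not leak into the caller.
        source "$conf" || exit 2
        n_ip=${#all_ip[@]}
        n_dev=${#net_device[@]}
        n_lic=${#license_file[@]}
        [ "$n_ip" -ge 1 ] || { echo "all_ip is empty" >&2; exit 1; }
        [ "$n_dev" -eq "$n_ip" ] || [ "$n_dev" -eq 1 ] || {
            echo "net_device has $n_dev entries, expected 1 or $n_ip" >&2; exit 1; }
        [ "$n_lic" -eq "$n_ip" ] || [ "$n_lic" -eq 1 ] || {
            echo "license_file has $n_lic entries, expected 1 or $n_ip" >&2; exit 1; }
        echo "install.conf looks consistent: $n_ip node(s)"
    )
}

# Only run when a config path is given on the command line.
if [ $# -gt 0 ]; then
    check_install_conf "$1"
fi
```

A non-zero exit status means one of the array lengths is inconsistent; the deployment script performs its own checks as well, so this is only an early warning.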
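2) Configure SSH trust between the hosts (this can be done manually or with the script below; manual configuration is recommended)
Note: trust must be configured between the kingbase users, between the root users, and between the kingbase and root users. After configuring it, verify the trust relationships.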
View the script (excerpt):
[kingbase@ECOLABAPP37 R6_install]$ cat trust_cluster.sh
#!/bin/bash
# you should change two parameters: general_user and all_ip
# general_user is the general user which you want to config SSH password free
# all_ip is the devices that you want to config SSH password free
shell_folder=$(dirname $(readlink -f "$0"))
install_conf="${shell_folder}/install.conf"
primary_host=""
curren_user=`whoami`
......
for ips in ${all_ip[@]}
do
ssh -p ${ssh_port} root@$ips "cp -r /root/.ssh /home/$general_user/"
ssh -p ${ssh_port} root@$ips "chmod 700 /home/$general_user/.ssh/"
ssh -p ${ssh_port} root@$ips "chown -R $general_user:$general_user /home/$general_user/.ssh/"
done
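However the trust is set up, it is worth verifying it before deployment. The following is a minimal sketch, assuming the two accounts are named `kingbase` and `root` as in this case; `check_ssh_trust` is a hypothetical helper. It relies on the standard OpenSSH options `BatchMode=yes` (never prompt for a password) and `ConnectTimeout`.

```shell
#!/bin/bash
# Hypothetical helper: verify password-free SSH from the current user
# to kingbase@host and root@host for every host given.
# Usage: check_ssh_trust <ssh_port> <host> [<host> ...]
check_ssh_trust() {
    local port="$1"; shift
    local users=(kingbase root) host user rc=0
    for host in "$@"; do
        for user in "${users[@]}"; do
            # BatchMode makes ssh fail instead of asking for a password.
            if ssh -p "$port" -o BatchMode=yes -o ConnectTimeout=5 \
                   "$user@$host" true 2>/dev/null; then
                echo "OK   $user@$host"
            else
                echo "FAIL $user@$host"
                rc=1
            fi
        done
    done
    return $rc
}

# Only run when hosts are given on the command line.
if [ $# -ge 2 ]; then
    check_ssh_trust "$@"
fi
```

Because the function only tests connections from the user who runs it, run it once as kingbase and once as root on each node to cover all the combinations listed in the note above.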
3) View the cluster deployment script (excerpt)
[kingbase@ECOLABAPP37 R6_install]$ cat V8R6_cluster_install.sh
#!/bin/bash
shell_folder=$(dirname $(readlink -f "$0"))
install_conf=""
#all_ip=(192.168.28.10 192.168.28.11)
all_ip=()
#install_dir="/home/kingbase/tmp_kingbase"
install_dir=""
#zip_package="${shell_folder}/db.zip"
zip_package=""
#license_path="${shell_folder}"
license_path="${shell_folder}"
#BMJ Kingbase install path
soft_dir="/opt/Kingbase/ES/V8/Server"
......
# start up the cluster
echo "[INSTALL] start up the whole cluster ..."
execute_command ${execute_user} ${all_ip[-2]} "${sys_bindir}/sys_monitor.sh start"
[ $? -ne 0 ] && exit 1
echo "[INSTALL] start up the whole cluster ... OK"
}
main
exit 0
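3.2 Run the deployment script
Note: the license.dat file must also be placed in the current directory; if license.dat is missing, the deployment fails.
Directory that holds the files for the manual cluster deployment: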
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-rw-r--r-- 1 kingbase kingbase 2.9K Apr 19 17:20 license.dat
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx
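Run the deployment script:
Use the output log to diagnose any failure during deployment. Reading the complete log, together with experience of the graphical deployment tool, gives a deeper understanding of how a repmgr cluster deployment works.
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh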
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/home/kingbase/cluster/kingbase/bin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /home/kingbase/cluster/kingbase/bin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] success to access license_file: /home/kingbase/R6_install/license.dat
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start
[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
[INSTALL] start up the database on "10.248.52.*" ... OK
[INSTALL] create the database "esrep" and user "esrep" for repmgr ...
CREATE DATABASE
CREATE ROLE
[INSTALL] create the database "esrep" and user "esrep" for repmgr ... OK
[INSTALL] register the primary on "10.248.52.*" ...
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: PING 10.248.52.* (10.248.52.*) 56(84) bytes of data.
--- 10.248.52.* ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1005ms
WARNING: ping host"10.248.52.*" failed
DETAIL: average RTT value is not greater than zero
INFO: loadvip result: 1, arping result: 1
NOTICE: node (ID: 1) acquire the virtual ip 10.248.52.* success
NOTICE: primary node record (ID: 1) registered
[INSTALL] register the primary on "10.248.52.*" ... OK
[INSTALL] clone and start up the standby ...
clone the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/repmgr -h 10.248.52.* -U esrep -d esrep -p 54321 standby clone
NOTICE: destination directory "/home/kingbase/cluster/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=10.248.52.* user=esrep port=54321 dbname=esrep
DETAIL: current installation size is 64 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/home/kingbase/cluster/kingbase/data"...
NOTICE: starting backup (using sys_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
/home/kingbase/cluster/kingbase/bin/sys_basebackup -l "repmgr base backup" -D /home/kingbase/cluster/kingbase/data -h 10.248.52.* -p 54321 -U esrep -X stream -S repmgr_slot_2
NOTICE: standby clone (using sys_basebackup) complete
NOTICE: you can now start your Kingbase server
HINT: for example: sys_ctl -D /home/kingbase/cluster/kingbase/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
clone the standby on "10.248.52.*" ... OK
start up the standby on "10.248.52.*" ...
/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... done
server started
start up the standby on "10.248.52.*" ... OK
register the standby on "10.248.52.*" ...
INFO: connecting to local node "node2" (ID: 2)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
INFO: standby registration complete
NOTICE: standby node "node2" (ID: 2) successfully registered
[INSTALL] register the standby on "10.248.52.*" ... OK
[INSTALL] start up the whole cluster ...
2021-04-19 17:31:58 Ready to start all DB ...
2021-04-19 17:31:58 begin to start DB on "[10.248.52.*]".
2021-04-19 17:31:59 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:00 DB on "[10.248.52.*]" start success.
2021-04-19 17:32:00 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:02 Try to ping trusted_servers on host 10.248.52.* ...
2021-04-19 17:32:05 begin to start DB on "[10.248.52.*]".
2021-04-19 17:32:05 DB on "[10.248.52.*]" already started, connect to check it.
2021-04-19 17:32:06 DB on "[10.248.52.*]" start success.
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2021-04-19 17:32:06 The primary DB is started.
2021-04-19 17:32:12 Success to load virtual ip [10.248.52.*] on primary host [10.248.52.*].
2021-04-19 17:32:12 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:14 Try to ping vip on host 10.248.52.* ...
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 17:32:17] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 17:32:17] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"
2021-04-19 17:32:17 repmgrd on "[10.248.52.*]" start success.
2021-04-19 17:32:17 begin to start repmgrd on "[10.248.52.*]".
[2021-04-19 15:15:45] [NOTICE] using provided configuration file "/home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf"
[2021-04-19 15:15:45] [NOTICE] redirecting logging output to "/home/kingbase/cluster/kingbase/hamgr.log"
2021-04-19 17:32:18 repmgrd on "[10.248.52.*]" start success.
 ID | Name  | Role    | Status    | Upstream | repmgrd | PID   | Paused? | Upstream last seen
----+-------+---------+-----------+----------+---------+-------+---------+--------------------
 1  | node1 | primary | * running |          | running | 62956 | no      | n/a
 2  | node2 | standby |   running | node1    | running | 25769 | no      | 0 second(s) ago
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
/home/kingbase/cluster/kingbase/bin/../etc/all_nodes_tools.conf does not exist
2021-04-19 17:32:22 Done.
[INSTALL] start up the whole cluster ... OK
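=== The output above shows that the manual cluster deployment succeeded ===
4. Check the cluster status after deployment
4.1 Check the database service status (primary)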
[kingbase@ECOLABAPP37 ~]$ ps -ef |grep kingbase
kingbase 62335 1 0 17:31 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/kingbase -D /home/kingbase/cluster/kingbase/data
kingbase 62336 62335 0 17:31 ? 00:00:00 kingbase: logger
kingbase 62338 62335 0 17:31 ? 00:00:00 kingbase: checkpointer
kingbase 62339 62335 0 17:31 ? 00:00:00 kingbase: background writer
kingbase 62340 62335 0 17:31 ? 00:00:00 kingbase: walwriter
kingbase 62341 62335 0 17:31 ? 00:00:00 kingbase: autovacuum launcher
kingbase 62342 62335 0 17:31 ? 00:00:00 kingbase: archiver last was 000000010000000000000002.00000028.backup
kingbase 62343 62335 0 17:31 ? 00:00:00 kingbase: stats collector
kingbase 62344 62335 0 17:31 ? 00:00:00 kingbase: ksh writer
kingbase 62345 62335 0 17:31 ? 00:00:00 kingbase: ksh collector
kingbase 62346 62335 0 17:31 ? 00:00:00 kingbase: sys_kwr collector
kingbase 62347 62335 0 17:31 ? 00:00:00 kingbase: logical replication launcher
kingbase 62426 62335 0 17:31 ? 00:00:00 kingbase: walsender esrep 10.248.52.*(52926) streaming 0/300B810
kingbase 62954 62335 0 17:32 ? 00:00:00 kingbase: esrep esrep 10.248.52.*(47290) idle
kingbase 62956 1 0 17:32 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/repmgrd -d -v -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase 62966 62335 0 17:32 ? 00:00:00 kingbase: esrep esrep 10.248.52.*(52934) idle
kingbase 63178 1 0 17:32 ? 00:00:00 /home/kingbase/cluster/kingbase/bin/kbha -A daemon -f /home/kingbase/cluster/kingbase/bin/../etc/repmgr.conf
kingbase 63822 63178 0 17:35 ? 00:00:00 ping -q -c3 -w2 10.248.48.*
4.2 Primary/standby streaming replication status
[kingbase@ECOLABAPP37 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time
-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 62426 |    16385 | esrep   | node2            | 10.248.52.* |                 |       52926 | 2021-04-19 17:31:57.986053+08 |              | streaming | 0/300B810 | 0/300B810 | 0/300B810 | 0/300B810  |           |           |            |             1 | quorum     | 2021-04-19 15:19:35.941223+08
(1 row)
4.3 Check the cluster node status
[kingbase@ECOLABAPP37 ~]$ repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | user=esrep dbname=esrep port=54321 host=10.248.52.* connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
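For a quick scripted health check, the table printed by `repmgr cluster show` can be summarized. The following is a rough sketch under the assumption that the output keeps the column layout shown above (Role in the third column, Status in the fourth, one node per line); `count_cluster_roles` is a hypothetical helper, not a repmgr feature.

```shell
#!/bin/bash
# Hypothetical helper: count running primary and standby nodes in the
# table that "repmgr cluster show" prints. Reads the table from stdin.
count_cluster_roles() {
    awk -F'|' '
        $3 ~ /primary/ && $4 ~ /running/ { p++ }
        $3 ~ /standby/ && $4 ~ /running/ { s++ }
        END { printf "primary=%d standby=%d\n", p, s }'
}

# Example (run as the kingbase user on a cluster node):
#   repmgr cluster show | count_cluster_roles
```

For the two-node cluster in this case, a healthy state corresponds to one running primary and one running standby.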
4.4 Test primary/standby streaming replication
DDL and DML operations on the primary:
test=# create database prod;
CREATE DATABASE
test=# \c prod;
You are now connected to database "prod" as user "system".
prod=# create table t1 (id int);
CREATE TABLE
prod=# insert into t1 values (10),(20),(30);
INSERT 0 3
prod=# select * from t1;
id
----
10
20
30
(3 rows)
Check the replicated data on the standby:
[kingbase@ECOLABAPP38 ~]$ ksql -U system test
ksql (V8.0)
Type "help" for help.
test=# \c prod
You are now connected to database "prod" as user "system".
prod=# select * from t1;
id
----
10
20
30
(3 rows)
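5. Deployment failure case
Symptom:
The license.dat file had not been placed in the directory containing the deployment script, so the script failed because it could not access license.dat. After license.dat was copied into that directory, the deployment succeeded.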
[kingbase@ECOLABAPP37 ~]$ cd R6_install/
[kingbase@ECOLABAPP37 R6_install]$ ls -lh
total 80K
-rw------- 1 kingbase kingbase 5.0K Apr 19 17:28 install.conf
-r-xr-xr-x 1 kingbase kingbase 2.1K Apr 19 16:57 trust_cluster.sh
-rw------- 1 kingbase kingbase 32K Apr 19 16:57 V8R6_cluster_install.sh
-rw------- 1 kingbase kingbase 31K Apr 19 16:57 V8R6一键部署集群脚本操作手册.docx
[kingbase@ECOLABAPP37 R6_install]$ sh V8R6_cluster_install.sh
[CONFIG_CHECK] file format is correct ... OK
[CONFIG_CHECK] check if the virtual ip "10.248.52.*" already exist ...
[CONFIG_CHECK] there is no "10.248.52.*" on any host, OK
[CONFIG_CHECK] the number of net_device matches the length of all_ip or the number of net_device is 1 ... OK
[CONFIG_CHECK] the number of license_num matches the length of all_ip or the number of license_num is 1 ... OK
[RUNNING] check if the host can be reached ...
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] success connect to the target "10.248.52.*" ..... OK
[RUNNING] check the db is running or not...
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] the db is not running on "10.248.52.*:54321" ..... OK
[RUNNING] check if the install dir is already exist ...
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[RUNNING] the install dir is not exist on "10.248.52.*" ..... OK
[INSTALL] create the install dir "/home/kingbase/cluster/kingbase" on every host ...
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] success to create the install dir "/home/kingbase/cluster/kingbase" on "10.248.52.*" ..... OK
[INSTALL] decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase"
[INSTALL] success to decompress the "/opt/Kingbase/ES/V8/DeployTools/zip/Aarch64/db.zip" to "/home/kingbase/cluster/kingbase" on "10.248.52.*"..... OK
[INSTALL] create the dir "/home/kingbase/cluster/kingbase/etc" on all host
[INSTALL] scp the dir "/home/kingbase/cluster/kingbase" to other host
[INSTALL] try to copy the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" .....
[INSTALL] success to scp the install dir "/home/kingbase/cluster/kingbase" to "10.248.52.*" ..... OK
[RUNNING] chmod u+s for "/sbin" and "/usr/sbin"
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /sbin/ip on "10.248.52.*" ..... OK
[RUNNING] chmod u+s /usr/sbin/arping on "10.248.52.*" ..... OK
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] check license_file "license.dat"
[INSTALL] Copy license to /home/kingbase/cluster/kingbase/../: license.dat
[INSTALL] success to copy /home/kingbase/R6_install/license.dat to /home/kingbase/cluster/kingbase/../ on 10.248.52.*
[INSTALL] begin to init the database on "10.248.52.*" ...
The files belonging to this database system will be owned by user "kingbase".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/kingbase/cluster/kingbase/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Shanghai
creating configuration files ... ok
Begin setup encrypt device
initializing the encrypt device ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
create security database ... ok
load security database ... ok
create initial audit rules ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/home/kingbase/cluster/kingbase/bin/sys_ctl -D /home/kingbase/cluster/kingbase/data -l logfile start
[INSTALL] end to init the database on "10.248.52.*" ... OK
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ...
[INSTALL] wirte the kingbase.conf on "10.248.52.*" ... OK
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ...
[INSTALL] wirte the es_rep.conf on "10.248.52.*" ... OK
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ...
[INSTALL] wirte the sys_hba.conf on "10.248.52.*" ... OK
[INSTALL] wirte the .encpwd on every host
[INSTALL] write the repmgr.conf on every host
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] write the repmgr.conf on "10.248.52.*" ...
[INSTALL] write the repmgr.conf on "10.248.52.*" ... OK
[INSTALL] start up the database on "10.248.52.*" ...
[INSTALL] /home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingbase/data start
waiting for server to start.... stopped waiting
sys_ctl: could not start server
Examine the log output.
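= Note: the license.dat file must also be placed in the current directory; the error above was caused by the missing license.dat =
When troubleshooting, you can run the following command manually and then check the error log:
/home/kingbase/cluster/kingbase/bin/sys_ctl -w -t 60 -l /home/kingbase/cluster/kingbase/logfile -D /home/kingbase/cluster/kingba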
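After a failed start, the file passed to `sys_ctl` with `-l` (here `/home/kingbase/cluster/kingbase/logfile`) is the first place to look. The sketch below is only a convenience wrapper around `tail` and `grep`; `show_startup_errors` is a hypothetical helper, and the search words (`fatal`, `error`, `license`) are assumptions about what the server writes, not known KingbaseES message formats.

```shell
#!/bin/bash
# Hypothetical helper: show suspicious lines from the end of a server log.
# Usage: show_startup_errors <logfile> [<number of trailing lines>]
show_startup_errors() {
    local logfile="$1" lines="${2:-50}"
    # The patterns are guesses; widen or replace them as needed.
    tail -n "$lines" "$logfile" | grep -iE 'fatal|error|license'
}

# Only run when a log path is given on the command line.
if [ $# -gt 0 ]; then
    show_startup_errors "$@"
fi
```

If nothing matches, read the tail of the log directly, since the relevant message may use different wording.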