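MHA (Master High Availability) is an open-source suite of Perl scripts that provides high availability for MySQL. It adds automated master failover to a MySQL master/slave replication setup. When MHA detects that the master has failed, it promotes the slave holding the most recent data to be the new master; during this process it collects additional information from the other slaves to avoid consistency problems. MHA also supports online master switching, that is, switching the master/slave roles on demand.
Compared with other HA software, MHA focuses on keeping the master in MySQL replication highly available. Its key feature is that it can repair the log differences between multiple slaves so that all slaves end up with consistent data, then pick one of them as the new master and point the other slaves at it. A failover goes through the following steps:
(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that has the latest updates;
(3) Apply the differential relay log events to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Make the other slaves replicate from the new master.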
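Step (2) can be pictured with a small shell sketch. It assumes each slave's progress is summarized as one line of host, master binlog file and read position (the values one would read from Master_Log_File and Read_Master_Log_Pos in show slave status). The function name pick_latest_slave and the sample positions are hypothetical, and the sketch only illustrates the comparison; it is not MHA's actual implementation.

```shell
# pick_latest_slave: read "host binlog_file position" lines on stdin and
# print the host that has read furthest into the master's binary log.
pick_latest_slave() {
  # Order by binlog file name first, then numerically by position;
  # the last line after sorting belongs to the most up-to-date slave.
  LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{ print $1 }'
}

# Hypothetical sample data for the two slaves in this setup.
printf '%s\n' \
  '172.16.40.201 mysql-bin.000007 45012' \
  '172.16.40.202 mysql-bin.000007 38990' | pick_latest_slave
```

The position is compared numerically on purpose: a plain text sort would rank 120 before 5.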
I. Environment preparation
Host | Role | Service | Port | MHA role
172.16.40.201 | slave | mysql-5.7.25 (Percona Server) | 7066 | node
172.16.40.202 | slave (master-b) | mysql-5.7.25 (Percona Server) | 7066 | manager/node
172.16.40.203 | master | mysql-5.7.25 (Percona Server) | 7066 | node
II. Install MySQL on all three servers
1. Version: Percona-Server-5.7.25-28-Linux.x86_64.ssl101.tar.gz
MySQL installation: omitted
2. Configure master/slave replication
[172.16.40.203 (master)]:
mysql> grant replication slave on *.* to 'repl'@'172.16.40.%' identified by 'replpasswod';
mysql> flush privileges;
[172.16.40.202 (slave, master-b)]:
mysql> CHANGE MASTER TO MASTER_HOST='172.16.40.203',MASTER_PORT=7066,MASTER_USER='repl',MASTER_PASSWORD='replpasswod',MASTER_AUTO_POSITION=1;
mysql> start slave;
mysql> show slave status\G
[172.16.40.201 (slave)]:
mysql> CHANGE MASTER TO MASTER_HOST='172.16.40.203',MASTER_PORT=7066,MASTER_USER='repl',MASTER_PASSWORD='replpasswod',MASTER_AUTO_POSITION=1;
mysql> start slave;
mysql> show slave status\G
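Reading the show slave status\G output by eye works, but the check is easy to script. The following is a minimal sketch, assuming the usual "Field: value" layout of that output; the function name replication_ok is hypothetical.

```shell
# replication_ok: read `show slave status\G` output on stdin and succeed
# only if both the IO thread and the SQL thread report "Yes".
replication_ok() {
  awk -F': ' '
    /Slave_IO_Running:/  { io  = $2 }
    /Slave_SQL_Running:/ { sql = $2 }
    END { exit (io == "Yes" && sql == "Yes") ? 0 : 1 }
  '
}

# Possible usage on a slave (socket path as used in this article):
#   mysql -S /tmp/7066.sock -p -e 'show slave status\G' | replication_ok \
#     && echo "replication OK" || echo "replication NOT running"
```

The patterns include the trailing colon so that the Slave_SQL_Running_State field is not mistaken for Slave_SQL_Running.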
3. Enable MySQL semi-synchronous replication
# mysql -S /tmp/7066.sock -p
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
mysql> SET GLOBAL rpl_semi_sync_master_enabled=1;
mysql> SET GLOBAL rpl_semi_sync_slave_enabled=1;
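# Restart MySQL (values set with SET GLOBAL are lost on restart, so also add them to my.cnf if they should persist)
# /etc/init.d/mysqld-7066 restart
# Confirm that semi-synchronous replication is enabled
mysql> show global variables like '%semi%';   (or: show global status like '%semi%';)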
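III. Set up MHA
1. Download the MHA packages
2. Configure passwordless SSH login between the servers
(Note: if the manager is installed on one of the MySQL servers, that server must also be able to log in to itself without a password; this is not needed when the manager runs on a dedicated server.)
[172.16.40.202 (manager)]: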
# ssh-keygen -t rsa
# cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
# ssh-copy-id root@172.16.40.203
# ssh-copy-id root@172.16.40.201
[172.16.40.203 (node)]:
# ssh-keygen -t rsa
# ssh-copy-id root@172.16.40.202
# ssh-copy-id root@172.16.40.201
[172.16.40.201 (node)]:
# ssh-keygen -t rsa
# ssh-copy-id root@172.16.40.202
# ssh-copy-id root@172.16.40.203
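The per-host commands above follow one pattern: each server copies its public key to every other server. A small dry-run sketch of that pattern is shown here; the HOSTS variable and the function name print_copy_cmds are hypothetical, and the function only prints the commands rather than running them.

```shell
# All hosts in this setup, taken from the environment table.
HOSTS="172.16.40.201 172.16.40.202 172.16.40.203"

# print_copy_cmds SELF: print one ssh-copy-id command for every host
# other than SELF. Nothing is executed; the output is only a checklist.
print_copy_cmds() {
  for h in $HOSTS; do
    if [ "$h" != "$1" ]; then
      echo "ssh-copy-id root@$h"
    fi
  done
}

# Commands to run on the manager host.
print_copy_cmds 172.16.40.202
```

Printing instead of executing keeps the password prompts of ssh-copy-id under the operator's control.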
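3. Install MHA
(Note: the node package must be installed on every MySQL server; the manager host needs it as well, since the manager package depends on it.)
(1) Upload mha4mysql-node-0.58.tar.gz to all servers and install it
# Installation requires perl, perl-DBD-MySQL and perl-devel; install them with yum
# yum install -y perl perl-DBD-MySQL perl-devel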
# tar zxvf mha4mysql-node-0.58.tar.gz
# cd mha4mysql-node-0.58
# perl Makefile.PL
# make && make install
(2) Upload mha4mysql-manager-0.58.tar.gz to the manager server and install it
# Install dependencies
# yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
# tar zxvf mha4mysql-manager-0.58.tar.gz
# cd mha4mysql-manager-0.58
# perl Makefile.PL
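# make && make install
Tool overview:
Manager tools:
- masterha_check_ssh : checks the SSH configuration for MHA.
- masterha_check_repl : checks MySQL replication.
- masterha_manager : starts MHA.
- masterha_check_status : checks the current running state of MHA.
- masterha_master_monitor : detects whether the master is down.
- masterha_master_switch : controls failover (automatic or manual).
- masterha_conf_host : adds or removes server entries in the configuration.
Node tools:
- save_binary_logs : saves and copies the master's binary logs.
- apply_diff_relay_logs : identifies differential relay log events and applies them to the other slaves.
- filter_mysqlbinlog : strips unnecessary ROLLBACK events (MHA no longer uses this tool).
- purge_relay_logs : purges relay logs (without blocking the SQL thread).
Troubleshooting: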
[root@localhost authors]# masterha_check_ssh
"NI_NUMERICHOST" is not exported by the Socket module
"getaddrinfo" is not exported by the Socket module
"getnameinfo" is not exported by the Socket module
Can't continue after import errors at /usr/local/share/perl5/MHA/NodeUtil.pm line 29
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/NodeUtil.pm line 29.
Compilation failed in require at /usr/local/share/perl5/MHA/SlaveUtil.pm line 28.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/SlaveUtil.pm line 28.
Compilation failed in require at /usr/local/share/perl5/MHA/DBHelper.pm line 26.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/DBHelper.pm line 26.
Compilation failed in require at /usr/local/share/perl5/MHA/HealthCheck.pm line 30.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/HealthCheck.pm line 30.
Compilation failed in require at /usr/local/share/perl5/MHA/Server.pm line 28.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/Server.pm line 28.
Compilation failed in require at /usr/local/share/perl5/MHA/Config.pm line 29.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/Config.pm line 29.
Compilation failed in require at /usr/local/share/perl5/MHA/SSHCheck.pm line 32.
BEGIN failed--compilation aborted at /usr/local/share/perl5/MHA/SSHCheck.pm line 32.
Compilation failed in require at /usr/local/bin/masterha_check_ssh line 25.
BEGIN failed--compilation aborted at /usr/local/bin/masterha_check_ssh line 25.
Install the missing dependencies with cpan:
cpan[1]> install ExtUtils::Constant
cpan[1]> install Socket
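Tip: if the server has no Internet access, download the dependency packages manually from the URLs shown in the cpan messages, place them in the corresponding directory, and then run the install command again.
Problem solved: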
[root@localhost authors]# masterha_check_ssh --help
Usage:
masterha_check_ssh --global_conf=/etc/masterha_default.cnf
--conf=/etc/conf/masterha/app1.cnf
See online reference
(http://code.google.com/p/mysql-master-ha/wiki/Requirements#SSH_public_key_authentication) for details.
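4. Configure MHA
(1) Create the working directories on [172.16.40.202 (manager)]
# mkdir -p /home/mysql/app/mha/masterha/conf /home/mysql/app/mha/masterha/logs
(2) Copy the sample configuration file to /home/mysql/app/mha/masterha/conf/app1.cnf and edit it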
[server default]
manager_workdir=/home/mysql/app/mha/masterha
manager_log=/home/mysql/app/mha/masterha/logs/manager.log
master_binlog_dir=/home/mysql/app/mha/7066/logs/binlog
# password and user below are the MySQL account that MHA uses for monitoring
password=romysqladmint
user=root
ping_interval=1
remote_workdir=/opt/TMHA2/mha4mysql-node-master
repl_password=replpasswod
repl_user=repl
ssh_user=root
shutdown_script=""
log_level=debug
#master node
[server1]
hostname=172.16.40.203
port=7066
ssh_port=22
#slave node
[server2]
hostname=172.16.40.202
port=7066
ssh_port=22
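# candidate_master=1 marks this host as a candidate master: when it is set, a failover promotes this slave to master even if it is not the slave with the latest events in the cluster
#candidate_master=1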
#slave node
[server3]
hostname=172.16.40.201
port=7066
ssh_port=22
# Grant the monitoring user its privileges in the database
(3) Manager status checks:
[root@fuzhou202 conf]# masterha_check_ssh --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
Sun Mar 24 19:30:26 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Mar 24 19:30:26 2019 - [info] Reading application default configuration from /home/mysql/app/mha/masterha/conf/app1.cnf..
Sun Mar 24 19:30:26 2019 - [info] Reading server configuration from /home/mysql/app/mha/masterha/conf/app1.cnf..
Sun Mar 24 19:30:26 2019 - [info] Starting SSH connection tests..
Sun Mar 24 19:30:26 2019 - [debug]
Sun Mar 24 19:30:26 2019 - [debug] Connecting via SSH from root@172.16.40.202(172.16.40.202:22) to root@172.16.40.203(172.16.40.203:22)..
Sun Mar 24 19:30:26 2019 - [debug] ok.
Sun Mar 24 19:30:26 2019 - [debug] Connecting via SSH from root@172.16.40.202(172.16.40.202:22) to root@172.16.40.201(172.16.40.201:22)..
Sun Mar 24 19:30:26 2019 - [debug] ok.
Sun Mar 24 19:30:27 2019 - [debug]
Sun Mar 24 19:30:26 2019 - [debug] Connecting via SSH from root@172.16.40.203(172.16.40.203:22) to root@172.16.40.202(172.16.40.202:22)..
Sun Mar 24 19:30:26 2019 - [debug] ok.
Sun Mar 24 19:30:26 2019 - [debug] Connecting via SSH from root@172.16.40.203(172.16.40.203:22) to root@172.16.40.201(172.16.40.201:22)..
Sun Mar 24 19:30:26 2019 - [debug] ok.
Sun Mar 24 19:30:27 2019 - [debug]
Sun Mar 24 19:30:27 2019 - [debug] Connecting via SSH from root@172.16.40.201(172.16.40.201:22) to root@172.16.40.202(172.16.40.202:22)..
Sun Mar 24 19:30:27 2019 - [debug] ok.
Sun Mar 24 19:30:27 2019 - [debug] Connecting via SSH from root@172.16.40.201(172.16.40.201:22) to root@172.16.40.203(172.16.40.203:22)..
Sun Mar 24 19:30:27 2019 - [debug] ok.
Sun Mar 24 19:30:27 2019 - [info] All SSH connection tests passed successfully.
---------
# masterha_check_repl --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
# masterha_check_status --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
Once all the checks above pass, start manager monitoring.
(4) Start the manager monitoring service
# Start the manager
# nohup masterha_manager --conf=/home/mysql/app/mha/masterha/conf/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mysql/app/mha/masterha/logs/manager.log 2>&1 &
# Check the status
[root@fuzhou202 logs]# masterha_check_status --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
app1 (pid:9163) is running(0:PING_OK), master:172.16.40.203
# Stop the manager
# masterha_stop --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
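The status line printed by masterha_check_status can also be consumed by a script, for example to record which host is currently the master. A minimal sketch follows, assuming the output format shown above; the function name current_master is hypothetical.

```shell
# current_master: read masterha_check_status output on stdin and print the
# master address if the manager reports PING_OK; print nothing otherwise.
current_master() {
  sed -n 's/^.* is running(0:PING_OK), master:\(.*\)$/\1/p'
}

# Sample line copied from the status check above.
echo 'app1 (pid:9163) is running(0:PING_OK), master:172.16.40.203' | current_master
```

An empty result means the manager is not running or is not in the PING_OK state, which a monitoring job could treat as an alert condition.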
(5) Manage the VIP with a script
# Manually bind the VIP on the master node [172.16.40.203 (master)]
# ifconfig eth0:1 172.16.40.99/24
# Add the following parameter to the configuration file
# vim /home/mysql/app/mha/masterha/conf/app1.cnf
master_ip_failover_script=/usr/local/bin/master_ip_failover
# Create the Perl master_ip_failover script with the following content
# vim /usr/local/bin/master_ip_failover
---------------------------------------------------
#!/usr/bin/env perl
# master_ip_failover: invoked by MHA to move the VIP between masters.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

# VIP and interface alias. The interface (eth1 here) must be the NIC that
# carries the VIP on each host; the manual bind above used eth0.
my $vip           = '172.16.40.99/24';
my $key           = '1';
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig eth1:$key down";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        # The old master is being taken out: remove the VIP from it.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # A new master has been chosen: bring the VIP up on it.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        # Called by masterha_check_repl and at manager startup.
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# Bring the VIP up on the new master over SSH.
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# Take the VIP down on the old master over SSH (skipped without an SSH user).
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
--------------------------------------
# chmod +x /usr/local/bin/master_ip_failover
Verify the automatic master-failover script
# masterha_check_repl --conf=/home/mysql/app/mha/masterha/conf/app1.cnf
...
Sun Mar 24 20:17:11 2019 - [info] Checking master_ip_failover_script status:
Sun Mar 24 20:17:11 2019 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.16.40.203 --orig_master_ip=172.16.40.203 --orig_master_port=7066
IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 172.16.40.99/24===
IV. Testing
1. Automatic master failover
Generate test data with sysbench
# Generate data on the master
# sysbench --test=oltp --oltp-table-size=1000000 --oltp-read-only=off --init-rng=on --num-threads=4 --max-requests=0 --oltp-dist-type=uniform --max-time=1800 --mysql-user=root --mysql-socket=/tmp/7066.sock --mysql-password=mysqladmin --db-driver=mysql --mysql-table-engine=innodb --oltp-test-mode=complex prepare
# Stop the slave io_thread on one MySQL slave to simulate replication lag
mysql> stop slave io_thread;
# sysbench --test=oltp --oltp-table-size=1000000 --oltp-read-only=off --init-rng=on --num-threads=4 --max-requests=0 --oltp-dist-type=uniform --max-time=180 --mysql-user=root --mysql-socket=/tmp/7066.sock --mysql-password=mysqladmin --db-driver=mysql --mysql-table-engine=innodb --oltp-test-mode=complex run
# Kill MySQL on the master
# pkill -9 mysqld
# Watch the manager log
...
Mon Mar 25 11:07:42 2019 - [info] 172.16.40.201: Resetting slave info succeeded.
Mon Mar 25 11:07:42 2019 - [info] Master failover to 172.16.40.201(172.16.40.201:7066) completed successfully.
Mon Mar 25 11:07:42 2019 - [info] Deleted server1 entry from /home/mysql/app/mha/masterha/conf/app1.cnf .
Mon Mar 25 11:07:42 2019 - [debug] Disconnected from 172.16.40.202(172.16.40.202:7066)
Mon Mar 25 11:07:42 2019 - [debug] Disconnected from 172.16.40.201(172.16.40.201:7066)
Mon Mar 25 11:07:42 2019 - [info]
----- Failover Report -----
app1: MySQL Master failover 172.16.40.203(172.16.40.203:7066) to 172.16.40.201(172.16.40.201:7066) succeeded
Master 172.16.40.203(172.16.40.203:7066) is down!
Check MHA Manager logs at fuzhou202:/home/mysql/app/mha/masterha/logs/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.40.203(172.16.40.203:7066)
Selected 172.16.40.201(172.16.40.201:7066) as a new master.
172.16.40.201(172.16.40.201:7066): OK: Applying all logs succeeded.
172.16.40.201(172.16.40.201:7066): OK: Activated master IP address.
172.16.40.202(172.16.40.202:7066): OK: Slave started, replicating from 172.16.40.201(172.16.40.201:7066)
172.16.40.201(172.16.40.201:7066): Resetting slave info succeeded.
Master failover to 172.16.40.201(172.16.40.201:7066) completed successfully.
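# The automatic switch identified the most up-to-date slave, promoted it to master, and updated the configuration successfully
# The VIP has also moved to the other server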
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:b7:29:df brd ff:ff:ff:ff:ff:ff
inet 172.16.40.201/24 brd 172.16.40.255 scope global eth0
inet 172.16.40.99/24 brd 172.16.40.255 scope global secondary eth0:1
inet6 fe80::250:56ff:feb7:29df/64 scope link
valid_lft forever preferred_lft forever
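Note:
After an automatic master failover the manager monitoring process stops by itself, and because it was started with --remove_dead_master_conf, the dead master's entry is removed from the configuration file.
A recovered MySQL node therefore needs the following steps to rejoin MHA manually:
1. Add the node's entry back to the configuration file /home/mysql/app/mha/masterha/conf/app1.cnf
2. Reconfigure replication on the recovered MySQL node, with MASTER_HOST set to the IP of the new master after the switch (172.16.40.201 in the test above):
mysql> CHANGE MASTER TO MASTER_HOST='172.16.40.201', MASTER_PORT=7066, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='replpasswod';
mysql> start slave;
mysql> show slave status\G
3. Restart the manager monitoring process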
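The first two rejoin steps can be prepared with a small helper. The following is a sketch only, assuming the failover from the test above (172.16.40.201 is the new master, 172.16.40.203 is the recovered node and was the server1 entry that MHA deleted); the function names server_section and change_master_sql are hypothetical. Both functions just print text for review and change nothing.

```shell
# server_section NAME HOST PORT: print a config block like the ones in app1.cnf.
server_section() {
  printf '[%s]\nhostname=%s\nport=%s\nssh_port=22\n' "$1" "$2" "$3"
}

# change_master_sql NEW_MASTER: print the statement that points the
# recovered node at the new master (GTID auto-positioning assumed).
change_master_sql() {
  printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=7066, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='replpasswod';\n" "$1"
}

# Block to append to app1.cnf for the recovered node.
server_section server1 172.16.40.203 7066
# Statement to run on the recovered node.
change_master_sql 172.16.40.201
```

After both pieces are applied and show slave status\G looks healthy, running masterha_check_repl again before restarting the manager confirms that the new topology is accepted.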
2. Manual master failover
A manual master failover does not need the manager monitoring process to be running. When the master fails, an operator invokes MHA by hand to perform the switch.
# masterha_master_switch --master_state=dead --conf=/home/mysql/app/mha/masterha/conf/app1.cnf --dead_master_host=172.16.40.202 --dead_master_port=7066 --new_master_host=172.16.40.203 --new_master_port=7066 --ignore_last_failover
# Interactive prompts are shown at this point; confirm to continue