MySQL High Availability with MHA, Part 2: master_ip_failover

Asynchronous master-slave replication architecture
master:
10.150.20.90 ed3jrdba90
slave:
10.150.20.97 ed3jrdba97
10.150.20.132 ed3jrdba132
manager:
10.150.20.95 ed3jrdba95
# newly added VIP
vip:10.150.20.200

System details for the four machines:
OS: CentOS 7.3
MySQL: 5.7.21
MHA: 0.58
NIC name: ens3

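MHA manager node

1: Configure the app1.cnf file
Add the path of master_ip_failover_script, the switchover script that MHA runs when the MySQL master fails.
# vi /etc/mysql_mha/app1.cnf
# switchover script used during automatic failover
master_ip_failover_script=/usr/local/bin/master_ip_failover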

[root@dev05 ~]# cat /etc/mysql_mha/app1.cnf

[server default]
manager_log=/data/mysql_mha/app1-manager.log
manager_workdir=/data/mysql_mha/app1
master_binlog_dir=/data/mysql_33061/logs
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=mha_monitor
ping_interval=5
remote_workdir=/data/mysql_mha/app1
repl_password=replicator
repl_user=replicator
shutdown_script=""
ssh_user=root
user=mha_monitor

[server1]
hostname=10.150.20.90
port=33061

[server2]
hostname=10.150.20.97
port=33061

[server3]
hostname=10.150.20.132
port=33061
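
Section names in this file identify the individual servers, and a repeated name is easy to introduce when copying blocks. The following is a minimal sketch for spotting repeated section headers before handing the file to MHA. The helper name find_duplicate_sections is hypothetical and is not part of MHA; it assumes each header sits on its own line.

```shell
#!/usr/bin/env bash
# find_duplicate_sections is a hypothetical helper, not an MHA command.
# It prints every [section] header that appears more than once in a config file.
find_duplicate_sections() {
    local conf_file="$1"
    # Keep only lines that consist of a [name] header (trailing blanks ignored),
    # then let sort + uniq -d report the names seen at least twice.
    sed -n 's/^\(\[[^]]*]\)[[:space:]]*$/\1/p' "$conf_file" | sort | uniq -d
}
```

Running `find_duplicate_sections /etc/mysql_mha/app1.cnf` prints nothing when every section name is unique.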

Edit the master_ip_failover script. keepalived is not used here; the VIP is managed by this script instead.

# cp /usr/local/bin/master_ip_failover /usr/local/bin/master_ip_failover.bak
# vi /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

############################# Added section #########################################
my $vip   = '10.150.20.200';
my $brdc  = '10.150.20.255';
my $ifdev = 'ens3';
my $key   = '1';    # alias suffix, so the VIP is labelled ens3:1

# Bind the VIP, announce it with a gratuitous ARP, then flush all iptables rules
# (drop "iptables -F" if the host depends on its firewall rules).
my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";

# Remove the VIP from the interface.
my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        # Exit 1 unless the VIP was removed from the old master.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # Exit 10 unless the VIP was brought up on the new master.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}


After the IP address moves, be sure to run arping so that neighboring hosts and switches refresh their ARP caches.
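
The two remote commands that the failover script assembles can be previewed without touching any host. The sketch below only prints them. The function names vip_start_cmd and vip_stop_cmd are hypothetical, the label suffix 1 is an assumption taken from the ens3:1 alias used for the manual bind in this post, and the iptables -F step is deliberately left out.

```shell
#!/usr/bin/env bash
# Values mirror the failover script; KEY=1 is assumed from the ens3:1 alias.
VIP="10.150.20.200"
BRD="10.150.20.255"
IFDEV="ens3"
KEY="1"

# Print (do not run) the commands that bind the VIP and announce it
# with one gratuitous ARP packet.
vip_start_cmd() {
    printf '%s; %s\n' \
        "/usr/sbin/ip addr add ${VIP}/24 brd ${BRD} dev ${IFDEV} label ${IFDEV}:${KEY}" \
        "/usr/sbin/arping -q -A -c 1 -I ${IFDEV} ${VIP}"
}

# Print (do not run) the command that removes the VIP again.
vip_stop_cmd() {
    printf '%s\n' "/usr/sbin/ip addr del ${VIP}/24 dev ${IFDEV} label ${IFDEV}:${KEY}"
}
```

Printing the commands first makes it easy to compare them with what is run by hand on the master, and to confirm that the interface name matches on every node.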

Check SSH connectivity across the replication environment
# masterha_check_ssh --conf=/etc/mysql_mha/app1.cnf
Wed Dec 12 14:43:27 2018 - [info] Reading default configuration from /etc/masterha_default.cnf..
Wed Dec 12 14:43:27 2018 - [info] Reading application default configuration from /etc/mysql_mha/app1.cnf..
Wed Dec 12 14:43:27 2018 - [info] Reading server configuration from /etc/mysql_mha/app1.cnf..
Wed Dec 12 14:43:27 2018 - [info] Starting SSH connection tests..
Wed Dec 12 14:43:28 2018 - [debug]
Wed Dec 12 14:43:27 2018 - [debug] Connecting via SSH from root@10.150.20.90(10.150.20.90:22) to root@10.150.20.97(10.150.20.97:22)..
Wed Dec 12 14:43:27 2018 - [debug] ok.
Wed Dec 12 14:43:27 2018 - [debug] Connecting via SSH from root@10.150.20.90(10.150.20.90:22) to root@10.150.20.132(10.150.20.132:22)..
Wed Dec 12 14:43:27 2018 - [debug] ok.
Wed Dec 12 14:43:28 2018 - [debug]
Wed Dec 12 14:43:27 2018 - [debug] Connecting via SSH from root@10.150.20.97(10.150.20.97:22) to root@10.150.20.90(10.150.20.90:22)..
Wed Dec 12 14:43:27 2018 - [debug] ok.
Wed Dec 12 14:43:27 2018 - [debug] Connecting via SSH from root@10.150.20.97(10.150.20.97:22) to root@10.150.20.132(10.150.20.132:22)..
Wed Dec 12 14:43:28 2018 - [debug] ok.
Wed Dec 12 14:43:29 2018 - [debug]
Wed Dec 12 14:43:28 2018 - [debug] Connecting via SSH from root@10.150.20.132(10.150.20.132:22) to root@10.150.20.90(10.150.20.90:22)..
Wed Dec 12 14:43:28 2018 - [debug] ok.
Wed Dec 12 14:43:28 2018 - [debug] Connecting via SSH from root@10.150.20.132(10.150.20.132:22) to root@10.150.20.97(10.150.20.97:22)..
Wed Dec 12 14:43:28 2018 - [debug] ok.
Wed Dec 12 14:43:29 2018 - [info] All SSH connection tests passed successfully.
Check the whole replication environment
# masterha_check_repl --conf=/etc/mysql_mha/app1.cnf
Wed Dec 12 14:44:35 2018 - [info] Slaves settings check done.
Wed Dec 12 14:44:35 2018 - [info]
10.150.20.90(10.150.20.90:33061) (current master)
+--10.150.20.97(10.150.20.97:33061)
+--10.150.20.132(10.150.20.132:33061)

Wed Dec 12 14:44:35 2018 - [info] Checking replication health on 10.150.20.97..
Wed Dec 12 14:44:35 2018 - [info] ok.
Wed Dec 12 14:44:35 2018 - [info] Checking replication health on 10.150.20.132..
Wed Dec 12 14:44:35 2018 - [info] ok.
Wed Dec 12 14:44:35 2018 - [info] Checking master_ip_failover_script status:
Wed Dec 12 14:44:35 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061
Wed Dec 12 14:44:35 2018 - [info] OK.
Wed Dec 12 14:44:35 2018 - [warning] shutdown_script is not defined.
Wed Dec 12 14:44:35 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
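
Both checks can be chained so that the manager is only started when they pass. This is a minimal sketch: preflight is a hypothetical wrapper name, and it assumes that masterha_check_ssh and masterha_check_repl exit with a non-zero status when a check fails.

```shell
#!/usr/bin/env bash
# preflight is a hypothetical wrapper; only the two masterha_check_* commands
# come from MHA. Assumption: both exit non-zero when their check fails.
preflight() {
    local conf="$1"
    masterha_check_ssh --conf="$conf" || { echo "SSH check failed" >&2; return 1; }
    masterha_check_repl --conf="$conf" || { echo "replication check failed" >&2; return 2; }
    echo "preflight ok"
}
```

A typical call would be `preflight /etc/mysql_mha/app1.cnf && echo "safe to start masterha_manager"`.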

Start the MHA manager
# nohup masterha_manager --conf=/etc/mysql_mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /data/mysql_mha/app1-manager.log 2>&1 &
Check the manager status
# masterha_check_status --conf=/etc/mysql_mha/app1.cnf
View the manager log
# tail -n 1000 -f /data/mysql_mha/app1-manager.log

Verify failover

On the master node qa05.010150020090.yz, bind the VIP:
[root@qa05 ~]# ip addr add 10.150.20.200/24 brd 10.150.20.255 dev ens3 label ens3:1
[root@qa05 ~]# /usr/sbin/arping -q -A -c 1 -I ens3 10.150.20.200

# To unbind the VIP:
# ip addr del 10.150.20.200/24 dev ens3 label ens3:1

Simulate a failure by killing the mysqld process on qa05.010150020090.yz
[root@qa05 ~]# ps -ef|grep -i mysql
mysql 3114 1 0 Aug06 ? 00:00:51 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
root 15551 10466 0 Aug06 pts/1 00:00:00 mysql
root 25521 21213 0 03:52 pts/2 00:00:00 grep --color=auto -i mysql
[root@qa05 ~]# kill -9 27593 26101
Note: the PIDs passed to kill do not match the ps listing above, which shows mysqld as PID 3114; use the PID reported on your own host.
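
Picking the PID by eye from ps output is error-prone, since the listing also contains the mysql client and the grep itself. The sketch below filters `ps -ef` output down to the server process only. The function name mysqld_pids is hypothetical, and it assumes the usual `ps -ef` layout where the PID is field 2 and the command is field 8.

```shell
#!/usr/bin/env bash
# mysqld_pids is a hypothetical helper: it reads `ps -ef` output on stdin and
# prints the PID of every process whose command is exactly mysqld
# (with or without a leading directory path).
mysqld_pids() {
    awk '$8 ~ /(^|\/)mysqld$/ { print $2 }'
}
```

On the master this could be used as `ps -ef | mysqld_pids | xargs -r kill -9` (the -r flag is a GNU xargs option that skips the kill when no PID is found).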

Watch the MHA manager log that was opened earlier
[root@dev05 ~]# tail -n 1000 -f /data/mysql_mha/app1-manager.log

[warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
[info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/save_binary_logs_test --manager_version=0.58 --binlog_prefix=mysql-bin
[info] HealthCheck: SSH to 10.150.20.90 is reachable.
[warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.150.20.90' (111))
[warning] Connection failed 2 time(s)..
[warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.150.20.90' (111))
[warning] Connection failed 3 time(s)..
[warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.150.20.90' (111))
[warning] Connection failed 4 time(s)..
[warning] Master is not reachable from health checker!
[warning] Master 10.150.20.90(10.150.20.90:33061) is not reachable!
[warning] SSH is reachable.
[info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mysql_mha/app1.cnf again, and trying to connect to all servers to check server status..
[info] Reading default configuration from /etc/masterha_default.cnf..
[info] Reading application default configuration from /etc/mysql_mha/app1.cnf..
[info] Reading server configuration from /etc/mysql_mha/app1.cnf..
[info] GTID failover mode = 0
[info] Dead Servers:
[info] 10.150.20.90(10.150.20.90:33061)
[info] Alive Servers:
[info] 10.150.20.97(10.150.20.97:33061)
[info] 10.150.20.132(10.150.20.132:33061)
[info] Alive Slaves:
[info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] Checking slave configurations..
[info] read_only=1 is not set on slave 10.150.20.97(10.150.20.97:33061).
[warning] relay_log_purge=0 is not set on slave 10.150.20.97(10.150.20.97:33061).
[info] read_only=1 is not set on slave 10.150.20.132(10.150.20.132:33061).
[info] Checking replication filtering settings..
[info] Replication filtering check ok.
[info] Master is down!
[info] Terminating monitoring script.
[info] Got exit code 20 (Master dead).
[info] MHA::MasterFailover version 0.58.
[info] Starting master failover.

[info] * Phase 1: Configuration Check Phase..
[info] GTID failover mode = 0
[info] Dead Servers:
[info] 10.150.20.90(10.150.20.90:33061)
[info] Checking master reachability via MySQL(double check)...
[info] ok.
[info] Alive Servers:
[info] 10.150.20.97(10.150.20.97:33061)
[info] 10.150.20.132(10.150.20.132:33061)
[info] Alive Slaves:
[info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] Starting Non-GTID based failover.
[info] ** Phase 1: Configuration Check Phase completed.

[info] * Phase 2: Dead Master Shutdown Phase..
[info] Forcing shutdown so that applications never connect to the current master..
[info] Executing master IP deactivation script:
[info] /usr/local/bin/master_ip_failover --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061 --command=stopssh --ssh_user=root
[info] done.
[warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
[info] * Phase 2: Dead Master Shutdown Phase completed.

[info] * Phase 3: Master Recovery Phase..
[info] * Phase 3.1: Getting Latest Slaves Phase..
[info] The latest binary log file/position on all slaves is mysql-bin.:
[info] Latest slaves (Slaves that received relay log files to the latest):
[info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] The oldest binary log file/position on all slaves is mysql-bin.:
[info] Oldest slaves:
[info] 10.150.20.97(10.150.20.97:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)
[info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.90(10.150.20.90:33061)

[info] * Phase 3.2: Saving Dead Master's Binlog Phase..
[info] Fetching dead master's binary logs..
[info] Executing command on the dead master 10.150.20.90(10.150.20.90:33061): save_binary_logs --command=save --start_file=mysql-bin. --start_pos= --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/saved_master_binlog_from_10.150.20.90_33061_20181212145426.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58
Creating /data/mysql_mha/app1 if not exists.. ok.
Concat binary/relay logs from mysql-bin. pos  to mysql-bin. EOF into /data/mysql_mha/app1/saved_master_binlog_from_10.150.20.90_33061_20181212145426.binlog ..
Binlog Checksum enabled
Dumping binlog format description event, from position  to .. ok.
No need to dump effective binlog data from /data/mysql_33061/logs/mysql-bin. (pos starts , filesize ). Skipping.
Binlog Checksum enabled
/data/mysql_mha/app1/saved_master_binlog_from_10.150.20.90_33061_20181212145426.binlog has no effective data events.
Event not exists.
[info] Additional events were not found from the orig master. No need to save.

[info] * Phase 3.3: Determining New Master Phase..
[info] Finding the latest slave that has all relay logs for recovering other slaves..
[info] All slaves received relay logs to the same position. No need to resync each other.
[info] Searching new master from slaves..
[info] Candidate masters from the configuration file:
[info] Non-candidate masters:
[info] New master is 10.150.20.97(10.150.20.97:33061)
[info] Starting master failover..
[info]
From:
10.150.20.90(10.150.20.90:33061) (current master)
+--10.150.20.97(10.150.20.97:33061)
+--10.150.20.132(10.150.20.132:33061)

To:
10.150.20.97(10.150.20.97:33061) (new master)
+--10.150.20.132(10.150.20.132:33061)

[info] * Phase 3.4: New Master Diff Log Generation Phase..
[info] This server has all relay logs. No need to generate diff files from the latest slave.

[info] * Phase 3.5: Master Log Apply Phase..
[info] *NOTICE: If any error happens from this phase, manual recovery is needed.
[info] Starting recovery on 10.150.20.97(10.150.20.97:33061)..
[info] This server has all relay logs. Waiting all logs to be applied..
[info] done.
[info] All relay logs were successfully applied.
[info] Getting new master's binlog name and position..
[info] mysql-bin.000010:2774
[info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.150.20.97', MASTER_PORT=33061, MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=2774, MASTER_USER='replicator', MASTER_PASSWORD='xxx';
[info] Executing master IP activate script:
[info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=10.150.20.90 --orig_master_ip=10.150.20.90 --orig_master_port=33061 --new_master_host=10.150.20.97 --new_master_ip=10.150.20.97 --new_master_port=33061 --new_master_user='mha_monitor' --new_master_password=xxx
Set read_only=0 on the new master.
Creating app user on the new master..
[info] OK.
[info] ** Finished master recovery successfully.
[info] * Phase 3: Master Recovery Phase completed.

[info] * Phase 4: Slaves Recovery Phase..
[info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
[info] -- Slave diff file generation on host 10.150.20.132(10.150.20.132:33061) started, pid: . Check tmp log /data/mysql_mha/app1/10.150.20.132_33061_20181212145426.log if it takes time..
[info] Log messages from 10.150.20.132 ...
[info] This server has all relay logs. No need to generate diff files from the latest slave.
[info] End of log messages from 10.150.20.132.
[info] -- 10.150.20.132(10.150.20.132:33061) has the latest relay log events.
[info] Generating relay diff files from the latest slave succeeded.

[info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
[info] -- Slave recovery on host 10.150.20.132(10.150.20.132:33061) started, pid: . Check tmp log /data/mysql_mha/app1/10.150.20.132_33061_20181212145426.log if it takes time..
[info] Log messages from 10.150.20.132 ...
[info] Starting recovery on 10.150.20.132(10.150.20.132:33061)..
[info] This server has all relay logs. Waiting all logs to be applied..
[info] done.
[info] All relay logs were successfully applied.
[info] Resetting slave 10.150.20.132(10.150.20.132:33061) and starting replication from the new master 10.150.20.97(10.150.20.97:33061)..
[info] Executed CHANGE MASTER.
[info] Slave started.
[info] End of log messages from 10.150.20.132.
[info] -- Slave recovery on host 10.150.20.132(10.150.20.132:33061) succeeded.
[info] All new slave servers recovered successfully.

[info] * Phase 5: New master cleanup phase..
[info] Resetting slave info on the new master..
[info] 10.150.20.97: Resetting slave info succeeded.
[info] Master failover to 10.150.20.97(10.150.20.97:33061) completed successfully.
[info] Deleted server1 entry from /etc/mysql_mha/app1.cnf .

----- Failover Report -----

app1: MySQL Master failover 10.150.20.90(10.150.20.90:33061) to 10.150.20.97(10.150.20.97:33061) succeeded

Master 10.150.20.90(10.150.20.90:33061) is down!

Check MHA Manager logs at dev05..yz:/data/mysql_mha/app1-manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 10.150.20.90(10.150.20.90:33061)
The latest slave 10.150.20.97(10.150.20.97:33061) has all relay logs for recovery.
Selected 10.150.20.97(10.150.20.97:33061) as a new master.
10.150.20.97(10.150.20.97:33061): OK: Applying all logs succeeded.
10.150.20.97(10.150.20.97:33061): OK: Activated master IP address.
10.150.20.132(10.150.20.132:33061): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
10.150.20.132(10.150.20.132:33061): OK: Applying all logs succeeded. Slave started, replicating from 10.150.20.97(10.150.20.97:33061)
10.150.20.97(10.150.20.97:33061): Resetting slave info succeeded.
Master failover to 10.150.20.97(10.150.20.97:33061) completed successfully.

The log shows that the new master is now 10.150.20.97. At this point the MHA manager process on the manager node has stopped.

[root@dev05 ~]# masterha_check_status --conf=/etc/mysql_mha/app1.cnf

app1 is stopped(2:NOT_RUNNING).
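
Because the manager exits after a failover, a monitoring job may want to turn this status text into a simple state. The sketch below is one way to do that. manager_state is a hypothetical helper; the "is stopped(" wording comes from the output above, while the "is running(" wording is an assumption about what masterha_check_status prints for a healthy manager.

```shell
#!/usr/bin/env bash
# manager_state is a hypothetical helper: it reads masterha_check_status
# output on stdin and prints running, stopped, or unknown.
manager_state() {
    local out
    out=$(cat)
    case "$out" in
        *"is running("*) echo running ;;   # assumed wording for a live manager
        *"is stopped("*) echo stopped ;;   # wording shown in the output above
        *) echo unknown ;;
    esac
}
```

For example, `masterha_check_status --conf=/etc/mysql_mha/app1.cnf | manager_state` prints stopped for the output shown above.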

On the new master qa06.010150020097.yz, the VIP is now bound to the ens3 interface
[root@qa06 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 54:52:00:49:48:92 brd ff:ff:ff:ff:ff:ff
inet 10.150.20.97/24 brd 10.150.20.255 scope global ens3
valid_lft forever preferred_lft forever
inet 10.150.20.200/24 brd 10.150.20.255 scope global secondary ens3:1
valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 02:42:7f:36:38:fe brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
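
Reading the listing by eye works for a one-off test; for a repeatable check the same output can be filtered. A minimal sketch follows. has_vip is a hypothetical helper that reads `ip a` style output on stdin and succeeds only when the given address appears on an inet line.

```shell
#!/usr/bin/env bash
# has_vip is a hypothetical helper: exit status 0 if the address given as $1
# appears as an IPv4 address in `ip a` output read from stdin, 1 otherwise.
has_vip() {
    local vip="$1"
    awk -v want="$vip" '
        $1 == "inet" {
            split($2, parts, "/")        # strip the /prefix-length suffix
            if (parts[1] == want) found = 1
        }
        END { exit !found }
    '
}
```

On the new master, `ip a | has_vip 10.150.20.200 && echo "VIP is bound"` confirms the binding; the comparison is exact, so 10.150.20.20 does not match 10.150.20.200.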

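At this point app1.cnf on the MHA manager node has been modified as follows (the dead master's server1 entry was removed because the manager was started with --remove_dead_master_conf):
[root@dev05 ~]# cat /etc/mysql_mha/app1.cnf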
[server default]
manager_log=/data/mysql_mha/app1-manager.log
manager_workdir=/data/mysql_mha/app1
master_binlog_dir=/data/mysql_33061/logs
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=mha_monitor
ping_interval=5
remote_workdir=/data/mysql_mha/app1
repl_password=replicator
repl_user=replicator
shutdown_script=""
ssh_user=root
user=mha_monitor

[server2]
hostname=10.150.20.97
port=33061

[server3]
hostname=10.150.20.132
port=33061

Edit app1.cnf again to add the old master back as a slave entry
[root@dev05 ~]# cat /etc/mysql_mha/app1.cnf
[server default]
manager_log=/data/mysql_mha/app1-manager.log
manager_workdir=/data/mysql_mha/app1
master_binlog_dir=/data/mysql_33061/logs
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=mha_monitor
ping_interval=5
remote_workdir=/data/mysql_mha/app1
repl_password=replicator
repl_user=replicator
shutdown_script=""
ssh_user=root
user=mha_monitor

[server1]
hostname=10.150.20.97
port=33061
[server2]
hostname=10.150.20.90
port=33061
[server3]
hostname=10.150.20.132
port=33061

Restart MySQL on qa05.010150020090.yz and set it up as a slave pointing at the new master
mysql> change master to
master_host='10.150.20.97',
master_user='replicator',
master_password='replicator',
master_port=33061,
master_log_file='mysql-bin.000010',
master_log_pos=2774;
mysql> start slave;
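
The binlog file and position used here do not have to be typed by hand: during failover the manager log records a line containing "Statement should be:" followed by the full CHANGE MASTER statement, with the password masked as xxx. The sketch below pulls out that statement and fills in the password. change_master_stmt is a hypothetical helper, and it assumes the password contains no characters that are special to sed (such as / or &).

```shell
#!/usr/bin/env bash
# change_master_stmt is a hypothetical helper: it prints the CHANGE MASTER
# statement recorded in an MHA manager log, with the masked password replaced.
change_master_stmt() {
    local log_file="$1" repl_password="$2"
    # Use the most recent failover, cut away everything before the statement,
    # then replace the 'xxx' placeholder with the real replication password.
    grep 'Statement should be: ' "$log_file" | tail -n 1 \
        | sed -e 's/^.*Statement should be: //' \
              -e "s/MASTER_PASSWORD='xxx'/MASTER_PASSWORD='${repl_password}'/"
}
```

These coordinates are only safe for the old master when it holds no transactions that the new master lacks; in this test the manager log reported that no additional events were found on the original master.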

Check the replication environment
# masterha_check_repl --conf=/etc/mysql_mha/app1.cnf

[info] Reading default configuration from /etc/masterha_default.cnf..
[info] Reading application default configuration from /etc/mysql_mha/app1.cnf..
[info] Reading server configuration from /etc/mysql_mha/app1.cnf..
[info] MHA::MasterMonitor version 0.58.
[info] GTID failover mode = 0
[info] Dead Servers:
[info] Alive Servers:
[info] 10.150.20.97(10.150.20.97:33061)
[info] 10.150.20.90(10.150.20.90:33061)
[info] 10.150.20.132(10.150.20.132:33061)
[info] Alive Slaves:
[info] 10.150.20.90(10.150.20.90:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.97(10.150.20.97:33061)
[info] 10.150.20.132(10.150.20.132:33061) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled
[info] Replicating from 10.150.20.97(10.150.20.97:33061)
[info] Current Alive Master: 10.150.20.97(10.150.20.97:33061)
[info] Checking slave configurations..
[info] read_only=1 is not set on slave 10.150.20.90(10.150.20.90:33061).
[warning] relay_log_purge=0 is not set on slave 10.150.20.90(10.150.20.90:33061).
[info] read_only=1 is not set on slave 10.150.20.132(10.150.20.132:33061).
[info] Checking replication filtering settings..
[info] binlog_do_db= , binlog_ignore_db=
[info] Replication filtering check ok.
[info] GTID (with auto-pos) is not supported
[info] Starting SSH connection tests..
[info] All SSH connection tests passed successfully.
[info] Checking MHA Node version..
[info] Version check ok.
[info] Checking SSH publickey authentication settings on the current master..
[info] HealthCheck: SSH to 10.150.20.97 is reachable.
[info] Master MHA Node version is 0.58.
[info] Checking recovery script configurations on 10.150.20.97(10.150.20.97:33061)..
[info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_33061/logs --output_file=/data/mysql_mha/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.
[info] Connecting to root@10.150.20.97(10.150.20.97:22)..
Creating /data/mysql_mha/app1 if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /data/mysql_33061/logs, up to mysql-bin.
[info] Binlog setting check done.
[info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
[info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.90 --slave_ip=10.150.20.90 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx
[info] Connecting to root@10.150.20.90(10.150.20.90:22)..
Checking slave recovery environment settings..
Opening /data/mysql_33061/logs/relay-log.info ... ok.
Relay log found at /data/mysql_33061/logs, up to relaylog.
Temporary relay log file is /data/mysql_33061/logs/relaylog.
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
[info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha_monitor' --slave_host=10.150.20.132 --slave_ip=10.150.20.132 --slave_port=33061 --workdir=/data/mysql_mha/app1 --target_version=5.7.21-log --manager_version=0.58 --relay_log_info=/data/mysql_33061/logs/relay-log.info --relay_dir=/data/mysql_33061/data/ --slave_pass=xxx
[info] Connecting to root@10.150.20.132(10.150.20.132:22)..
Checking slave recovery environment settings..
Opening /data/mysql_33061/logs/relay-log.info ... ok.
Relay log found at /data/mysql_33061/data, up to cgdb-relay-bin.
Temporary relay log file is /data/mysql_33061/data/cgdb-relay-bin.
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
[info] Slaves settings check done.
[info]
10.150.20.97(10.150.20.97:33061) (current master)
+--10.150.20.90(10.150.20.90:33061)
+--10.150.20.132(10.150.20.132:33061)

[info] Checking replication health on 10.150.20.90..
[info] ok.
[info] Checking replication health on 10.150.20.132..
[info] ok.
[info] Checking master_ip_failover_script status:
[info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.150.20.97 --orig_master_ip=10.150.20.97 --orig_master_port=33061
[info] OK.
[warning] shutdown_script is not defined.
[info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

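Summary:
1: When first setting up MHA, the VIP has to be bound to the master by hand; after a failover, the script binds the VIP to the new master.
2: After a failover completes, the MHA manager (monitoring) process exits on its own and has to be started again.
3: Bind the VIP:
# ip addr add 10.150.20.200/24 brd 10.150.20.255 dev ens3 label ens3:1
# /usr/sbin/arping -q -A -c 1 -I ens3 10.150.20.200
Unbind the VIP:
# ip addr del 10.150.20.200/24 dev ens3 label ens3:1
4: Stop the MHA manager with:
# masterha_stop --conf=/etc/mysql_mha/app1.cnf
Stopped app1 successfully.

5: The failover process consists roughly of the following steps:
1) Configuration check phase: the configuration of the whole cluster is checked.
2) Dead master handling: the virtual IP is removed and, if a shutdown_script is configured, the host is powered off.
3) The binary log events that the dead master has but the latest slave lacks are copied and saved in the MHA Manager working directory.
4) The slave holding the most recent updates is identified.
5) The binary log events (binlog events) saved from the master are applied.
6) One slave is promoted to be the new master.
7) The other slaves are pointed at the new master and resume replication.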
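
These steps map onto the numbered phases that MHA writes to the manager log, so a quick way to see how far a failover got is to list the phase headers. A minimal sketch, assuming phase lines of the form "* Phase 3.1: ..." in the log; list_phases is a hypothetical helper.

```shell
#!/usr/bin/env bash
# list_phases is a hypothetical helper: it prints each phase header found in
# an MHA manager log, skipping the matching "... completed." lines.
list_phases() {
    local log_file="$1"
    # -o keeps only the matched part, which drops the timestamp and log level.
    grep -oE '\* Phase [0-9]+(\.[0-9]+)?: .*' "$log_file" | grep -v 'completed\.$'
}
```

Running `list_phases /data/mysql_mha/app1-manager.log` after a failover prints one line per phase that was started, in order.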
