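MHA: Manual Failover Procedure (Traditional Replication & GTID Replication)
This article only walks through the manual failover procedure. For an introduction to MHA, see the article "MySQL高可用架构之MHA" (MHA as a MySQL high-availability architecture).
1. Basic Environment
1.1 Replication Topology
VMware 10.0 + CentOS 6.9 + MySQL 5.7.21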
ROLE | HOSTNAME | BASEDIR | DATADIR | IP | PORT |
Node1 | ZST1 | /usr/local/mysql | /data/mysql/mysql3307/data | 192.168.85.132 | 3307 |
Node2 | ZST2 | /usr/local/mysql | /data/mysql/mysql3307/data | 192.168.85.133 | 3307 |
Node3 | ZST3 | /usr/local/mysql | /data/mysql/mysql3307/data | 192.168.85.134 | 3307 |
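Both setups use one master with two slaves, Node1->{Node2, Node3}. Traditional replication is built on Row format + binlog position; GTID replication is built on Row format + GTID.
1.2 MHA Configuration Files
The MHA version used in this article is 0.56, and both the manager and node packages are installed on all of Node1, Node2, and Node3.
The MHA configuration files are as follows:
- # Global configuration file: /etc/masterha/masterha_default.conf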
- [root@ZST1 masterha]# cat masterha_default.conf
- [server default]
- # MySQL user and password
- user=mydba
- password=mysql5721
- # OS user for SSH
- ssh_user=root
- # Replication user
- repl_user=repl
- repl_password=repl
- # Monitoring
- ping_interval=
- #shutdown_script=/etc/masterha/send_report.sh
- # Scripts invoked during a switch
- master_ip_failover_script=/etc/masterha/master_ip_failover
- master_ip_online_change_script=/etc/masterha/master_ip_online_change
- log_level=debug
- [root@ZST1 masterha]#
- # Cluster 1 configuration file: /etc/masterha/app1.conf
- [root@ZST1 masterha]# cat app1.conf
- [server default]
- # MHA manager working directory
- manager_workdir=/var/log/masterha/app1
- manager_log=/var/log/masterha/app1/app1.log
- remote_workdir=/var/log/masterha/app1
- [server1]
- hostname=192.168.85.132
- port=
- master_binlog_dir=/data/mysql/mysql3307/logs
- candidate_master=
- check_repl_delay=
- [server2]
- hostname=192.168.85.133
- port=
- master_binlog_dir=/data/mysql/mysql3307/logs
- candidate_master=
- check_repl_delay=
- [server3]
- hostname=192.168.85.134
- port=
- master_binlog_dir=/data/mysql/mysql3307/logs
- candidate_master=
- check_repl_delay=
- [root@ZST1 masterha]#
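As a rough illustration of how the per-cluster file is organised, the sketch below reads an INI-style text shaped like app1.conf and lists the `[serverN]` blocks. It is only a sketch using Python's standard `configparser`, not how MHA itself parses its configuration. The `port=3307` values are an assumption taken from the host table in section 1.1, since the listing above shows them blank, and `list_servers` is a hypothetical helper name.

```python
import configparser

# Text shaped like /etc/masterha/app1.conf from this article.
# ASSUMPTION: port=3307 comes from the host table in section 1.1;
# the listing above shows the value blank.
APP_CONF = """
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/app1.log
remote_workdir=/var/log/masterha/app1

[server1]
hostname=192.168.85.132
port=3307
master_binlog_dir=/data/mysql/mysql3307/logs

[server2]
hostname=192.168.85.133
port=3307
master_binlog_dir=/data/mysql/mysql3307/logs

[server3]
hostname=192.168.85.134
port=3307
master_binlog_dir=/data/mysql/mysql3307/logs
"""


def list_servers(conf_text):
    """Return (section, hostname, port) for every [serverN] block."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    servers = []
    for section in parser.sections():
        if section == "server default":
            # Shared settings, not a server definition.
            continue
        block = parser[section]
        servers.append((section, block["hostname"], block.getint("port")))
    return servers


servers = list_servers(APP_CONF)
for name, host, port in servers:
    print(name, host, port)
```

Settings under `[server default]` apply to the whole cluster, while each `[serverN]` block describes one MySQL instance.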
1.3 Test Data
By stopping the io_thread on a slave and then writing to the master, we simulate data inconsistency both between the master and the slaves and between the two slaves.
- # First, clear all rows from the table
- mydba@192.168.85.132,3307 [replcrash]> truncate table py_user;
- # Node1 writes the first record
- mydba@192.168.85.132,3307 [replcrash]> insert into py_user(name,add_time,server_id) select left(uuid(),32),now(),@@server_id;
- # Node3 stops its io_thread
- mydba@192.168.85.134,3307 [replcrash]> stop slave io_thread;
- # Node1 writes the second record
- mydba@192.168.85.132,3307 [replcrash]> insert into py_user(name,add_time,server_id) select left(uuid(),32),now(),@@server_id;
- # Node2 stops its io_thread
- mydba@192.168.85.133,3307 [replcrash]> stop slave io_thread;
- # Node1 writes the third record
- mydba@192.168.85.132,3307 [replcrash]> insert into py_user(name,add_time,server_id) select left(uuid(),32),now(),@@server_id;
- # Final records on each node
- # Node1 has three records
- mydba@192.168.85.132,3307 [replcrash]> select * from py_user;
- +-----+----------------------------------+---------------------+-----------+
- | uid | name | add_time | server_id |
- +-----+----------------------------------+---------------------+-----------+
- | 1 | 153dc6bf-325d-11e8-88e6-000c29c1 | 2018-03-28 15:53:20 | 1323307 |
- | 2 | 272f15ee-325d-11e8-88e6-000c29c1 | 2018-03-28 15:53:50 | 1323307 |
- | 3 | 2d8900cc-325d-11e8-88e6-000c29c1 | 2018-03-28 15:54:01 | 1323307 |
- +-----+----------------------------------+---------------------+-----------+
- 3 rows in set (0.00 sec)
- mydba@192.168.85.132,3307 [replcrash]> show master status;
- +------------------+----------+--------------+------------------+-------------------+
- | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
- +------------------+----------+--------------+------------------+-------------------+
- | mysql-bin.000004 | 1303 | | | |
- +------------------+----------+--------------+------------------+-------------------+
- 1 row in set (0.00 sec)
- # Node2 has two records
- mydba@192.168.85.133,3307 [replcrash]> select * from py_user;
- +-----+----------------------------------+---------------------+-----------+
- | uid | name | add_time | server_id |
- +-----+----------------------------------+---------------------+-----------+
- | 1 | 153dc6bf-325d-11e8-88e6-000c29c1 | 2018-03-28 15:53:20 | 1323307 |
- | 2 | 272f15ee-325d-11e8-88e6-000c29c1 | 2018-03-28 15:53:50 | 1323307 |
- +-----+----------------------------------+---------------------+-----------+
- 2 rows in set (0.00 sec)
- mydba@192.168.85.133,3307 [replcrash]> show master status;
- +------------------+----------+--------------+------------------+-------------------+
- | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
- +------------------+----------+--------------+------------------+-------------------+
- | mysql-bin.000007 | 8859 | | | |
- +------------------+----------+--------------+------------------+-------------------+
- 1 row in set (0.00 sec)
- # Node3 has one record
- mydba@192.168.85.134,3307 [replcrash]> select * from py_user;
- +-----+----------------------------------+---------------------+-----------+
- | uid | name | add_time | server_id |
- +-----+----------------------------------+---------------------+-----------+
- | 1 | 153dc6bf-325d-11e8-88e6-000c29c1 | 2018-03-28 15:53:20 | 1323307 |
- +-----+----------------------------------+---------------------+-----------+
- 1 row in set (0.00 sec)
- mydba@192.168.85.134,3307 [replcrash]> show master status;
- +------------------+----------+--------------+------------------+-------------------+
- | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
- +------------------+----------+--------------+------------------+-------------------+
- | mysql-bin.000002 | 10322 | | | |
- +------------------+----------+--------------+------------------+-------------------+
- 1 row in set (0.00 sec)
Clearly, slave Node3 lags behind slave Node2, and slave Node2 lags behind master Node1.
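To make the reasoning behind the row counts explicit, here is a small pure-Python simulation of the test sequence. It is a toy model under a simplifying assumption: a slave receives a row only if its io_thread is still running when the master writes it. The function and event names are hypothetical and not part of MySQL or MHA.

```python
def simulate(events):
    """Replay ('write', row) and ('stop_io', slave) events in order.

    A slave only receives rows written while its IO thread is running.
    """
    master_rows = []
    slave_rows = {"Node2": [], "Node3": []}
    io_running = {"Node2": True, "Node3": True}
    for kind, arg in events:
        if kind == "write":
            master_rows.append(arg)
            for slave, running in io_running.items():
                if running:
                    slave_rows[slave].append(arg)
        elif kind == "stop_io":
            io_running[arg] = False
    return master_rows, slave_rows


# Same order as the test above: write, stop Node3, write, stop Node2, write.
events = [
    ("write", 1), ("stop_io", "Node3"),
    ("write", 2), ("stop_io", "Node2"),
    ("write", 3),
]
master_rows, slave_rows = simulate(events)
print(len(master_rows), len(slave_rows["Node2"]), len(slave_rows["Node3"]))
```

Running it prints `3 2 1`, matching the three, two, and one rows seen on Node1, Node2, and Node3.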
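2. Manual Failover under Traditional Replication
Manual failover applies when the master has gone down but mha_manager is not running; the failover can then be performed by hand.
2.1 Manual Failover
• Shut down the database service on Node1
- # Shut down the database service on Node1
- mydba@192.168.85.132,3307 [replcrash]> shutdown;
- # Replication status on Node2 and Node3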
- mydba@192.168.85.133,3307 [replcrash]> pager cat | egrep 'Master_Log_File|Relay_Master_Log_File|Read_Master_Log_Pos|Exec_Master_Log_Pos|Running'
- PAGER set to 'cat | egrep 'Master_Log_File|Relay_Master_Log_File|Read_Master_Log_Pos|Exec_Master_Log_Pos|Running''
- mydba@192.168.85.133,3307 [replcrash]> show slave status\G
- Master_Log_File: mysql-bin.000004
- Read_Master_Log_Pos: 973
- Relay_Master_Log_File: mysql-bin.000004
- Slave_IO_Running: No
- Slave_SQL_Running: Yes
- Exec_Master_Log_Pos: 973
- Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
- 1 row in set (0.00 sec)
- mydba@192.168.85.133,3307 [replcrash]>
- mydba@192.168.85.134,3307 [replcrash]> pager cat | egrep 'Master_Log_File|Relay_Master_Log_File|Read_Master_Log_Pos|Exec_Master_Log_Pos|Running'
- PAGER set to 'cat | egrep 'Master_Log_File|Relay_Master_Log_File|Read_Master_Log_Pos|Exec_Master_Log_Pos|Running''
- mydba@192.168.85.134,3307 [replcrash]> show slave status\G
- Master_Log_File: mysql-bin.000004
- Read_Master_Log_Pos: 643
- Relay_Master_Log_File: mysql-bin.000004
- Slave_IO_Running: No
- Slave_SQL_Running: Yes
- Exec_Master_Log_Pos: 643
- Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
- 1 row in set (0.00 sec)
- mydba@192.168.85.134,3307 [replcrash]>
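At this point it makes no difference whether the slaves' io_thread is started: the master is already down, so the io_thread cannot connect to it anyway.
• Run the manual failover script, specifying Node3 as the new master
- # Manual failover for the failed Node1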
- [root@ZST3 app1]# masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.85.132 --dead_master_port= --master_state=dead --new_master_host=192.168.85.134 --new_master_port= --ignore_last_failover
Before the switch the replication topology is Node1->{Node2, Node3}; after the manual failover it becomes Node3->{Node2}.
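For scripted use, the same invocation can be assembled as an argument list. The sketch below only builds and prints the command and does not execute it. The flags are the ones used in the command above; the port value 3307 is an assumption taken from the host table in section 1.1, and `build_switch_command` is a hypothetical helper name.

```python
import shlex


def build_switch_command(dead_host, new_host, port,
                         global_conf="/etc/masterha/masterha_default.conf",
                         app_conf="/etc/masterha/app1.conf"):
    """Assemble the masterha_master_switch argument list for a dead master."""
    return [
        "masterha_master_switch",
        f"--global_conf={global_conf}",
        f"--conf={app_conf}",
        f"--dead_master_host={dead_host}",
        f"--dead_master_port={port}",
        "--master_state=dead",
        f"--new_master_host={new_host}",
        f"--new_master_port={port}",
        "--ignore_last_failover",
    ]


# ASSUMPTION: port 3307, as listed in the host table in section 1.1.
argv = build_switch_command("192.168.85.132", "192.168.85.134", 3307)
print(shlex.join(argv))
```

The tool prompts for confirmation twice during an interactive run, as the log in section 2.2 shows, so a wrapper would still need an operator at the terminal.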
2.2 Switchover Flow
Log output of the manual failover:
- # Manual failover
- [root@ZST3 app1]# masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.85.132 --dead_master_port= --master_state=dead --new_master_host=192.168.85.134 --new_master_port= --ignore_last_failover
- --dead_master_ip=<dead_master_ip> is not set. Using 192.168.85.132.
- Wed Mar :: - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
- Wed Mar :: - [info] Reading application default configuration from /etc/masterha/app1.conf..
- Wed Mar :: - [info] Reading server configuration from /etc/masterha/app1.conf..
- Wed Mar :: - [info] MHA::MasterFailover version 0.56.
- Wed Mar :: - [info] Starting master failover.
- Wed Mar :: - [info]
- ==================== 1. Configuration check phase, Start ====================
- Wed Mar :: - [info] * Phase : Configuration Check Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [debug] Connecting to servers..
- Wed Mar :: - [debug] Connected to: 192.168.85.133(192.168.85.133:), user=mydba
- Wed Mar :: - [debug] Number of slave worker threads on host 192.168.85.133(192.168.85.133:):
- Wed Mar :: - [debug] Connected to: 192.168.85.134(192.168.85.134:), user=mydba
- Wed Mar :: - [debug] Number of slave worker threads on host 192.168.85.134(192.168.85.134:):
- Wed Mar :: - [debug] Comparing MySQL versions..
- Wed Mar :: - [debug] Comparing MySQL versions done.
- Wed Mar :: - [debug] Connecting to servers done.
- Wed Mar :: - [info] GTID failover mode =
- Wed Mar :: - [info] Dead Servers:
- Wed Mar :: - [info] 192.168.85.132(192.168.85.132:)
- Wed Mar :: - [info] Checking master reachability via MySQL(double check)...
- Wed Mar :: - [info] ok.
- Wed Mar :: - [info] Alive Servers:
- Wed Mar :: - [info] 192.168.85.133(192.168.85.133:)
- Wed Mar :: - [info] 192.168.85.134(192.168.85.134:)
- Wed Mar :: - [info] Alive Slaves:
- Wed Mar :: - [info] 192.168.85.133(192.168.85.133:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Wed Mar :: - [debug] Relay log info repository: FILE
- Wed Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Wed Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Wed Mar :: - [info] 192.168.85.134(192.168.85.134:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Wed Mar :: - [debug] Relay log info repository: FILE
- Wed Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Wed Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- ******************** Choose whether to continue ********************
- Master 192.168.85.132(192.168.85.132:) is dead. Proceed? (yes/NO): yes
- Wed Mar :: - [info] Starting Non-GTID based failover.
- Wed Mar :: - [info]
- Wed Mar :: - [info] ** Phase : Configuration Check Phase completed.
- ==================== 1. Configuration check phase, End ====================
- Wed Mar :: - [info]
- ==================== 2. Dead master shutdown phase, Start ====================
- Wed Mar :: - [info] * Phase : Dead Master Shutdown Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [debug] Stopping IO thread on 192.168.85.133(192.168.85.133:)..
- Wed Mar :: - [debug] Stopping IO thread on 192.168.85.134(192.168.85.134:)..
- Wed Mar :: - [debug] Stop IO thread on 192.168.85.134(192.168.85.134:) done.
- Wed Mar :: - [debug] Stop IO thread on 192.168.85.133(192.168.85.133:) done.
- Wed Mar :: - [debug] SSH connection test to 192.168.85.132, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=, timeout
- Wed Mar :: - [info] HealthCheck: SSH to 192.168.85.132 is reachable.
- Wed Mar :: - [info] Forcing shutdown so that applications never connect to the current master..
- Wed Mar :: - [info] Executing master IP deactivation script:
- Wed Mar :: - [info] /etc/masterha/master_ip_failover --orig_master_host=192.168.85.132 --orig_master_ip=192.168.85.132 --orig_master_port= --command=stopssh --ssh_user=root
- Wed Mar :: - [info] done.
- Wed Mar :: - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
- Wed Mar :: - [info] * Phase : Dead Master Shutdown Phase completed.
- ==================== 2. Dead master shutdown phase, End ====================
- Wed Mar :: - [info]
- ==================== 3. New master recovery phase, Start ====================
- Wed Mar :: - [info] * Phase : Master Recovery Phase..
- Wed Mar :: - [info]
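- ==================== 3.1 Get the latest slave ====================
- ******************** The latest slave serves two purposes: (1) supplying the relay log events that other slaves are missing; (2) providing the start position for saving the dead master's binlog ********************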
- Wed Mar :: - [info] * Phase 3.1: Getting Latest Slaves Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [debug] Fetching current slave status..
- Wed Mar :: - [debug] Fetching current slave status done.
- Wed Mar :: - [info] The latest binary log file/position on all slaves is mysql-bin.:
- Wed Mar :: - [info] Latest slaves (Slaves that received relay log files to the latest):
- Wed Mar :: - [info] 192.168.85.133(192.168.85.133:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Wed Mar :: - [debug] Relay log info repository: FILE
- Wed Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Wed Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Wed Mar :: - [info] The oldest binary log file/position on all slaves is mysql-bin.:
- Wed Mar :: - [info] Oldest slaves:
- Wed Mar :: - [info] 192.168.85.134(192.168.85.134:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Wed Mar :: - [debug] Relay log info repository: FILE
- Wed Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Wed Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Wed Mar :: - [info]
- ==================== 3.2 Save the dead master's binlog ====================
- Wed Mar :: - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [info] Fetching dead master's binary logs..
- ******************** Runs on the dead master; takes only the part after the latest slave's position ********************
- Wed Mar :: - [info] Executing command on the dead master 192.168.85.132(192.168.85.132:): save_binary_logs --command=save --start_file=mysql-bin. --start_pos= --binlog_dir=/data/mysql/mysql3307/logs --output_file=/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog --handle_raw_binlog= --disable_log_bin= --manager_version=0.56 --debug
- Creating /var/log/masterha/app1 if not exists.. ok.
- Concat binary/relay logs from mysql-bin. pos to mysql-bin. EOF into /var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog ..
- parse_init_headers: file=mysql-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=mysql-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Got previous gtids log event: .
- parse_init_headers: file=mysql-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Dumping binlog format description event, from position to .. ok.
- Dumping effective binlog data from /data/mysql/mysql3307/logs/mysql-bin. position to tail().. ok.
- parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Got previous gtids log event: .
- parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Concat succeeded.
- saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog % .5KB/s :
- ******************** scp the saved master binlog to the working directory on the node running the MHA manager / the manual failover ********************
- Wed Mar :: - [info] scp from root@192.168.85.132:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog to local:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog succeeded.
- Wed Mar :: - [debug] SSH connection test to 192.168.85.133, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=, timeout
- Wed Mar :: - [info] HealthCheck: SSH to 192.168.85.133 is reachable.
- Wed Mar :: - [debug] SSH connection test to 192.168.85.134, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=, timeout
- Wed Mar :: - [info] HealthCheck: SSH to 192.168.85.134 is reachable.
- Wed Mar :: - [info]
- ==================== 3.3 Elect the new master ====================
- Wed Mar :: - [info] * Phase 3.3: Determining New Master Phase..
- Wed Mar :: - [info]
- ******************** Check whether the latest slave holds the relay log events that other slaves are missing ********************
- Wed Mar :: - [info] Finding the latest slave that has all relay logs for recovering other slaves..
- Wed Mar :: - [info] Checking whether 192.168.85.133 has relay logs from the oldest position..
- Wed Mar :: - [info] Executing command: apply_diff_relay_logs --command=find --latest_mlf=mysql-bin. --latest_rmlp= --target_mlf=mysql-bin. --target_rmlp= --server_id= --workdir=/var/log/masterha/app1 --timestamp= --manager_version=0.56 --relay_log_info=/data/mysql/mysql3307/data/relay-log.info --relay_dir=/data/mysql/mysql3307/data/ --debug :
- Opening /data/mysql/mysql3307/data/relay-log.info ... ok.
- Relay log found at /data/mysql/mysql3307/data, up to relay-bin.
- Fast relay log position search succeeded.
- Target relay log file/position found. start_file:relay-bin., start_pos:.
- Target relay log FOUND!
- Wed Mar :: - [info] OK. 192.168.85.133 has all relay logs.
- Wed Mar :: - [info] 192.168.85.134 can be new master.
- Wed Mar :: - [info] New master is 192.168.85.134(192.168.85.134:)
- Wed Mar :: - [info] Starting master failover..
- Wed Mar :: - [info]
- From:
- 192.168.85.132(192.168.85.132:) (current master)
- +--192.168.85.133(192.168.85.133:)
- +--192.168.85.134(192.168.85.134:)
- To:
- 192.168.85.134(192.168.85.134:) (new master)
- +--192.168.85.133(192.168.85.133:)
- ******************** Choose whether to perform the switch ********************
- Starting master switch from 192.168.85.132(192.168.85.132:) to 192.168.85.134(192.168.85.134:)? (yes/NO): yes
- Wed Mar :: - [info] New master decided manually is 192.168.85.134(192.168.85.134:)
- Wed Mar :: - [info]
- Wed Mar :: - [info] * Phase 3.3: New Master Diff Log Generation Phase..
- Wed Mar :: - [info]
- ******************** On the latest slave, generate the relay log diff between the new master and the latest slave ********************
- Wed Mar :: - [info] Server 192.168.85.134 received relay logs up to: mysql-bin.:
- Wed Mar :: - [info] Need to get diffs from the latest slave(192.168.85.133) up to: mysql-bin.: (using the latest slave's relay logs)
- Wed Mar :: - [info] Connecting to the latest slave host 192.168.85.133, generating diff relay log files..
- Wed Mar :: - [info] Executing command: apply_diff_relay_logs --command=generate_and_send --scp_user=root --scp_host=192.168.85.134 --latest_mlf=mysql-bin. --latest_rmlp= --target_mlf=mysql-bin. --target_rmlp= --server_id= --diff_file_readtolatest=/var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog --workdir=/var/log/masterha/app1 --timestamp= --handle_raw_binlog= --disable_log_bin= --manager_version=0.56 --relay_log_info=/data/mysql/mysql3307/data/relay-log.info --relay_dir=/data/mysql/mysql3307/data/ --debug
- Wed Mar :: - [info]
- Opening /data/mysql/mysql3307/data/relay-log.info ... ok.
- Relay log found at /data/mysql/mysql3307/data, up to relay-bin.
- Fast relay log position search succeeded.
- Target relay log file/position found. start_file:relay-bin., start_pos:.
- Concat binary/relay logs from relay-bin. pos to relay-bin. EOF into /var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog ..
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Got previous gtids log event: .
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- parse_init_headers: file=relay-bin. event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Dumping binlog format description event, from position to .. ok.
- Dumping effective binlog data from /data/mysql/mysql3307/data/relay-bin. position to tail().. ok.
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Got previous gtids log event: .
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- parse_init_headers: file=relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Concat succeeded.
- Generating diff relay log succeeded. Saved at /var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog .
- ******************** scp the resulting relay log diff to the new master's working directory ********************
- scp ZST2:/var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog to root@192.168.85.134() succeeded.
- Wed Mar :: - [info] Generating diff files succeeded.
- Wed Mar :: - [info] Sending binlog..
- saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog % .5KB/s :
- ******************** scp the dead master's binlog from the working directory on the MHA manager / manual failover node to the new master's working directory ********************
- Wed Mar :: - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog to root@192.168.85.134:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog succeeded.
- Wed Mar :: - [info]
- ==================== 3.4 New master applies the differential logs ====================
- Wed Mar :: - [info] * Phase 3.4: Master Log Apply Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
- Wed Mar :: - [info] Starting recovery on 192.168.85.134(192.168.85.134:)..
- Wed Mar :: - [info] Generating diffs succeeded.
- ******************** Wait until the new master has applied its own relay log ********************
- Wed Mar :: - [info] Waiting until all relay logs are applied.
- Wed Mar :: - [info] done.
- Wed Mar :: - [debug] Stopping SQL thread on 192.168.85.134(192.168.85.134:)..
- Wed Mar :: - [debug] done.
- Wed Mar :: - [info] Getting slave status..
- Wed Mar :: - [info] This slave(192.168.85.134)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.:). No need to recover from Exec_Master_Log_Pos.
- Wed Mar :: - [debug] Current max_allowed_packet is .
- Wed Mar :: - [debug] Tentatively setting max_allowed_packet to 1GB succeeded.
- Wed Mar :: - [info] Connecting to the target slave host 192.168.85.134, running recover script..
- ******************** The new master applies, in order, the relay log diff from the latest slave and then the binlog saved from the dead master ********************
- Wed Mar :: - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mydba' --slave_host=192.168.85.134 --slave_ip=192.168.85.134 --slave_port= --apply_files=/var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog,/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog --workdir=/var/log/masterha/app1 --target_version=5.7.-log --timestamp= --handle_raw_binlog= --disable_log_bin= --manager_version=0.56 --debug --slave_pass=xxx
- Wed Mar :: - [info]
- ******************** Concatenate all missing relay log and binlog events into total_binlog ********************
- Concat all apply files to /var/log/masterha/app1/total_binlog_for_192.168.85.134_3307..binlog ..
- Copying the first binlog file /var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog to /var/log/masterha/app1/total_binlog_for_192.168.85.134_3307..binlog.. ok.
- Dumping binlog head events (rotate events), skipping format description events from /var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog.. parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Binlog Checksum enabled
- parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- Got previous gtids log event: .
- parse_init_headers: file=saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog event_type= server_id= length= nextmpos= prevrelay= cur(post)relay=
- dumped up to pos . ok.
- /var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog has effective binlog events from pos .
- Dumping effective binlog data from /var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog position to tail().. ok.
- Concat succeeded.
- All apply target binary logs are concatinated at /var/log/masterha/app1/total_binlog_for_192.168.85.134_3307..binlog .
- MySQL client version is 5.7.. Using --binary-mode.
- Applying differential binary/relay log files /var/log/masterha/app1/relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog,/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog on 192.168.85.134:. This may take long time...
- Applying log files succeeded.
- Wed Mar :: - [debug] Setting max_allowed_packet back to succeeded.
- Wed Mar :: - [info] All relay logs were successfully applied.
- ******************** Once the new master has applied all relay logs and binlogs, get its current position ********************
- Wed Mar :: - [info] Getting new master's binlog name and position..
- Wed Mar :: - [info] mysql-bin.:
- Wed Mar :: - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.85.134', MASTER_PORT=, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=, MASTER_USER='repl', MASTER_PASSWORD='xxx';
- ******************** Bring up the virtual IP; the new master can now serve traffic ********************
- Wed Mar :: - [info] Executing master IP activate script:
- Wed Mar :: - [info] /etc/masterha/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.85.132 --orig_master_ip=192.168.85.132 --orig_master_port= --new_master_host=192.168.85.134 --new_master_ip=192.168.85.134 --new_master_port= --new_master_user='mydba' --new_master_password='mysql5721'
- Set read_only= on the new master.
- Wed Mar :: - [info] OK.
- Wed Mar :: - [info] ** Finished master recovery successfully.
- Wed Mar :: - [info] * Phase : Master Recovery Phase completed.
- ==================== 3. New master recovery phase, End ====================
- Wed Mar :: - [info]
- ==================== 4. Slave recovery phase, Start ====================
- ******************** Slave recovery resembles the new master's recovery: first obtain the relay log diff from the latest slave, then obtain the dead master's binlog ********************
- Wed Mar :: - [info] * Phase : Slaves Recovery Phase..
- Wed Mar :: - [info]
- ==================== 4.1 Generate the differential logs between the latest slave and each slave ====================
- Wed Mar :: - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [info] -- Slave diff file generation on host 192.168.85.133(192.168.85.133:) started, pid: . Check tmp log /var/log/masterha/app1/192.168..133_3307_20180328160107.log if it takes time..
- Wed Mar :: - [info]
- Wed Mar :: - [info] Log messages from 192.168.85.133 ...
- Wed Mar :: - [info]
- Wed Mar :: - [info] This server has all relay logs. No need to generate diff files from the latest slave.
- Wed Mar :: - [info] End of log messages from 192.168.85.133.
- Wed Mar :: - [info] -- 192.168.85.133(192.168.85.133:) has the latest relay log events.
- Wed Mar :: - [info] Generating relay diff files from the latest slave succeeded.
- Wed Mar :: - [info]
- ==================== 4.2 Slaves apply the differential logs ====================
- Wed Mar :: - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
- Wed Mar :: - [info]
- Wed Mar :: - [info] -- Slave recovery on host 192.168.85.133(192.168.85.133:) started, pid: . Check tmp log /var/log/masterha/app1/192.168..133_3307_20180328160107.log if it takes time..
- saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog % .5KB/s :
- Wed Mar :: - [debug] Explicitly disabled relay_log_purge.
- Wed Mar :: - [info]
- Wed Mar :: - [info] Log messages from 192.168.85.133 ...
- Wed Mar :: - [info]
- Wed Mar :: - [info] Sending binlog..
- ******************** scp the dead master's binlog from the working directory on the MHA manager / manual failover node to the slave's working directory ********************
- Wed Mar :: - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog to root@192.168.85.133:/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog succeeded.
- Wed Mar :: - [info] Starting recovery on 192.168.85.133(192.168.85.133:)..
- Wed Mar :: - [info] Generating diffs succeeded.
- Wed Mar :: - [info] Waiting until all relay logs are applied.
- Wed Mar :: - [info] done.
- Wed Mar :: - [debug] Stopping SQL thread on 192.168.85.133(192.168.85.133:)..
- Wed Mar :: - [debug] done.
- Wed Mar :: - [info] Getting slave status..
- Wed Mar :: - [info] This slave(192.168.85.133)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.:). No need to recover from Exec_Master_Log_Pos.
- Wed Mar :: - [debug] Current max_allowed_packet is .
- Wed Mar :: - [debug] Tentatively setting max_allowed_packet to 1GB succeeded.
- Wed Mar :: - [info] Connecting to the target slave host 192.168.85.133, running recover script..
- ******************** The slave applies, in order, the relay log diff from the latest slave and then the binlog saved from the dead master ********************
- Wed Mar :: - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mydba' --slave_host=192.168.85.133 --slave_ip=192.168.85.133 --slave_port= --apply_files=/var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog --workdir=/var/log/masterha/app1 --target_version=5.7.-log --timestamp= --handle_raw_binlog= --disable_log_bin= --manager_version=0.56 --debug --slave_pass=xxx
- Wed Mar :: - [info]
- MySQL client version is 5.7.. Using --binary-mode.
- Applying differential binary/relay log files /var/log/masterha/app1/saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog on 192.168.85.133:. This may take long time...
- Applying log files succeeded.
- Wed Mar :: - [debug] Setting max_allowed_packet back to succeeded.
- Wed Mar :: - [info] All relay logs were successfully applied.
- Wed Mar :: - [info] Resetting slave 192.168.85.133(192.168.85.133:) and starting replication from the new master 192.168.85.134(192.168.85.134:)..
- Wed Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.133(192.168.85.133:)..
- Wed Mar :: - [debug] done.
- Wed Mar :: - [info] Executed CHANGE MASTER.
- Wed Mar :: - [debug] Starting slave IO/SQL thread on 192.168.85.133(192.168.85.133:)..
- Wed Mar :: - [debug] done.
- Wed Mar :: - [info] Slave started.
- Wed Mar :: - [info] End of log messages from 192.168.85.133.
- Wed Mar :: - [info] -- Slave recovery on host 192.168.85.133(192.168.85.133:) succeeded.
- Wed Mar :: - [info] All new slave servers recovered successfully.
- ==================== 4. Slave recovery phase, End ====================
- Wed Mar :: - [info]
- ==================== 5. New master cleanup phase, Start ====================
- Wed Mar :: - [info] * Phase : New master cleanup phase..
- Wed Mar :: - [info]
- Wed Mar :: - [info] Resetting slave info on the new master..
- Wed Mar :: - [debug] Clearing slave info..
- Wed Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.134(192.168.85.134:)..
- Wed Mar :: - [debug] done.
- Wed Mar :: - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
- Wed Mar :: - [info] 192.168.85.134: Resetting slave info succeeded.
- ==================== 5. New master cleanup phase, End ====================
- Wed Mar :: - [info] Master failover to 192.168.85.134(192.168.85.134:) completed successfully.
- Wed Mar :: - [debug] Disconnected from 192.168.85.133(192.168.85.133:)
- Wed Mar :: - [debug] Disconnected from 192.168.85.134(192.168.85.134:)
- Wed Mar :: - [info]
- ----- Failover Report -----
- app1: MySQL Master failover 192.168.85.132(192.168.85.132:) to 192.168.85.134(192.168.85.134:) succeeded
- Master 192.168.85.132(192.168.85.132:) is down!
- Check MHA Manager logs at ZST3 for details.
- Started manual(interactive) failover.
- Invalidated master IP address on 192.168.85.132(192.168.85.132:)
- The latest slave 192.168.85.133(192.168.85.133:) has all relay logs for recovery.
- Selected 192.168.85.134(192.168.85.134:) as a new master.
- 192.168.85.134(192.168.85.134:): OK: Applying all logs succeeded.
- 192.168.85.134(192.168.85.134:): OK: Activated master IP address.
- 192.168.85.133(192.168.85.133:): This host has the latest relay log events.
- Generating relay diff files from the latest slave succeeded.
- 192.168.85.133(192.168.85.133:): OK: Applying all logs succeeded. Slave started, replicating from 192.168.85.134(192.168.85.134:)
- 192.168.85.134(192.168.85.134:): Resetting slave info succeeded.
- Master failover to 192.168.85.134(192.168.85.134:) completed successfully.
- [root@ZST3 app1]#
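Manual failover procedure
- Manual failover (traditional replication)
- 1. Configuration check: connect to each instance, check service status, and verify the master-slave topology
- 2. Dead master shutdown: stop the IO thread on every slave and remove the virtual IP from the failed master (stopssh)
- 3. New master recovery
- 3.1. Get the latest slave
- It is used to fill in the data missing on the new master and the other slaves, and it gives the start position for saving the failed master's binlog
- 3.2. Save the failed master's binlog
- Run save_binary_logs on the failed master (only the part after the latest slave's position is taken)
- scp the resulting binlog to the working directory of the host running the manual failover
- 3.3. Elect the new master
- Check whether the latest slave still holds the relay logs that the oldest slave is missing
- Decide the new master and show the topology before and after the switch
- Generate the differential relay log between the latest slave and the new master, and copy it to the new master's working directory
- scp the failed master's binlog from the manual-failover working directory to the new master's working directory
- 3.4. New master applies the differential logs
- Wait for the new master to finish applying its own relay log
- Apply, in order, the relay logs missing relative to the latest slave, then the binlog saved from the failed master
- Merge all missing relay logs and binlogs into total_binlog
- Record the new master's binlog file:pos; the other slaves start replicating from this position
- Bind the virtual IP so the new master can serve clients
- 4. Recovery of the other slaves
- 4.1. Generate the differential logs
- Generate the differential relay log between the latest slave and this slave and copy it to the slave's working directory; scp the failed master's binlog from the manual-failover working directory to the slave's working directory
- 4.2. Slave applies the differential logs
- Wait for the slave to finish applying its own relay log; apply, in order, the relay logs missing relative to the latest slave, then the binlog saved from the failed master; repoint the slave's replication at the new master
- 4.3. If there are several slaves, repeat the steps above
- 5. New master cleanup: clear the old replication info with STOP SLAVE; RESET SLAVE ALL;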
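The decision made in step 3.1 can be illustrated with a small sketch. It is not MHA's code: it only assumes that each slave reports the (Master_Log_File, Read_Master_Log_Pos) pair from SHOW SLAVE STATUS, and the function names and sample positions are hypothetical.

```python
from typing import Dict, Tuple

# (Master_Log_File, Read_Master_Log_Pos) as reported by SHOW SLAVE STATUS.
Position = Tuple[str, int]


def binlog_key(pos: Position) -> Tuple[int, int]:
    """Turn ('mysql-bin.000003', 154) into (3, 154) so positions sort correctly."""
    file_name, offset = pos
    return int(file_name.rsplit(".", 1)[1]), offset


def plan_traditional_recovery(read_positions: Dict[str, Position]) -> Dict[str, object]:
    """Pick the latest slave and list the slaves that need a relay-log diff."""
    latest = max(read_positions, key=lambda host: binlog_key(read_positions[host]))
    latest_key = binlog_key(read_positions[latest])
    behind = sorted(
        host for host, pos in read_positions.items() if binlog_key(pos) < latest_key
    )
    return {
        "latest_slave": latest,
        # save_binary_logs on the failed master would start from this position.
        "save_binlog_from": read_positions[latest],
        # These hosts need relay_from_read_to_latest_* before the saved master binlog.
        "needs_relay_diff": behind,
    }


if __name__ == "__main__":
    # Hypothetical positions; the real values are not recoverable from the log.
    positions = {
        "192.168.85.133": ("mysql-bin.000002", 1200),
        "192.168.85.134": ("mysql-bin.000002", 800),
    }
    print(plan_traditional_recovery(positions))
```

A slave that is already level with the latest slave only needs the saved master binlog, which matches the recovery of 192.168.85.133 in the log above.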
2.3. Directory files
The switchover has to fill in missing data, so it produces several kinds of files
- # Failed master
- [root@ZST1 app1]# ll
- total
- -rw-r--r-- root root Mar : saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog
- [root@ZST1 app1]#
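Dead Master
saved_master_binlog_from_**: the differential binlog between the failed master and the latest slave; generated on the failed master, then copied to the working directory of the MHA manager node / the host running the manual failover
- # Latest slave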
- [root@ZST2 app1]# ll
- total
- -rw-r--r--. root root Mar : relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog
- -rw-r--r--. root root Mar : relay_log_apply_for_192.168.85.133_3307_20180328160107_err.log
- -rw-r--r--. root root Mar : saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog
- [root@ZST2 app1]#
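Latest Slave
relay_from_read_to_latest_**: the differential relay log between the latest slave and another slave; generated on the latest slave, then copied to the slave it was generated for
saved_master_binlog_from_**: copied from the manager node; it originates on the failed master
- # New master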
- [root@ZST3 app1]# ll
- total
- -rw-r--r--. root root Mar : app1.failover.complete
- -rw-r--r--. root root Mar : relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog
- -rw-r--r--. root root Mar : relay_log_apply_for_192.168.85.134_3307_20180328160107_err.log
- -rw-r--r--. root root Mar : saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog
- -rw-r--r--. root root Mar : total_binlog_for_192.168.85.134_3307..binlog
- [root@ZST3 app1]#
New Master
relay_from_read_to_latest_**: copied from the latest slave
saved_master_binlog_from_**: copied from the manager node; it originates on the failed master
total_binlog_for_**: all missing relay-log and binlog events merged into one file
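As a quick reference, the file roles described above can be captured in a lookup keyed by file-name prefix. This is only an illustrative sketch: the helper name `describe_workdir_file` is hypothetical, and files the article does not describe (for example `app1.failover.complete`) are deliberately left out.

```python
from typing import Optional

# Prefix -> meaning, taken from the directory listings of the traditional run.
FILE_ROLES = {
    "saved_master_binlog_from_": "differential binlog: failed master vs. latest slave",
    "relay_from_read_to_latest_": "differential relay log: latest slave vs. another slave",
    "total_binlog_for_": "all missing relay-log and binlog events merged",
}


def describe_workdir_file(file_name: str) -> Optional[str]:
    """Return the role of a file in the MHA working directory, or None if unknown."""
    for prefix, meaning in FILE_ROLES.items():
        if file_name.startswith(prefix):
            return meaning
    return None


if __name__ == "__main__":
    name = "saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog"
    print(describe_workdir_file(name))
```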
• Decode the differential logs to inspect the events they contain
- # Differential relay log between the latest slave and the other slaves
- [root@ZST3 app1]# mysqlbinlog -vv --base64-output=decode-rows relay_from_read_to_latest_192.168.85.134_3307_20180328160107.binlog
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
- /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
- DELIMITER /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x152b7e41 Start: binlog v , server v 5.7.-log created ::
- # This Format_description_event appears in a relay log and was generated by the slave thread.
- # at
- # :: server id end_log_pos CRC32 0x5ea2e9c6 Previous-GTIDs
- # [empty]
- # at
- # :: server id end_log_pos CRC32 0x2076d50b Rotate to mysql-bin. pos:
- # at
- # :: server id end_log_pos CRC32 0x9b1488de Start: binlog v , server v 5.7.-log created :: at startup
- ROLLBACK/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x838279dd Rotate to mysql-bin. pos:
- # at
- # :: server id end_log_pos CRC32 0x9fba3aa7 Anonymous_GTID last_committed= sequence_number= rbr_only=yes
- /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
- SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x112f5399 Query thread_id= exec_time= error_code=
- SET TIMESTAMP=/*!*/;
- SET @@session.pseudo_thread_id=/*!*/;
- SET @@session.foreign_key_checks=, @@session.sql_auto_is_null=, @@session.unique_checks=, @@session.autocommit=/*!*/;
- SET @@session.sql_mode=/*!*/;
- SET @@session.auto_increment_increment=, @@session.auto_increment_offset=/*!*/;
- /*!\C utf8 *//*!*/;
- SET @@session.character_set_client=,@@session.collation_connection=,@@session.collation_server=/*!*/;
- SET @@session.time_zone='SYSTEM'/*!*/;
- SET @@session.lc_time_names=/*!*/;
- SET @@session.collation_database=DEFAULT/*!*/;
- BEGIN
- /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x890cf300 Table_map: `replcrash`.`py_user` mapped to number
- # at
- # :: server id end_log_pos CRC32 0xccb038f5 Write_rows: table id flags: STMT_END_F
- ### INSERT INTO `replcrash`.`py_user`
- ### SET
- ### @= /* INT meta=0 nullable=0 is_null=0 */
- ### @='272f15ee-325d-11e8-88e6-000c29c1' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
- ### @='2018-03-28 15:53:50' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
- ### @='' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
- # at
- # :: server id end_log_pos CRC32 0xbfda64ba Xid =
- COMMIT/*!*/;
- SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
- DELIMITER ;
- # End of log file
- /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
- [root@ZST3 app1]#
- # Differential binlog between the failed master and the latest slave
- [root@ZST3 app1]# mysqlbinlog -vv --base64-output=decode-rows saved_master_binlog_from_192.168.85.132_3307_20180328160107.binlog
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
- /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
- DELIMITER /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x9b1488de Start: binlog v , server v 5.7.-log created :: at startup
- ROLLBACK/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x37f9307d Previous-GTIDs
- # [empty]
- # at
- # :: server id end_log_pos CRC32 0x74680cfa Anonymous_GTID last_committed= sequence_number= rbr_only=yes
- /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
- SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x3774a1d0 Query thread_id= exec_time= error_code=
- SET TIMESTAMP=/*!*/;
- SET @@session.pseudo_thread_id=/*!*/;
- SET @@session.foreign_key_checks=, @@session.sql_auto_is_null=, @@session.unique_checks=, @@session.autocommit=/*!*/;
- SET @@session.sql_mode=/*!*/;
- SET @@session.auto_increment_increment=, @@session.auto_increment_offset=/*!*/;
- /*!\C utf8 *//*!*/;
- SET @@session.character_set_client=,@@session.collation_connection=,@@session.collation_server=/*!*/;
- SET @@session.time_zone='SYSTEM'/*!*/;
- SET @@session.lc_time_names=/*!*/;
- SET @@session.collation_database=DEFAULT/*!*/;
- BEGIN
- /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x1468e6b1 Table_map: `replcrash`.`py_user` mapped to number
- # at
- # :: server id end_log_pos CRC32 0x79523051 Write_rows: table id flags: STMT_END_F
- ### INSERT INTO `replcrash`.`py_user`
- ### SET
- ### @= /* INT meta=0 nullable=0 is_null=0 */
- ### @='2d8900cc-325d-11e8-88e6-000c29c1' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
- ### @='2018-03-28 15:54:01' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
- ### @='' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
- # at
- # :: server id end_log_pos CRC32 0xb93ce981 Xid =
- COMMIT/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x577dc41e Stop
- SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
- DELIMITER ;
- # End of log file
- /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
- [root@ZST3 app1]#
- # All missing relay-log and binlog events combined
- [root@ZST3 app1]# mysqlbinlog -vv --base64-output=decode-rows total_binlog_for_192.168.85.134_3307..binlog
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
- /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
- DELIMITER /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x152b7e41 Start: binlog v , server v 5.7.-log created ::
- # This Format_description_event appears in a relay log and was generated by the slave thread.
- # at
- # :: server id end_log_pos CRC32 0x5ea2e9c6 Previous-GTIDs
- # [empty]
- # at
- # :: server id end_log_pos CRC32 0x2076d50b Rotate to mysql-bin. pos:
- # at
- # :: server id end_log_pos CRC32 0x9b1488de Start: binlog v , server v 5.7.-log created :: at startup
- ROLLBACK/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x838279dd Rotate to mysql-bin. pos:
- # at
- # :: server id end_log_pos CRC32 0x9fba3aa7 Anonymous_GTID last_committed= sequence_number= rbr_only=yes
- /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
- SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x112f5399 Query thread_id= exec_time= error_code=
- SET TIMESTAMP=/*!*/;
- SET @@session.pseudo_thread_id=/*!*/;
- SET @@session.foreign_key_checks=, @@session.sql_auto_is_null=, @@session.unique_checks=, @@session.autocommit=/*!*/;
- SET @@session.sql_mode=/*!*/;
- SET @@session.auto_increment_increment=, @@session.auto_increment_offset=/*!*/;
- /*!\C utf8 *//*!*/;
- SET @@session.character_set_client=,@@session.collation_connection=,@@session.collation_server=/*!*/;
- SET @@session.time_zone='SYSTEM'/*!*/;
- SET @@session.lc_time_names=/*!*/;
- SET @@session.collation_database=DEFAULT/*!*/;
- BEGIN
- /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x890cf300 Table_map: `replcrash`.`py_user` mapped to number
- # at
- # :: server id end_log_pos CRC32 0xccb038f5 Write_rows: table id flags: STMT_END_F
- ### INSERT INTO `replcrash`.`py_user`
- ### SET
- ### @= /* INT meta=0 nullable=0 is_null=0 */
- ### @='272f15ee-325d-11e8-88e6-000c29c1' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
- ### @='2018-03-28 15:53:50' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
- ### @='' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
- # at
- # :: server id end_log_pos CRC32 0xbfda64ba Xid =
- COMMIT/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x74680cfa Anonymous_GTID last_committed= sequence_number= rbr_only=yes
- /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
- SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x3774a1d0 Query thread_id= exec_time= error_code=
- SET TIMESTAMP=/*!*/;
- BEGIN
- /*!*/;
- # at
- # :: server id end_log_pos CRC32 0x1468e6b1 Table_map: `replcrash`.`py_user` mapped to number
- # at
- # :: server id end_log_pos CRC32 0x79523051 Write_rows: table id flags: STMT_END_F
- ### INSERT INTO `replcrash`.`py_user`
- ### SET
- ### @= /* INT meta=0 nullable=0 is_null=0 */
- ### @='2d8900cc-325d-11e8-88e6-000c29c1' /* VARSTRING(96) meta=96 nullable=1 is_null=0 */
- ### @='2018-03-28 15:54:01' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
- ### @='' /* VARSTRING(30) meta=30 nullable=1 is_null=0 */
- # at
- # :: server id end_log_pos CRC32 0xb93ce981 Xid =
- COMMIT/*!*/;
- # at
- # :: server id end_log_pos CRC32 0x577dc41e Stop
- SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
- DELIMITER ;
- # End of log file
- /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
- /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
- [root@ZST3 app1]#
After the manual failover the topology is Node3->{Node2}, and the missing data was filled in automatically
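To check that the merged file really carries both missing rows, the decoded `mysqlbinlog -vv` output can be summarized by counting row events per table. The sketch below is a hypothetical helper, not part of MHA; it assumes the pseudo-SQL lines start with `### ` as in real mysqlbinlog output (the listing in this article carries an extra leading `- ` from the page formatting).

```python
import re
from collections import Counter
from typing import Iterable

# Matches the pseudo-SQL header lines printed by mysqlbinlog -vv for row events.
ROW_EVENT = re.compile(r"^### (INSERT INTO|UPDATE|DELETE FROM) `([^`]+)`\.`([^`]+)`")


def count_row_events(lines: Iterable[str]) -> Counter:
    """Count decoded row events, keyed by (verb, 'schema.table')."""
    counts: Counter = Counter()
    for line in lines:
        match = ROW_EVENT.match(line.strip())
        if match:
            verb, schema, table = match.groups()
            counts[(verb.split()[0], f"{schema}.{table}")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "### INSERT INTO `replcrash`.`py_user`",
        "### SET",
        "### INSERT INTO `replcrash`.`py_user`",
    ]
    print(count_row_events(sample))
```

Fed with the decoded total_binlog_for_** output, such a count would be expected to show two INSERT events on `replcrash`.`py_user`: one from the relay-log difference and one from the saved master binlog.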
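3. Manual failover under GTID replication
3.1. MHA configuration changes
In GTID mode MHA needs a [binlog*] section, which can point to a dedicated binlog server or to the master's own binlog directory. Without [binlog*], MHA does not fetch binlogs from the master even when the master host is still reachable, so any events that had not yet reached a slave are lost
- # Append the binlog server definition to the end of app1.conf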
- [root@ZST1 masterha]# cat app1.conf
- ...
- [binlog1]
- hostname=192.168.85.132
- master_binlog_dir=/data/mysql/mysql3307/logs
- no_master=
- [root@ZST1 masterha]#
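Because a missing [binlog*] section silently costs data in GTID mode, it can be worth checking the configuration before a failover. The sketch below uses Python's standard `configparser`, which happens to read this INI-style layout; the function name is hypothetical, and the `no_master=1` value in the sample is an assumption, since the listing above shows the key without its value.

```python
import configparser
from typing import List


def binlog_sections(conf_text: str) -> List[str]:
    """Return the names of all [binlog*] sections in an MHA-style config."""
    # interpolation=None so that '%' in passwords is never treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return [name for name in parser.sections() if name.startswith("binlog")]


SAMPLE = """
[server default]
manager_workdir=/var/log/masterha/app1

[server1]
hostname=192.168.85.132

[binlog1]
hostname=192.168.85.132
master_binlog_dir=/data/mysql/mysql3307/logs
# Assumed value; the article's listing lost the digit.
no_master=1
"""

if __name__ == "__main__":
    found = binlog_sections(SAMPLE)
    if not found:
        print("warning: no [binlog*] section, dead-master binlogs will not be saved")
    else:
        print("binlog servers:", found)
```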
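3.2. Manual failover
The topology is the same one-master, two-slave setup built on Row+GTID: Node1->{Node2, Node3}. Regenerate the test data, stop the MySQL service on Node1, and run the manual failover command
- # GTID + manual failover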
- [root@ZST1 masterha]# masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.85.132 --dead_master_port= --master_state=dead --new_master_host=192.168.85.134 --new_master_port= --ignore_last_failover
- --dead_master_ip=<dead_master_ip> is not set. Using 192.168.85.132.
- Thu Mar :: - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
- Thu Mar :: - [info] Reading application default configuration from /etc/masterha/app1.conf..
- Thu Mar :: - [info] Reading server configuration from /etc/masterha/app1.conf..
- Thu Mar :: - [info] MHA::MasterFailover version 0.56.
- Thu Mar :: - [info] Starting master failover.
- Thu Mar :: - [info]
- ==================== 1. Configuration check phase, Start ====================
- Thu Mar :: - [info] * Phase : Configuration Check Phase..
- Thu Mar :: - [info]
- Thu Mar :: - [debug] SSH connection test to 192.168.85.132, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=, timeout
- Thu Mar :: - [info] HealthCheck: SSH to 192.168.85.132 is reachable.
- Thu Mar :: - [info] Binlog server 192.168.85.132 is reachable.
- Thu Mar :: - [debug] Connecting to servers..
- Thu Mar :: - [debug] Connected to: 192.168.85.133(192.168.85.133:), user=mydba
- Thu Mar :: - [debug] Number of slave worker threads on host 192.168.85.133(192.168.85.133:):
- Thu Mar :: - [debug] Connected to: 192.168.85.134(192.168.85.134:), user=mydba
- Thu Mar :: - [debug] Number of slave worker threads on host 192.168.85.134(192.168.85.134:):
- Thu Mar :: - [debug] Comparing MySQL versions..
- Thu Mar :: - [debug] Comparing MySQL versions done.
- Thu Mar :: - [debug] Connecting to servers done.
- Thu Mar :: - [info] GTID failover mode =
- Thu Mar :: - [info] Dead Servers:
- Thu Mar :: - [info] 192.168.85.132(192.168.85.132:)
- Thu Mar :: - [info] Checking master reachability via MySQL(double check)...
- Thu Mar :: - [info] ok.
- Thu Mar :: - [info] Alive Servers:
- Thu Mar :: - [info] 192.168.85.133(192.168.85.133:)
- Thu Mar :: - [info] 192.168.85.134(192.168.85.134:)
- Thu Mar :: - [info] Alive Slaves:
- Thu Mar :: - [info] 192.168.85.133(192.168.85.133:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Thu Mar :: - [info] GTID ON
- Thu Mar :: - [debug] Relay log info repository: FILE
- Thu Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Thu Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Thu Mar :: - [info] 192.168.85.134(192.168.85.134:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Thu Mar :: - [info] GTID ON
- Thu Mar :: - [debug] Relay log info repository: FILE
- Thu Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Thu Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- ******************** Confirm whether to proceed ********************
- Master 192.168.85.132(192.168.85.132:) is dead. Proceed? (yes/NO): yes
- Thu Mar :: - [info] Starting GTID based failover.
- Thu Mar :: - [info]
- Thu Mar :: - [info] ** Phase : Configuration Check Phase completed.
- ==================== 1. Configuration check phase, End ====================
- Thu Mar :: - [info]
- ==================== 2. Dead master shutdown phase, Start ====================
- Thu Mar :: - [info] * Phase : Dead Master Shutdown Phase..
- Thu Mar :: - [info]
- Thu Mar :: - [debug] SSH connection test to 192.168.85.132, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=, timeout
- Thu Mar :: - [debug] Stopping IO thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] Stopping IO thread on 192.168.85.133(192.168.85.133:)..
- Thu Mar :: - [debug] Stop IO thread on 192.168.85.133(192.168.85.133:) done.
- Thu Mar :: - [debug] Stop IO thread on 192.168.85.134(192.168.85.134:) done.
- Thu Mar :: - [info] HealthCheck: SSH to 192.168.85.132 is reachable.
- Thu Mar :: - [info] Forcing shutdown so that applications never connect to the current master..
- Thu Mar :: - [info] Executing master IP deactivation script:
- Thu Mar :: - [info] /etc/masterha/master_ip_failover --orig_master_host=192.168.85.132 --orig_master_ip=192.168.85.132 --orig_master_port= --command=stopssh --ssh_user=root
- Thu Mar :: - [info] done.
- Thu Mar :: - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
- Thu Mar :: - [info] * Phase : Dead Master Shutdown Phase completed.
- ==================== 2. Dead master shutdown phase, End ====================
- Thu Mar :: - [info]
- ==================== 3. New master recovery phase, Start ====================
- Thu Mar :: - [info] * Phase : Master Recovery Phase..
- Thu Mar :: - [info]
- ==================== 3.1. Get the latest slave ====================
- ******************** The latest slave is used to fill in the data missing on the new master, and it gives the start position for saving the failed master's binlog ********************
- Thu Mar :: - [info] * Phase 3.1: Getting Latest Slaves Phase..
- Thu Mar :: - [info]
- Thu Mar :: - [debug] Fetching current slave status..
- Thu Mar :: - [debug] Fetching current slave status done.
- Thu Mar :: - [info] The latest binary log file/position on all slaves is mysql-bin.:
- Thu Mar :: - [info] Retrieved Gtid Set: 90b30799--11e7--000c29c1025c:-
- Thu Mar :: - [info] Latest slaves (Slaves that received relay log files to the latest):
- Thu Mar :: - [info] 192.168.85.133(192.168.85.133:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Thu Mar :: - [info] GTID ON
- Thu Mar :: - [debug] Relay log info repository: FILE
- Thu Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Thu Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Thu Mar :: - [info] The oldest binary log file/position on all slaves is mysql-bin.:
- Thu Mar :: - [info] Retrieved Gtid Set: 90b30799--11e7--000c29c1025c:-
- Thu Mar :: - [info] Oldest slaves:
- Thu Mar :: - [info] 192.168.85.134(192.168.85.134:) Version=5.7.-log (oldest major version between slaves) log-bin:enabled
- Thu Mar :: - [info] GTID ON
- Thu Mar :: - [debug] Relay log info repository: FILE
- Thu Mar :: - [info] Replicating from 192.168.85.132(192.168.85.132:)
- Thu Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
- Thu Mar :: - [info]
- ==================== 3.3. Elect the new master ====================
- Thu Mar :: - [info] * Phase 3.3: Determining New Master Phase..
- Thu Mar :: - [info]
- Thu Mar :: - [info] 192.168.85.134 can be new master.
- Thu Mar :: - [info] New master is 192.168.85.134(192.168.85.134:)
- Thu Mar :: - [info] Starting master failover..
- Thu Mar :: - [info]
- From:
- 192.168.85.132(192.168.85.132:) (current master)
- +--192.168.85.133(192.168.85.133:)
- +--192.168.85.134(192.168.85.134:)
- To:
- 192.168.85.134(192.168.85.134:) (new master)
- +--192.168.85.133(192.168.85.133:)
- ******************** Confirm whether to switch ********************
- Starting master switch from 192.168.85.132(192.168.85.132:) to 192.168.85.134(192.168.85.134:)? (yes/NO): yes
- Thu Mar :: - [info] New master decided manually is 192.168.85.134(192.168.85.134:)
- Thu Mar :: - [info]
- Thu Mar :: - [info] * Phase 3.3: New Master Recovery Phase..
- Thu Mar :: - [info]
- ******************** Wait for the new master to finish applying its own relay log ********************
- Thu Mar :: - [info] Waiting all logs to be applied..
- Thu Mar :: - [info] done.
- Thu Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] Replicating from the latest slave 192.168.85.133(192.168.85.133:) and waiting to apply..
- ******************** Wait for the latest slave to finish applying its own relay log ********************
- Thu Mar :: - [info] Waiting all logs to be applied on the latest slave..
- ******************** Point the new master at the latest slave (CHANGE MASTER) to fill in the differential data ********************
- Thu Mar :: - [info] Resetting slave 192.168.85.134(192.168.85.134:) and starting replication from the new master 192.168.85.133(192.168.85.133:)..
- Thu Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] Executed CHANGE MASTER.
- Thu Mar :: - [debug] Starting slave IO/SQL thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] Slave started.
- Thu Mar :: - [info] Waiting to execute all relay logs on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [info] master_pos_wait(mysql-bin.:) completed on 192.168.85.134(192.168.85.134:). Executed events.
- Thu Mar :: - [info] done.
- Thu Mar :: - [debug] Stopping SQL thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] done.
- Thu Mar :: - [info] -- Saving binlog from host 192.168.85.132 started, pid:
- Thu Mar :: - [info]
- Thu Mar :: - [info] Log messages from 192.168.85.132 ...
- Thu Mar :: - [info]
- ******************** Runs on the failed master / binlog server; takes only the part after the latest slave's position ********************
- Thu Mar :: - [info] Fetching binary logs from binlog server 192.168.85.132..
- Thu Mar :: - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin. --start_pos= --output_file=/var/log/masterha/app1/saved_binlog_binlog1_20180329150032.binlog --handle_raw_binlog= --skip_filter= --disable_log_bin= --manager_version=0.56 --oldest_version=5.7.-log --debug --binlog_dir=/data/mysql/mysql3307/logs
- Creating /var/log/masterha/app1 if not exists.. ok.
- Concat binary/relay logs from mysql-bin. pos to mysql-bin. EOF into /var/log/masterha/app1/saved_binlog_binlog1_20180329150032.binlog ..
- Executing command: mysqlbinlog --start-position= /data/mysql/mysql3307/logs/mysql-bin. >> /var/log/masterha/app1/saved_binlog_binlog1_20180329150032.binlog
- Concat succeeded.
- ******************** scp the resulting binlog to the working directory of the host running the manual failover ********************
- Thu Mar :: - [info] scp from root@192.168.85.132:/var/log/masterha/app1/saved_binlog_binlog1_20180329150032.binlog to local:/var/log/masterha/app1/saved_binlog_192.168.85.132_binlog1_20180329150032.binlog succeeded.
- Thu Mar :: - [info] End of log messages from 192.168.85.132.
- Thu Mar :: - [info] Saved mysqlbinlog size from 192.168.85.132 is bytes.
- Thu Mar :: - [info] Applying differential binlog /var/log/masterha/app1/saved_binlog_192.168.85.132_binlog1_20180329150032.binlog ..
- Thu Mar :: - [info] Differential log apply from binlog server succeeded.
- ******************** The new master has applied the binlog; read its current position ********************
- Thu Mar :: - [info] Getting new master's binlog name and position..
- Thu Mar :: - [info] mysql-bin.:
- Thu Mar :: - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.85.134', MASTER_PORT=, MASTER_AUTO_POSITION=, MASTER_USER='repl', MASTER_PASSWORD='xxx';
- Thu Mar :: - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin., , 90b30799--11e7--000c29c1025c:-
- ******************** Bring up the virtual IP so the new master can serve clients ********************
- Thu Mar :: - [info] Executing master IP activate script:
- Thu Mar :: - [info] /etc/masterha/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.85.132 --orig_master_ip=192.168.85.132 --orig_master_port= --new_master_host=192.168.85.134 --new_master_ip=192.168.85.134 --new_master_port= --new_master_user='mydba' --new_master_password='mysql5721'
- Set read_only= on the new master.
- RTNETLINK answers: Cannot assign requested address
- RTNETLINK answers: File exists
- Thu Mar :: - [info] OK.
- Thu Mar :: - [info] ** Finished master recovery successfully.
- Thu Mar :: - [info] * Phase : Master Recovery Phase completed.
- ==================== 3. New master recovery phase, End ====================
- Thu Mar :: - [info]
- ==================== 4. Slave recovery phase, Start ====================
- Thu Mar :: - [info] * Phase : Slaves Recovery Phase..
- Thu Mar :: - [info]
- Thu Mar :: - [info]
- ==================== 4.1. Slaves simply CHANGE MASTER TO the new master ====================
- Thu Mar :: - [info] * Phase 4.1: Starting Slaves in parallel..
- Thu Mar :: - [info]
- Thu Mar :: - [info] -- Slave recovery on host 192.168.85.133(192.168.85.133:) started, pid: . Check tmp log /var/log/masterha/app1/192.168..133_3307_20180329150032.log if it takes time..
- Thu Mar :: - [info]
- Thu Mar :: - [info] Log messages from 192.168.85.133 ...
- Thu Mar :: - [info]
- Thu Mar :: - [info] Resetting slave 192.168.85.133(192.168.85.133:) and starting replication from the new master 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.133(192.168.85.133:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] Executed CHANGE MASTER.
- Thu Mar :: - [debug] Starting slave IO/SQL thread on 192.168.85.133(192.168.85.133:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [info] Slave started.
- Thu Mar :: - [info] gtid_wait(90b30799--11e7--000c29c1025c:-) completed on 192.168.85.133(192.168.85.133:). Executed events.
- Thu Mar :: - [info] End of log messages from 192.168.85.133.
- Thu Mar :: - [info] -- Slave on host 192.168.85.133(192.168.85.133:) started.
- Thu Mar :: - [info] All new slave servers recovered successfully.
- ==================== 4. Slave recovery phase, End ====================
- Thu Mar :: - [info]
- ==================== 5. New master cleanup phase, Start ====================
- Thu Mar :: - [info] * Phase : New master cleanup phase..
- Thu Mar :: - [info]
- Thu Mar :: - [info] Resetting slave info on the new master..
- Thu Mar :: - [debug] Clearing slave info..
- Thu Mar :: - [debug] Stopping slave IO/SQL thread on 192.168.85.134(192.168.85.134:)..
- Thu Mar :: - [debug] done.
- Thu Mar :: - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
- Thu Mar :: - [info] 192.168.85.134: Resetting slave info succeeded.
- ==================== 5. New master cleanup phase, End ====================
- Thu Mar :: - [info] Master failover to 192.168.85.134(192.168.85.134:) completed successfully.
- Thu Mar :: - [debug] Disconnected from 192.168.85.133(192.168.85.133:)
- Thu Mar :: - [debug] Disconnected from 192.168.85.134(192.168.85.134:)
- Thu Mar :: - [info]
- ----- Failover Report -----
- app1: MySQL Master failover 192.168.85.132(192.168.85.132:) to 192.168.85.134(192.168.85.134:) succeeded
- Master 192.168.85.132(192.168.85.132:) is down!
- Check MHA Manager logs at ZST1 for details.
- Started manual(interactive) failover.
- Invalidated master IP address on 192.168.85.132(192.168.85.132:)
- Selected 192.168.85.134(192.168.85.134:) as a new master.
- 192.168.85.134(192.168.85.134:): OK: Applying all logs succeeded.
- 192.168.85.134(192.168.85.134:): OK: Activated master IP address.
- 192.168.85.133(192.168.85.133:): OK: Slave started, replicating from 192.168.85.134(192.168.85.134:)
- 192.168.85.134(192.168.85.134:): Resetting slave info succeeded.
- Master failover to 192.168.85.134(192.168.85.134:) completed successfully.
- [root@ZST1 masterha]#
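Manual failover procedure
- Manual failover (GTID)
- 1. Configuration check: connect to each instance, check service status, and verify the master-slave topology
- 2. Dead master shutdown: stop the IO thread on every slave and remove the virtual IP from the failed master (stopssh)
- 3. New master recovery
- 3.1. Get the latest slave
- It is used to fill in the data missing on the new master, and it gives the start position for saving the failed master's binlog
- 3.2. Elect the new master
- Decide the new master and show the topology before and after the switch
- 3.3. New master recovery
- 3.3.1. Close the gap between the new master and the latest slave
- Wait for the new master to finish applying its own relay log; wait for the latest slave to finish applying its own relay log; point the new master at the latest slave with CHANGE MASTER to fill in the differential data
- 3.3.2. Close the gap between the new master and the failed master
- Run save_binary_logs on the failed master / binlog server; scp the resulting binlog to the working directory of the host running the manual failover; the new master applies the binlog and its current position is recorded; bind the virtual IP so the new master can serve clients
- 4. Recovery of the other slaves
- 4.1. Reset replication: RESET SLAVE; CHANGE MASTER TO the new master;
- 4.2. If there are several slaves, repeat the step above
- 5. New master cleanup: clear the old replication info with STOP SLAVE; RESET SLAVE ALL;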
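Steps 4.1 and 5 come down to a handful of SQL statements. The sketch below only builds the statement text so the sequence is easy to read; the function names are hypothetical, and `MASTER_AUTO_POSITION=1` is filled in as the standard MySQL value for GTID auto-positioning because the logged statement lost its digits. Credentials are interpolated without escaping, so treat it as an illustration rather than something to run against production.

```python
from typing import List


def change_master_sql(host: str, port: int, user: str, password: str) -> str:
    """CHANGE MASTER statement using GTID auto-positioning (no file/pos needed)."""
    return (
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_PORT={port}, "
        f"MASTER_AUTO_POSITION=1, MASTER_USER='{user}', MASTER_PASSWORD='{password}';"
    )


def slave_recovery_sql(new_master: str, port: int, user: str, password: str) -> List[str]:
    """Step 4.1: repoint one surviving slave at the new master."""
    return [
        "STOP SLAVE;",
        "RESET SLAVE;",
        change_master_sql(new_master, port, user, password),
        "START SLAVE;",
    ]


def new_master_cleanup_sql() -> List[str]:
    """Step 5: drop the new master's own replication settings."""
    return ["STOP SLAVE;", "RESET SLAVE ALL;"]


if __name__ == "__main__":
    for stmt in slave_recovery_sql("192.168.85.134", 3307, "repl", "xxx"):
        print(stmt)
    for stmt in new_master_cleanup_sql():
        print(stmt)
```

Because the slave asks for transactions by GTID, no binlog file or position has to be computed for it, which is why the GTID flow has no counterpart to steps 4.1 and 4.2 of the traditional flow.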
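3.3. Differences between traditional and GTID manual failover
To get a detailed switchover log, it is recommended to:
• Set log_level=debug in the MHA configuration file
• Simulate data differences between Node1, Node2 and Node3
• Try both Node2 and Node3 as the new master
For manual failover under GTID, it also helps to enable the general log, so you can see how the new master catches up with the latest slave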
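Aspect | Traditional | GTID |
Is missing data filled in? | As long as the master host is still up, all data is filled in by default | The master / binlog server must be listed under [binlog*] in the configuration file for the differential log on the dead master to be recovered; otherwise only events up to the latest slave are applied |
How the data is filled in | The new master and the other slaves pull the latest slave's relay log. (1) Each of them generates the relay-log difference against the latest slave and applies it (file relay_from_read_to_latest_**). (2) Each of them then applies the differential binlog between the latest slave and the dead master (file saved_master_binlog_from_**) | The new master pulls the latest slave's binlog. (1) The new master runs CHANGE MASTER TO the latest slave to fill in the gap between them. (2) Once it has caught up, save_binary_logs generates the differential binlog against the dead master, which is then applied (file saved_binlog_binlog1_**). (3) The other slaves apply no differential logs; they simply CHANGE MASTER TO the new master |
Files produced | relay_from_read_to_latest_**: differential relay log between the latest slave and another slave, generated on the latest slave and copied to that slave. saved_master_binlog_from_**: differential binlog between the failed master and the latest slave, generated on the failed master, copied first to the working directory of the host running the manual failover and then to the other slaves. These files can be decoded with mysqlbinlog | saved_binlog_binlog1_** (the name shown in the GTID log above): differential binlog between the failed master and the latest slave, generated on the failed master / binlog server and copied to the working directory of the host running the manual failover. In this test the file could not be decoded with mysqlbinlog; that may be a usage mistake, although the commands that produce the two kinds of file do differ slightly (in the log above this file is written by redirecting mysqlbinlog output, so it is probably already decoded text) |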
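In a GTID environment, save_binary_logs is used only for the dead master's data (the master is down, so CHANGE MASTER is not possible); everything else is filled in by CHANGE MASTER TO and the replication threads. It also no longer depends on the latest slave's relay log
Overall, MHA feels somewhat heavyweight under GTID; if you are comfortable doing so, you can script the steps yourself:
Determine Latest_Slave -> New_Master: CHANGE MASTER TO Latest_Slave -> mysqlbinlog ./binlogserver/binlog --start-position > New_Master -> Other_Slave: CHANGE MASTER TO New_Master
With lossless (enhanced) semi-synchronous replication, the dead master's binlog is almost always fully delivered to the latest slave, which makes failover simpler still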
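The do-it-yourself flow outlined above can be written down as an ordered plan. The sketch below only returns the steps as text and executes nothing; the function name, the step wording, and the default port are assumptions for illustration, and client credentials are omitted for brevity.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (host the step targets, what to run there)


def diy_gtid_failover_plan(
    latest_slave: str,
    new_master: str,
    other_slaves: List[str],
    binlog_path: str,
    start_position: int,
    port: int = 3307,
) -> List[Step]:
    """Ordered steps for a hand-rolled GTID failover (a plan only, nothing is run)."""
    steps: List[Step] = []
    if new_master != latest_slave:
        # Catch up from the latest slave through normal replication.
        steps.append((
            new_master,
            f"CHANGE MASTER TO MASTER_HOST='{latest_slave}', MASTER_PORT={port}, "
            "MASTER_AUTO_POSITION=1; START SLAVE;",
        ))
        steps.append((new_master, "wait until all events are applied, then STOP SLAVE;"))
    # Replay what only the dead master (or a binlog server) still holds.
    steps.append((
        new_master,
        f"mysqlbinlog --start-position={start_position} {binlog_path} "
        f"| mysql -h {new_master} -P {port}",
    ))
    for slave in other_slaves:
        steps.append((
            slave,
            f"CHANGE MASTER TO MASTER_HOST='{new_master}', MASTER_PORT={port}, "
            "MASTER_AUTO_POSITION=1; START SLAVE;",
        ))
    steps.append((new_master, "STOP SLAVE; RESET SLAVE ALL;"))
    return steps


if __name__ == "__main__":
    plan = diy_gtid_failover_plan(
        "192.168.85.133", "192.168.85.134", ["192.168.85.133"],
        "/data/mysql/mysql3307/logs/mysql-bin.000002", 1200,  # hypothetical file/pos
    )
    for host, action in plan:
        print(host, "=>", action)
```

When the new master is itself the latest slave, the catch-up steps drop out, and with lossless semi-sync the mysqlbinlog replay would usually have nothing left to apply.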