Redis HA (Sentinel) Setup

Server layout: this setup is for testing only, so all three Redis instances (one master, two slaves) run on a single server.

Role	Port	Redis config file	Sentinel config file	Sentinel port	Redis log path	Sentinel log path
Master	6379	redis.conf	sentinel.conf	26379	/home/zhangxs/data/redislog/redis_server/master.log	/home/zhangxs/data/redislog/sentinel/sentinel6379.log
Slave	6380	redis_slave6380.conf	Sentinel6380.conf	26380	/home/zhangxs/data/redislog/redis_server/slave6380.log	/home/zhangxs/data/redislog/sentinel/sentinel6380.log
Slave	6381	redis_slave6381.conf	Sentinel6381.conf	26381	/home/zhangxs/data/redislog/redis_server/slave6381.log	/home/zhangxs/data/redislog/sentinel/sentinel6381.log

Modify the configuration files

1: redis.conf

Set the Redis server log path: logfile "/home/zhangxs/data/redislog/redis_server/master.log"

Nothing else was changed; the defaults are used.

2: redis_slave6380.conf and redis_slave6381.conf (copied from redis_slave.conf)

Point both files at the master's IP and port: slaveof 127.0.0.1 6379 (set in both files)
Change the port in each file: port 6380 in redis_slave6380.conf; port 6381 in redis_slave6381.conf
Set the slave log paths: logfile /home/zhangxs/data/redislog/redis_server/slave6380.log and logfile /home/zhangxs/data/redislog/redis_server/slave6381.log

3: sentinel.conf

Set the log path: logfile "/home/zhangxs/data/redislog/sentinel/sentinel6379.log"

4: Sentinel6380.conf and Sentinel6381.conf (copied from sentinel.conf)

Change the port: 26380 in Sentinel6380.conf; 26381 in Sentinel6381.conf
Set the log paths: logfile "/home/zhangxs/data/redislog/sentinel/sentinel6380.log" and logfile "/home/zhangxs/data/redislog/sentinel/sentinel6381.log"
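
The per-instance overrides listed above can be summarized in a short script. This is only a sketch: the helper names replica_overrides and sentinel_overrides are hypothetical, and the script prints just the directives that differ from the copied base file instead of writing complete config files.

```python
# Sketch: list the directives each copied config file has to override.
# Helper names are illustrative; paths and ports follow the table above.
LOG_ROOT = "/home/zhangxs/data/redislog"


def replica_overrides(port, master_host="127.0.0.1", master_port=6379):
    """Directives that differ from the master's config for a slave on `port`."""
    return [
        f"port {port}",
        f"slaveof {master_host} {master_port}",
        f'logfile "{LOG_ROOT}/redis_server/slave{port}.log"',
    ]


def sentinel_overrides(redis_port):
    """Directives that differ from sentinel.conf (sentinel port = 20000 + Redis port)."""
    return [
        f"port {20000 + redis_port}",
        f'logfile "{LOG_ROOT}/sentinel/sentinel{redis_port}.log"',
    ]


if __name__ == "__main__":
    for p in (6380, 6381):
        print(f"# redis_slave{p}.conf")
        print("\n".join(replica_overrides(p)))
        print(f"# Sentinel{p}.conf")
        print("\n".join(sentinel_overrides(p)))
```

Keeping the Redis port and the sentinel port in one place makes it harder to mix them up when editing the copies by hand.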

With the configuration in place, start the Redis services.

1: Start the master

src/redis-server redis.conf&

2: Start the slave (slave6380)

src/redis-server redis_slave6380.conf &

The master log now shows a new section:

2755:M 29 Jul 00:13:25.469 * Starting BGSAVE for SYNC with target: disk

2755:M 29 Jul 00:13:25.502 * Background saving started by pid 2784

2784:C 29 Jul 00:13:25.608 * DB saved on disk

2784:C 29 Jul 00:13:25.608 * RDB: 6 MB of memory used by copy-on-write

2755:M 29 Jul 00:13:25.677 * Background saving terminated with success

2755:M 29 Jul 00:13:25.677 * Synchronization with slave 127.0.0.1:6380 succeeded   //sync to slave 127.0.0.1:6380 succeeded

Check the slave log slave6380.log

2780:S 29 Jul 00:13:25.436 * DB loaded from disk: 0.007 seconds

2780:S 29 Jul 00:13:25.436 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.

2780:S 29 Jul 00:13:25.436 * Ready to accept connections

2780:S 29 Jul 00:13:25.436 * Connecting to MASTER 127.0.0.1:6379 //connecting to the master

2780:S 29 Jul 00:13:25.436 * MASTER <-> SLAVE sync started //master-to-slave sync started

2780:S 29 Jul 00:13:25.436 * Non blocking connect for SYNC fired the event.

2780:S 29 Jul 00:13:25.436 * Master replied to PING, replication can continue...

2780:S 29 Jul 00:13:25.436 * Trying a partial resynchronization (request a26696cff5d35f38896c4eb068f71adbb7cfc421:474104).

2780:S 29 Jul 00:13:25.577 * Full resync from master: 541cd938f43b4f144e647881af409fa1884ea5a4:0 //full resync from the master

2780:S 29 Jul 00:13:25.577 * Discarding previously cached master state.//discard the previously cached master state

2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: receiving 250 bytes from master //the slave receives 250 bytes from the master

2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: Flushing old data

2780:S 29 Jul 00:13:25.677 * MASTER <-> SLAVE sync: Loading DB in memory

2780:S 29 Jul 00:13:25.678 * MASTER <-> SLAVE sync: Finished with success

3: Start the slave (slave6381)

src/redis-server redis_slave6381.conf &

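The master log now shows a new section:

//slave 127.0.0.1:6381 requests synchronization
2755:M 29 Jul 00:26:46.531 * Slave 127.0.0.1:6381 asks for synchronization

//partial resynchronization request from 127.0.0.1:6381 accepted; sending 1106 bytes of backlog starting from offset 1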

2755:M 29 Jul 00:26:46.531 * Partial resynchronization request from 127.0.0.1:6381 accepted. Sending 1106 bytes of backlog starting from offset 1.

Check the slave log slave6381.log

2809:S 29 Jul 00:26:46.531 * DB loaded from disk: 0.000 seconds

2809:S 29 Jul 00:26:46.531 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.

2809:S 29 Jul 00:26:46.531 * Ready to accept connections

2809:S 29 Jul 00:26:46.531 * Connecting to MASTER 127.0.0.1:6379

2809:S 29 Jul 00:26:46.531 * MASTER <-> SLAVE sync started

2809:S 29 Jul 00:26:46.531 * Non blocking connect for SYNC fired the event.

2809:S 29 Jul 00:26:46.531 * Master replied to PING, replication can continue...

2809:S 29 Jul 00:26:46.531 * Trying a partial resynchronization (request 541cd938f43b4f144e647881af409fa1884ea5a4:1).

2809:S 29 Jul 00:26:46.531 * Successful partial resynchronization with master.

2809:S 29 Jul 00:26:46.532 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization

4: Test data synchronization (connect to the servers with redis-cli)

1: Connect to the master

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6379

127.0.0.1:6379> set name fj

2: Connect to slave6380

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6380

127.0.0.1:6380> get name

"fj"

127.0.0.1:6380>

3: Connect to slave6381

[root@vm1 src]# redis-cli -h 127.0.0.1 -p 6381

127.0.0.1:6381> get name

"fj"

127.0.0.1:6381>

OK, master-slave replication works.

By default, slaves reject writes such as SET. A quick test:

127.0.0.1:6380> set name hello

(error) READONLY You can't write against a read only slave.

127.0.0.1:6380>

127.0.0.1:6381> set name hello

(error) READONLY You can't write against a read only slave.

127.0.0.1:6381>

Start a sentinel for each service

Start sentinel6379

src/redis-sentinel sentinel.conf &

Check sentinel6379.log

2908:X 29 Jul 01:01:32.838 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo

2908:X 29 Jul 01:01:32.839 # Redis version=4.0.10, bits=64, commit=00000000, modified=0, pid=2908, just started

2908:X 29 Jul 01:01:32.839 # Configuration loaded

2908:X 29 Jul 01:01:32.839 * Increased maximum number of open files to 10032 (it was originally set to 1024).

2908:X 29 Jul 01:01:32.840 * Running mode=sentinel, port=26379.

2908:X 29 Jul 01:01:32.840 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

2908:X 29 Jul 01:01:32.855 # Sentinel ID is 1a77392638e41bb0ea0a865ffc93b8de6335227f

2908:X 29 Jul 01:01:32.855 # +monitor master mymaster 127.0.0.1 6379 quorum 2

//a new slave has been detected and attached by the sentinel

2908:X 29 Jul 01:01:32.856 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

2908:X 29 Jul 01:01:32.858 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

Start sentinel6380

 src/redis-sentinel sentinel6380.conf &

Check sentinel6380.log

2937:X 29 Jul 01:08:14.325 # Redis version=4.0.10, bits=64, commit=00000000, modified=0, pid=2937, just started

2937:X 29 Jul 01:08:14.325 # Configuration loaded

2937:X 29 Jul 01:08:14.327 * Increased maximum number of open files to 10032 (it was originally set to 1024).

2937:X 29 Jul 01:08:14.377 * Running mode=sentinel, port=26380.

2937:X 29 Jul 01:08:14.377 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

2937:X 29 Jul 01:08:14.379 # Sentinel ID is 4a6aebffdd1301bf054e722c34e8a6611418ba8a

2937:X 29 Jul 01:08:14.379 # +monitor master mymaster 127.0.0.1 6379 quorum 2

2937:X 29 Jul 01:08:14.380 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

2937:X 29 Jul 01:08:14.381 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

//a new sentinel has been detected and attached

2937:X 29 Jul 01:08:14.919 * +sentinel sentinel 1a77392638e41bb0ea0a865ffc93b8de6335227f 127.0.0.1 26379 @ mymaster 127.0.0.1 6379

After sentinel6380 starts, a new entry appears in sentinel6379.log:

//a new sentinel has been detected and attached

2908:X 29 Jul 01:08:16.367 * +sentinel sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

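A newly started sentinel automatically discovers the other sentinels that monitor the same master through Redis Pub/Sub. It does so by publishing messages to the channel __sentinel__:hello.

Start sentinel6381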

[root@vm1 redis-4.0.10]# src/redis-sentinel sentinel6381.conf &

Check sentinel6381.log

2961:X 29 Jul 01:11:09.823 # Configuration loaded

2961:X 29 Jul 01:11:09.823 * Increased maximum number of open files to 10032 (it was originally set to 1024).

2961:X 29 Jul 01:11:09.852 * Running mode=sentinel, port=26381.

2961:X 29 Jul 01:11:09.852 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

2961:X 29 Jul 01:11:09.853 # Sentinel ID is 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030

2961:X 29 Jul 01:11:09.853 # +monitor master mymaster 127.0.0.1 6379 quorum 2

2961:X 29 Jul 01:11:09.853 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

2961:X 29 Jul 01:11:09.855 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

2961:X 29 Jul 01:11:10.334 * +sentinel sentinel 1a77392638e41bb0ea0a865ffc93b8de6335227f 127.0.0.1 26379 @ mymaster 127.0.0.1 6379

2961:X 29 Jul 01:11:11.446 * +sentinel sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

When sentinel6381 joins, sentinel6379 and sentinel6380 are both notified (a new sentinel has been detected and attached):

2908:X 29 Jul 01:11:11.880 * +sentinel sentinel 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 127.0.0.1 26381 @ mymaster 127.0.0.1 6379

2937:X 29 Jul 01:11:11.878 * +sentinel sentinel 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 127.0.0.1 26381 @ mymaster 127.0.0.1 6379

Each sentinel records its state in its own config file so that it can restore that state after a restart. The changed parts of each sentinel config file are shown here.

sentinel.conf

Before startup: none
After startup: sentinel myid 1a77392638e41bb0ea0a865ffc93b8de6335227f //this sentinel's own myid

Before startup: none
After startup:

# Generated by CONFIG REWRITE

#the two slaves of the master

sentinel known-slave mymaster 127.0.0.1 6380

sentinel known-slave mymaster 127.0.0.1 6381

#the other two sentinels monitoring the master

sentinel known-sentinel mymaster 127.0.0.1 26380 4a6aebffdd1301bf054e722c34e8a6611418ba8a

sentinel known-sentinel mymaster 127.0.0.1 26381 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030

sentinel current-epoch 0

The changes to Sentinel6380.conf and Sentinel6381.conf are essentially the same as in sentinel.conf. They differ only in the sentinel's own myid and in which two other sentinels are listed.
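
To compare the persisted state across the three sentinel config files, the rewritten lines can be parsed with a few lines of Python. This is a minimal sketch under stated assumptions: parse_sentinel_state is a hypothetical helper, it only understands the directives shown above, and newer Redis releases may write known-replica instead of known-slave.

```python
# Sketch: extract the state a sentinel persists in its config file.
# Only the directives shown in this article are handled.
SAMPLE = """\
sentinel myid 1a77392638e41bb0ea0a865ffc93b8de6335227f
# Generated by CONFIG REWRITE
sentinel known-slave mymaster 127.0.0.1 6380
sentinel known-slave mymaster 127.0.0.1 6381
sentinel known-sentinel mymaster 127.0.0.1 26380 4a6aebffdd1301bf054e722c34e8a6611418ba8a
sentinel known-sentinel mymaster 127.0.0.1 26381 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030
sentinel current-epoch 0
"""


def parse_sentinel_state(text):
    """Return myid, known slaves, known sentinels, and current epoch."""
    state = {"myid": None, "slaves": [], "sentinels": [], "current_epoch": None}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "sentinel":
            continue  # skip comments, blank lines, unrelated directives
        key = parts[1]
        if key == "myid":
            state["myid"] = parts[2]
        elif key == "known-slave" and len(parts) >= 5:
            state["slaves"].append((parts[3], int(parts[4])))
        elif key == "known-sentinel" and len(parts) >= 6:
            state["sentinels"].append((parts[3], int(parts[4]), parts[5]))
        elif key == "current-epoch":
            state["current_epoch"] = int(parts[2])
    return state


if __name__ == "__main__":
    print(parse_sentinel_state(SAMPLE))
```

Running it against each of the three files should show the same two slaves and a different pair of peer sentinels in each.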

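Test failover

For sentinel failover I used the default configuration (no further setup is needed, though the values can be customized).

//at least two sentinels must agree that the master has failed before a failover is performed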
sentinel monitor mymaster 127.0.0.1 6379 2

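//if a sentinel receives no valid reply from the master for this many milliseconds, it marks the master as subjectively down (sdown); the timestamps in the failover logs below correspond to 30000 (30 seconds), the value in the stock sentinel.conf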

sentinel down-after-milliseconds mymaster 60000

sentinel failover-timeout mymaster 180000
//during a failover, only one slave at a time may resync with the new master
sentinel parallel-syncs mymaster 1 
sentinel monitor resque 192.168.1.3 6380 4 
sentinel down-after-milliseconds resque 10000 
sentinel failover-timeout resque 180000 
sentinel parallel-syncs resque 5
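
Two different thresholds are involved in a failover, and they are easy to confuse. The sketch below illustrates them with hypothetical function names: the quorum from sentinel monitor decides when the master is flagged as objectively down, while the sentinel that leads the failover also needs votes from a majority of all known sentinels.

```python
# Sketch of the two thresholds involved in a Sentinel failover.
# Function names are illustrative and not part of Redis.


def is_objectively_down(sdown_reports, quorum):
    """odown: at least `quorum` sentinels consider the master unreachable."""
    return sdown_reports >= quorum


def can_lead_failover(votes, total_sentinels, quorum):
    """The leader needs a majority of all sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)


if __name__ == "__main__":
    # Values from this setup: 3 sentinels, quorum 2.
    print(is_objectively_down(2, 2))
    print(can_lead_failover(3, 3, 2))
    print(can_lead_failover(1, 3, 2))
```

With three sentinels and a quorum of 2, a single surviving sentinel can neither flag odown nor lead a failover, which is why at least three sentinels are normally deployed.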

1: List the Redis-related processes

root       2755   2551  0 00:11 pts/2    00:00:06 src/redis-server 127.0.0.1:6379

root       2780   2551  0 00:13 pts/2    00:00:06 src/redis-server 127.0.0.1:6380

root       2809   2551  0 00:26 pts/2    00:00:05 src/redis-server 127.0.0.1:6381

root       2816   2529  0 00:30 pts/1    00:00:00 redis-cli -h 127.0.0.1 -p 6379

root       2822   2530  0 00:33 pts/0    00:00:00 redis-cli -h 127.0.0.1 -p 6380

root       2841   2823  0 00:34 pts/6    00:00:00 redis-cli -h 127.0.0.1 -p 6381

root       2908   2551  0 01:01 pts/2    00:00:07 src/redis-sentinel *:26379 [sentinel]

root       2937   2551  0 01:08 pts/2    00:00:06 src/redis-sentinel *:26380 [sentinel]

root       2961   2551  0 01:11 pts/2    00:00:06 src/redis-sentinel *:26381 [sentinel]

root       3000   2551  0 01:50 pts/2    00:00:00 grep --color=auto redis

2: Check the overall replication status

127.0.0.1:6379> info

# Server

# Clients

# Memory

# Persistence

# Stats

# Replication

role:master

connected_slaves:2

slave0:ip=127.0.0.1,port=6380,state=online,offset=544591,lag=0

slave1:ip=127.0.0.1,port=6381,state=online,offset=544591,lag=0

master_replid:541cd938f43b4f144e647881af409fa1884ea5a4

master_replid2:0000000000000000000000000000000000000000

master_repl_offset:544857

second_repl_offset:-1

repl_backlog_active:1

repl_backlog_size:1048576

repl_backlog_first_byte_offset:1

repl_backlog_histlen:544857

# CPU

# Cluster

# Keyspace

I removed everything except the Replication section; the rest can be viewed by running the info command in redis-cli.

As shown, 6379 has the master role, with two slaves on port=6380 and port=6381.
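
When checking replication status repeatedly, it can help to parse the Replication section instead of reading it by eye. The following is a minimal sketch that works on the text shown above; parse_replication and max_lag_bytes are hypothetical helpers, and the sample is a shortened copy of the output from this setup.

```python
# Sketch: turn the "# Replication" section of INFO into a dict.
SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=544591,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=544591,lag=0
master_repl_offset:544857
"""


def parse_replication(text):
    """Parse key:value lines; slaveN lines become dicts under 'slaves'."""
    info, slaves = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        if key.startswith("slave") and key[5:].isdigit():
            # slaveN lines hold comma-separated key=value pairs
            slaves.append(dict(pair.split("=", 1) for pair in value.split(",")))
        else:
            info[key] = value
    info["slaves"] = slaves
    return info


def max_lag_bytes(info):
    """Largest gap between the master offset and any slave's reported offset."""
    master = int(info["master_repl_offset"])
    return max((master - int(s["offset"]) for s in info["slaves"]), default=0)


if __name__ == "__main__":
    info = parse_replication(SAMPLE)
    print(info["role"], [s["port"] for s in info["slaves"]], max_lag_bytes(info))
```

A small byte gap like the one in this sample is normal, since the slaves report their offsets asynchronously.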

3: Kill the master and watch the logs

kill -9 2755

The master was killed, so master.log has no new entries. Look at the logs of the two slaves instead (excerpts).

slave6380.log

///////////////////////////before sdown (about 30 seconds)

2780:S 29 Jul 01:58:53.414 # Connection with master lost.

2780:S 29 Jul 01:58:53.414 * Caching the disconnected master state.

2780:S 29 Jul 01:58:54.163 * Connecting to MASTER 127.0.0.1:6379

2780:S 29 Jul 01:58:54.163 * MASTER <-> SLAVE sync started

2780:S 29 Jul 01:58:54.164 # Error condition on socket for SYNC: Connection refused

2780:S 29 Jul 01:58:55.168 * Connecting to MASTER 127.0.0.1:6379

2780:S 29 Jul 01:58:55.169 * MASTER <-> SLAVE sync started

....

...

...

2780:S 29 Jul 01:59:22.381 # Error condition on socket for SYNC: Connection refused

2780:S 29 Jul 01:59:23.389 * Connecting to MASTER 127.0.0.1:6379

2780:S 29 Jul 01:59:23.389 * MASTER <-> SLAVE sync started

2780:S 29 Jul 01:59:23.389 # Error condition on socket for SYNC: Connection refused

///////////////////////////after sdown

2780:S 29 Jul 01:59:24.321 * SLAVE OF 127.0.0.1:6381 enabled (user request from 'id=8 addr=127.0.0.1:52556 fd=11 name=sentinel-4a6aebff-cmd age=3070 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=133 qbuf-free=32635 obl=36 oll=0 omem=0 events=r cmd=exec')

2780:S 29 Jul 01:59:24.323 # CONFIG REWRITE executed with suc

2780:S 29 Jul 01:59:24.399 * Connecting to MASTER 127.0.0.1:6381

2780:S 29 Jul 01:59:24.399 * MASTER <-> SLAVE sync started

2780:S 29 Jul 01:59:24.399 * Non blocking connect for SYNC fired the event.

2780:S 29 Jul 01:59:24.399 * Master replied to PING, replication can continue...

2780:S 29 Jul 01:59:24.399 * Trying a partial resynchronization (request 541cd938f43b4f144e647881af409fa1884ea5a4:617714).

2780:S 29 Jul 01:59:24.400 * Successful partial resynchronization with master.

2780:S 29 Jul 01:59:24.400 # Master replication ID changed to 514edab0972b4b6e5388edc4f14fbdb4d223d39e

2780:S 29 Jul 01:59:24.400 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.

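From 01:58:53.414 to 01:59:23.389 (about 30 seconds) the slave keeps trying to reconnect to the master. Once the master has been unreachable for the whole down-after-milliseconds period, the sentinels mark it as subjectively down, as the logs show.

sentinel6379.log

//master judged subjectively down (sdown)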

2908:X 29 Jul 01:59:23.492 # +sdown master mymaster 127.0.0.1 6379

//the current epoch has been updated

2908:X 29 Jul 01:59:23.546 # +new-epoch 1

//votes for sentinel6380 to lead this failover

2908:X 29 Jul 01:59:23.549 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1

//master judged objectively down (odown): 3 sentinels agree, the quorum is 2

2908:X 29 Jul 01:59:23.569 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2

2908:X 29 Jul 01:59:23.569 # Next failover delay: I will not start a failover before Sun Jul 29 02:05:23 2018

2908:X 29 Jul 01:59:24.328 # +config-update-from sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

2908:X 29 Jul 01:59:24.328 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

2908:X 29 Jul 01:59:24.328 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381

2908:X 29 Jul 01:59:24.328 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

2908:X 29 Jul 01:59:54.333 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6380.log

//master judged subjectively down (sdown)

2937:X 29 Jul 01:59:23.459 # +sdown master mymaster 127.0.0.1 6379

//master judged objectively down (odown): 2 sentinels agree, the quorum is 2

2937:X 29 Jul 01:59:23.536 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2

2937:X 29 Jul 01:59:23.537 # +new-epoch 1

//attempting to fail over the master

2937:X 29 Jul 01:59:23.537 # +try-failover master mymaster 127.0.0.1 6379

//votes for itself (sentinel6380) to lead this failover

2937:X 29 Jul 01:59:23.540 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1

//the other two sentinels both voted for 4a6aebffdd1301bf054e722c34e8a6611418ba8a (sentinel6380)

2937:X 29 Jul 01:59:23.549 # 1db1a4dcdf0ecca00b64d9362c2a2dd338da0030 voted for 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1

2937:X 29 Jul 01:59:23.549 # 1a77392638e41bb0ea0a865ffc93b8de6335227f voted for 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1

//this sentinel won the election and may now carry out the failover of master 6379

2937:X 29 Jul 01:59:23.619 # +elected-leader master mymaster 127.0.0.1 6379

//the failover enters the select-slave state: the sentinel is looking for a slave of 6379 that can be promoted to master

2937:X 29 Jul 01:59:23.619 # +failover-state-select-slave master mymaster 127.0.0.1 6379

//a suitable slave has been selected: 6381, a slave of mymaster 127.0.0.1 6379

2937:X 29 Jul 01:59:23.710 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

//the sentinel sends SLAVEOF NO ONE to slave 6381 to promote it to master

2937:X 29 Jul 01:59:23.710 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

//waiting for slave 6381 to finish its promotion

2937:X 29 Jul 01:59:23.769 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

//slave 6381 has been promoted

2937:X 29 Jul 01:59:24.251 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

//the failover switches to the reconf-slaves state: the remaining slaves are reconfigured to replicate from the new master

2937:X 29 Jul 01:59:24.251 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379

//the leading sentinel sends SLAVEOF to slave 6380 so that it replicates from the new master

2937:X 29 Jul 01:59:24.321 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

//6379 is no longer objectively down; odown applies only to masters, and 6379 is no longer the master

2937:X 29 Jul 01:59:24.653 # -odown master mymaster 127.0.0.1 6379

//6380 is reconfiguring itself as a slave of the new master 6381; not finished yet

2937:X 29 Jul 01:59:25.270 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

//slave 6380 has finished synchronizing with the new master

2937:X 29 Jul 01:59:25.271 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379

//the failover of master 6379 is complete; all slaves now replicate from the new master

2937:X 29 Jul 01:59:25.347 # +failover-end master mymaster 127.0.0.1 6379

//configuration change: the master address has changed, and the master is now 6381

2937:X 29 Jul 01:59:25.347 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

//the two slaves of 6381 (new slaves detected and attached)

2937:X 29 Jul 01:59:25.347 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381

2937:X 29 Jul 01:59:25.347 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

//slave 6379 of the new master is marked subjectively down (sdown)

2937:X 29 Jul 01:59:55.401 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6381.log

//master judged subjectively down (sdown)
2961:X 29 Jul 01:59:23.459 # +sdown master mymaster 127.0.0.1 6379

2961:X 29 Jul 01:59:23.545 # +new-epoch 1
//votes for sentinel6380 to lead this failover

2961:X 29 Jul 01:59:23.548 # +vote-for-leader 4a6aebffdd1301bf054e722c34e8a6611418ba8a 1

2961:X 29 Jul 01:59:24.325 # +config-update-from sentinel 4a6aebffdd1301bf054e722c34e8a6611418ba8a 127.0.0.1 26380 @ mymaster 127.0.0.1 6379

2961:X 29 Jul 01:59:24.325 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381

2961:X 29 Jul 01:59:24.325 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381

2961:X 29 Jul 01:59:24.325 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

2961:X 29 Jul 01:59:54.348 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

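At 01:59:23, about 30 seconds after the kill, all three sentinels monitoring the master marked it as subjectively down (sdown). With our quorum of 2, once at least two sentinels agree, the master is switched to objectively down (odown): [+odown master mymaster 127.0.0.1 6379 #quorum 2/2]. After the master is objectively down, the sentinels elect a leader, and the leader selects the new master. sentinel6380.log contains more entries than the other sentinel logs because sentinel6380 led the whole failover.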
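
The delay between the kill and the sdown event can be measured directly from the log lines. The sketch below is one way to do it, with hypothetical helper names (parse_line, seconds_between). The year is not part of each log line, so 2018 is assumed here, taken from the "Next failover delay" message in sentinel6379.log.

```python
# Sketch: measure the delay between two Redis/Sentinel log lines.
# Assumption: the year (2018) comes from the "Next failover delay" message.
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[A-Z]) "
    r"(?P<ts>\d{1,2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<msg>.*)$"
)


def parse_line(line, year=2018):
    """Split a log line into pid, role, timestamp and message; None if no match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %d %b %H:%M:%S.%f")
    return {"pid": int(m["pid"]), "role": m["role"], "ts": ts, "msg": m["msg"]}


def seconds_between(first, second):
    """Elapsed seconds from the first log line to the second."""
    return (parse_line(second)["ts"] - parse_line(first)["ts"]).total_seconds()


LOST = "2780:S 29 Jul 01:58:53.414 # Connection with master lost."
SDOWN = "2937:X 29 Jul 01:59:23.459 # +sdown master mymaster 127.0.0.1 6379"

if __name__ == "__main__":
    print(parse_line(SDOWN)["msg"].split()[0])
    print(round(seconds_between(LOST, SDOWN), 3))
```

For the two sample lines the result is roughly 30 seconds, which is the effective down-after-milliseconds window in this run.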

4: Check the overall replication status from 6381

# Replication

role:master

connected_slaves:1

slave0:ip=127.0.0.1,port=6380,state=online,offset=1125775,lag=0

master_replid:514edab0972b4b6e5388edc4f14fbdb4d223d39e

master_replid2:541cd938f43b4f144e647881af409fa1884ea5a4

master_repl_offset:1125775

second_repl_offset:617714

repl_backlog_active:1

repl_backlog_size:1048576

repl_backlog_first_byte_offset:77200

repl_backlog_histlen:1048576

6381 now has the master role and only one slave, because 6379 is still down.

5: Start the 6379 service again

Check the 6379 log

3037:S 29 Jul 02:46:47.753 # CONFIG REWRITE executed with success.
3037:S 29 Jul 02:46:48.359 * Connecting to MASTER 127.0.0.1:6381
3037:S 29 Jul 02:46:48.360 * MASTER <-> SLAVE sync started
3037:S 29 Jul 02:46:48.360 * Non blocking connect for SYNC fired the event.
3037:S 29 Jul 02:46:48.361 * Master replied to PING, replication can continue...
3037:S 29 Jul 02:46:48.362 * Trying a partial resynchronization (request 7b0dc6ac9c2188e3c92eb29eea200ea6c572619c:1).
3037:S 29 Jul 02:46:48.608 * Full resync from master: 514edab0972b4b6e5388edc4f14fbdb4d223d39e:1178142
3037:S 29 Jul 02:46:48.608 * Discarding previously cached master state.
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: receiving 253 bytes from master
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Flushing old data
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Loading DB in memory
3037:S 29 Jul 02:46:48.708 * MASTER <-> SLAVE sync: Finished with success
3037:S 29 Jul 02:46:48.709 * Background append only file rewriting started by pid 3042
3037:S 29 Jul 02:46:48.750 * AOF rewrite child asks to stop sending diffs.
3042:C 29 Jul 02:46:48.750 * Parent agreed to stop sending diffs. Finalizing AOF...
3042:C 29 Jul 02:46:48.750 * Concatenating 0.00 MB of AOF diff received from parent.
3042:C 29 Jul 02:46:48.750 * SYNC append only file rewrite performed
3042:C 29 Jul 02:46:48.750 * AOF rewrite: 6 MB of memory used by copy-on-write
3037:S 29 Jul 02:46:48.781 * Background AOF rewrite terminated with success
3037:S 29 Jul 02:46:48.782 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
3037:S 29 Jul 02:46:48.782 * Background AOF rewrite finished successfully

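After startup the config file is rewritten, then the instance automatically connects to the new master 6381 and performs a full resync from it.

Check the log of the new master 6381

//6379 requests synchronization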
2809:M 29 Jul 02:46:48.362 * Slave 127.0.0.1:6379 asks for synchronization

//the partial resynchronization request is rejected

2809:M 29 Jul 02:46:48.362 * Partial resynchronization not accepted: Replication ID mismatch (Slave asked for '7b0dc6ac9c2188e3c92eb29eea200ea6c572619c', my replication IDs are '514edab0972b4b6e5388edc4f14fbdb4d223d39e' and '541cd938f43b4f144e647881af409fa1884ea5a4')

//start the full sync

2809:M 29 Jul 02:46:48.362 * Starting BGSAVE for SYNC with target: disk

2809:M 29 Jul 02:46:48.607 * Background saving started by pid 3041

3041:C 29 Jul 02:46:48.607 * DB saved on disk

#6 MB of memory used by copy-on-write

3041:C 29 Jul 02:46:48.608 * RDB: 6 MB of memory used by copy-on-write

//background save succeeded

2809:M 29 Jul 02:46:48.708 * Background saving terminated with success

2809:M 29 Jul 02:46:48.708 * Synchronization with slave 127.0.0.1:6379 succeeded
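
The logs show both outcomes of a resync request: 6380 was granted a partial resync during the failover, while the restarted 6379 was forced into a full resync. The decision can be sketched as follows. The names MasterState and can_partial_resync are hypothetical, and the backlog numbers are taken from the INFO snapshot in step 4, so they only approximate the state at the moment of each request.

```python
# Sketch of how a master decides between partial and full resync.
# Names are illustrative; the rules follow the log messages above:
# the replication ID must match and the offset must still be in the backlog.
from dataclasses import dataclass


@dataclass
class MasterState:
    replid: str
    replid2: str
    second_repl_offset: int
    backlog_first_byte_offset: int
    backlog_histlen: int


def can_partial_resync(master, req_replid, req_offset):
    # The requested ID must be the current ID, or the previous ID up to the
    # offset at which this instance switched replication IDs.
    id_ok = req_replid == master.replid or (
        req_replid == master.replid2 and req_offset <= master.second_repl_offset
    )
    if not id_ok:
        return False
    # The requested offset must still be covered by the backlog.
    start = master.backlog_first_byte_offset
    end = start + master.backlog_histlen
    return start <= req_offset <= end


NEW_MASTER = MasterState(
    replid="514edab0972b4b6e5388edc4f14fbdb4d223d39e",
    replid2="541cd938f43b4f144e647881af409fa1884ea5a4",
    second_repl_offset=617714,
    backlog_first_byte_offset=77200,   # from the step 4 INFO snapshot
    backlog_histlen=1048576,
)

if __name__ == "__main__":
    # Request sent by 6380 during the failover
    print(can_partial_resync(NEW_MASTER, "541cd938f43b4f144e647881af409fa1884ea5a4", 617714))
    # Request sent by the restarted 6379
    print(can_partial_resync(NEW_MASTER, "7b0dc6ac9c2188e3c92eb29eea200ea6c572619c", 1))
```

The restarted 6379 came back with a fresh replication ID that the new master had never used, so an ID mismatch was the expected result.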

Check the sentinel logs

sentinel6379.log

//6379 is no longer subjectively down (-sdown)

2908:X 29 Jul 02:46:37.610 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

//converted into a slave of master 6381

2908:X 29 Jul 02:46:47.628 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6380.log

2937:X 29 Jul 02:46:37.767 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

sentinel6381.log

2961:X 29 Jul 02:46:38.023 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381

Check the replication info of the new master 6381 again

# Replication

role:master

connected_slaves:2

slave0:ip=127.0.0.1,port=6380,state=online,offset=1348132,lag=0

slave1:ip=127.0.0.1,port=6379,state=online,offset=1348132,lag=0

master_replid:514edab0972b4b6e5388edc4f14fbdb4d223d39e

master_replid2:541cd938f43b4f144e647881af409fa1884ea5a4

master_repl_offset:1348132

second_repl_offset:617714

repl_backlog_active:1

repl_backlog_size:1048576

repl_backlog_first_byte_offset:299557

repl_backlog_histlen:1048576

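The new master has gained a slave: 6379.

After a successful failover, the Redis configuration we wrote before the setup is rewritten; most of the rewritten content sits at the end of the config files.

redis.conf gained the following: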

# Generated by CONFIG REWRITE

slaveof 127.0.0.1 6381

In redis_slave6380.conf, the existing slaveof line now points to the new master:

# Master-Slave replication. Use slaveof to make a Redis instance a copy of

# slaveof <masterip> <masterport>

slaveof 127.0.0.1 6381

The new master's config, redis_slave6381.conf, no longer has a slaveof directive:

# Master-Slave replication. Use slaveof to make a Redis instance a copy of

# slaveof <masterip> <masterport>

The sentinel config files change as well; have a look for yourself.

6: Test replication again after the failover

The former master no longer accepts SET:

127.0.0.1:6379> set name zhangxs

(error) READONLY You can't write against a read only slave.

SET succeeds on the new master:

127.0.0.1:6381> set name zhangxs

OK

127.0.0.1:6379> get name

"zhangxs"

127.0.0.1:6380> get name

"zhangxs"

Replication works.

After the failover, the layout is:

Role	Port	Redis config file	Sentinel config file	Sentinel port	Redis log path	Sentinel log path
Slave (former master)	6379	redis.conf	sentinel.conf	26379	/home/zhangxs/data/redislog/redis_server/master.log	/home/zhangxs/data/redislog/sentinel/sentinel6379.log
Slave	6380	redis_slave6380.conf	Sentinel6380.conf	26380	/home/zhangxs/data/redislog/redis_server/slave6380.log	/home/zhangxs/data/redislog/sentinel/sentinel6380.log
Master (former slave)	6381	redis_slave6381.conf	Sentinel6381.conf	26381	/home/zhangxs/data/redislog/redis_server/slave6381.log	/home/zhangxs/data/redislog/sentinel/sentinel6381.log

Reference: http://www.redis.cn/topics
