Redis Cluster: Automated Installation, Scale-Out, and Scale-In

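I previously wrote a post on automating Redis cluster installation with Python. Building a cluster from raw commands alone is quite tedious, which is why the official distribution provides the redis-trib.rb tool.
Although redis-trib.rb offers command-line tools for creating, checking, repairing, and rebalancing a cluster, I cannot accept it, because it does not let you define the master-slave relationships among the cluster's nodes yourself.
Take six nodes A, B, C, D, E, and F: when creating a cluster you need to state explicitly which nodes are masters, which are slaves, and how they pair up. Unfortunately redis-trib.rb gives no such control (see the screenshot below).
More often than not you need to specify exactly which machines act as masters and which as slaves, and redis-trib.rb cannot be told which machines (instances) take which role.
Using redis-trib.rb also means dealing with its Ruby dependency, so I am not keen on building clusters with it.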

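To quote 《Redis开发与运维》 (Redis Development and Operations):
If the nodes are deployed on different IP addresses, redis-trib.rb does its best to keep a master and its slave off the same machine, and therefore reorders the node list.
The order of the node list determines the master and slave roles: masters come first, followed by slaves.
In other words, with redis-trib.rb you cannot fully control how master and slave roles are assigned.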

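Starting with Redis 5.0, redis-cli --cluster can create clusters and perform the other cluster-management tasks itself, so neither redis-trib.rb nor a Ruby environment is needed.
Working with raw commands is still complicated, though, especially in a complex production environment: creating, scaling out, or scaling in a cluster by hand involves a long series of manual steps and brings some safety risks.
Automating cluster creation, scale-out, and scale-in is therefore worthwhile.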
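
Because the --cluster subcommands only exist in redis-cli 5.0 and later, an automation script may want to check the client version before doing anything else. The following is a minimal sketch, assuming that `redis-cli --version` prints a line beginning with `redis-cli <major>.<minor>`; the helper name supports_cluster_subcommand is hypothetical and not part of the original script.

```python
import re


def supports_cluster_subcommand(version_output):
    """Return True if the reported redis-cli version is 5.0 or newer.

    version_output is assumed to be the text printed by `redis-cli --version`.
    """
    match = re.search(r"redis-cli\s+(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_output)
    major = int(match.group(1))
    return major >= 5


print(supports_cluster_subcommand("redis-cli 5.0.7"))
print(supports_cluster_subcommand("redis-cli 4.0.14"))
```

In a real run the version text would come from executing the client; the check is kept as a pure function here so it can be tried without Redis installed.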

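Test environment

The implementation here uses Python 3 and builds on the redis-cli --cluster command to automate cluster creation, scale-out, and scale-in.

The test environment is a single machine running multiple instances, eight nodes in total:
1. Automated cluster creation: the six nodes 10001~10006 form a cluster of three masters (10001~10003) and three slaves (10004~10006).
2. Automated scale-out: add the new node 10007 as a master, and add 10008 as the slave of 10007.
3. Automated scale-in: the reverse of step 2, removing 10007 and its slave 10008 from the cluster.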
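
The layout above can be written down as plain Python data, in the same dict shape the script at the end of this post uses. This is only a sketch of the test layout; the helper make_node and the names new_master_node and new_slave_node are introduced here for illustration.

```python
def make_node(port, host="127.0.0.1", password="******"):
    """Describe one Redis instance as a host/port/password dict."""
    return {"host": host, "port": port, "password": password}


# Three masters and three slaves for the initial cluster.
list_master_node = [make_node(p) for p in (10001, 10002, 10003)]
list_slave_node = [make_node(p) for p in (10004, 10005, 10006)]

# The pair that is added during scale-out and removed during scale-in.
new_master_node = make_node(10007)
new_slave_node = make_node(10008)

# Masters and slaves are paired by position: 10001 with 10004, and so on.
for master, slave in zip(list_master_node, list_slave_node):
    print("{0} <- {1}".format(master["port"], slave["port"]))
```

Pairing by list position is what makes the master-slave assignment explicit, which is the control that redis-trib.rb does not offer.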

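Creating the Redis cluster

Creating the cluster boils down to two groups of commands: one creates the cluster from the master nodes, the other adds a slave node to each master in turn.
Along the way you need to look up each node's ID, which makes doing it by hand rather tedious.
The main commands are as follows: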

################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003 -a ****** --cluster-yes
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
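
The two groups of commands above follow a fixed pattern, so they can be generated from the node lists. Here is a minimal sketch that only builds the command strings; the function names are hypothetical, and the master ID is assumed to have been fetched beforehand (the full script obtains it with CLUSTER MYID on each master).

```python
def build_create_command(masters, password):
    """Build the `redis-cli --cluster create` command for the master nodes."""
    addresses = " ".join("{0}:{1}".format(n["host"], n["port"]) for n in masters)
    return "redis-cli --cluster create {0} -a {1} --cluster-yes".format(addresses, password)


def build_add_slave_command(slave, master, master_id, password):
    """Build the add-node command: the slave address comes first, then the master."""
    return (
        "redis-cli --cluster add-node {0}:{1} {2}:{3} "
        "--cluster-slave --cluster-master-id {4} -a {5}"
    ).format(slave["host"], slave["port"], master["host"], master["port"], master_id, password)


masters = [{"host": "127.0.0.1", "port": p} for p in (10001, 10002, 10003)]
print(build_create_command(masters, "******"))
print(build_add_slave_command(
    {"host": "127.0.0.1", "port": 10004},
    masters[0],
    "6164025849a8ff9297664fc835bc851af5004f61",
    "******",
))
```

Keeping command construction separate from execution makes it easy to print and review the commands before anything touches the cluster.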

Below is the log output of the redis-cli --cluster commands, printed while the Python script creates the cluster:

[root@JD redis_install]# python3 create_redis_cluster.py
################# flush master/slave slots #################
################# create cluster #################
redis-cli --cluster create 127.0.0.1: 127.0.0.1: 127.0.0.1: -a ****** --cluster-yes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on nodes...
Master[] -> Slots -
Master[] -> Slots -
Master[] -> Slots -
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1: 127.0.0.1: --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1: to cluster 127.0.0.1:
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1: to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:.
[OK] New node added correctly.
redis-cli --cluster add-node 127.0.0.1: 127.0.0.1: --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1: to cluster 127.0.0.1:
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:
slots: ( slots) slave
replicates 6164025849a8ff9297664fc835bc851af5004f61
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1: to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:.
[OK] New node added correctly.
redis-cli --cluster add-node 127.0.0.1: 127.0.0.1: --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1: to cluster 127.0.0.1:
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:
slots: ( slots) slave
replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:
slots: ( slots) slave
replicates 6164025849a8ff9297664fc835bc851af5004f61
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1: to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:.
[OK] New node added correctly.
################# cluster nodes info: #################
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:@ myself,master - connected -
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:@ master - connected -
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:@ slave 64e634307bdc339b503574f5a77f1b156c021358 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:@ master - connected -
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:@ slave 6164025849a8ff9297664fc835bc851af5004f61 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:@ slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a connected
[root@JD redis_install]#

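Scaling out the Redis cluster

Scaling out takes two main steps:
1. Add the new master node, and add a slave node for it.
2. Reshard slots onto the newly added master node.

The main commands are as follows:

Add the master node to the cluster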
redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001 -a ******
Add a slave node for the new master
redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******

Reshard the slots
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1

################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
As shown, the newly added node has been assigned slots, so the scale-out succeeded.
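
The figure of 1365 slots per reshard follows from simple arithmetic on the 16384 hash slots: with three masters each holds about 16384 // 3 = 5461 slots, with four each should hold 16384 // 4 = 4096, so every original master gives up 5461 - 4096 = 1365. The sketch below reproduces that calculation and counts the slots in the ranges listed for node 10007 in the output above. The function names are hypothetical, and the range parser assumes plain entries such as 0-1364, not the bracketed entries Redis prints for slots that are mid-migration.

```python
TOTAL_SLOTS = 16384


def slots_to_migrate(master_count, added=1):
    """Slots each existing master gives up when `added` masters join."""
    return TOTAL_SLOTS // master_count - TOTAL_SLOTS // (master_count + added)


def count_slots(slot_ranges):
    """Count the slots in ranges written like '0-1364' or a single slot like '42'."""
    total = 0
    for item in slot_ranges:
        if "-" in item:
            start, end = item.split("-")
            total += int(end) - int(start) + 1
        else:
            total += 1
    return total


print(slots_to_migrate(3))
print(count_slots(["0-1364", "5461-6825", "10923-12287"]))
```

Three reshards of 1365 slots give the new master 4095 slots, which matches the three ranges shown for 10007.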

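Two things need attention when automating this:
1. After add-node (whether for a master or a slave), sleep long enough (20 seconds here) for every node in the cluster to meet the new node; otherwise the scale-out fails.
2. After each reshard onto the new node, sleep long enough (20 seconds here); otherwise resharding slots from the next node causes the previous reshard to fail.

The whole process looks like this: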
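
A fixed sleep is the simplest option, but the same idea can be expressed as polling until a condition holds, with the fixed delay as an upper bound. This is a sketch of an alternative, not what the script in this post does; wait_until and all_nodes_know are hypothetical helpers, and the CLUSTER NODES text of each node is assumed to be fetched by the caller.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def all_nodes_know(node_id, cluster_nodes_outputs):
    """True if every node's CLUSTER NODES text mentions node_id."""
    return all(node_id in text for text in cluster_nodes_outputs)


# Example with canned data: both views already list the new node.
views = ["aaa 127.0.0.1:10001@20001 master\nnew 127.0.0.1:10007@20007 master",
         "new 127.0.0.1:10007@20007 master\naaa 127.0.0.1:10001@20001 master"]
print(wait_until(lambda: all_nodes_know("new", views), timeout=5, interval=0.1))
```

In practice the predicate would query every node's view of the cluster on each poll, so the script moves on as soon as the new node is known everywhere instead of always waiting the full 20 seconds.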

[root@JD redis_install]# python3 create_redis_cluster.py
#########################cleanup instance#################################
#########################add node into cluster#################################
redis-cli --cluster add-node 127.0.0.1: 127.0.0.1: -a redis@password
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1: to cluster 127.0.0.1:
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:
slots: ( slots) slave
replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:
slots: ( slots) slave
replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:
slots: ( slots) slave
replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1: to make it join the cluster.
[OK] New node added correctly.
redis-cli --cluster add-node 127.0.0.1: 127.0.0.1: --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1: to cluster 127.0.0.1:
>>> Performing Cluster Check (using node 127.0.0.1:)
M: 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:
slots: ( slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:
slots: ( slots) slave
replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:
slots: ( slots) slave
replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:
slots: ( slots) slave
replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:
slots:[-] ( slots) master
additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1: to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:.
[OK] New node added correctly.
#########################reshard slots#################################
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1: --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots --cluster-yes --cluster-timeout --cluster-pipeline --cluster-replace >/dev/null >&
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1: --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots --cluster-yes --cluster-timeout --cluster-pipeline --cluster-replace >/dev/null >&
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1: --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots --cluster-yes --cluster-timeout --cluster-pipeline --cluster-replace >/dev/null >&
################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:@ slave 6164025849a8ff9297664fc835bc851af5004f61 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:@ slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:@ master - connected -
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:@ slave 64e634307bdc339b503574f5a77f1b156c021358 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:@ slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:@ master - connected -
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:@ myself,master - connected - - -
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:@ master - connected -
[root@JD redis_install]#

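Scaling in the Redis cluster

In principle, scaling in is the reverse of scaling out.
That much is clear from the built-in command itself: del-node host:port node_id  # deletes the given node and shuts down that node's service on success.
But scaling in should only mean removing a master from the cluster (cluster forget nodeid); why shut the node down as well? For that reason this article does not scale in with redis-cli --cluster del-node and uses ordinary commands instead.

The custom scale-in here really consists of two steps:
1. Move the slots of the master being removed back to the other nodes in the cluster. This test shrinks four masters to three; the commands actually executed are shown below.
2. Run cluster forget master_node_id (and slave_node_id) on every node in the cluster in turn.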

############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1

{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
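
The twelve forget lines above are simply every remaining node paired with every removed node ID. A small sketch of that fan-out, producing the plan without executing it; build_forget_plan is a hypothetical name.

```python
def build_forget_plan(remaining_nodes, removed_ids):
    """Return (address, command) pairs: each remaining node forgets each removed id."""
    plan = []
    for node in remaining_nodes:
        address = "{0}:{1}".format(node["host"], node["port"])
        for node_id in removed_ids:
            plan.append((address, "cluster forget {0}".format(node_id)))
    return plan


remaining = [{"host": "127.0.0.1", "port": p} for p in range(10001, 10007)]
removed = ["3645e00a8ec3a902bd6effb4fc20c56a00f2c982",
           "4854375c501c3dbfb4e2d94d50e62a47520c4f12"]
plan = build_forget_plan(remaining, removed)
print(len(plan))
for address, command in plan[:2]:
    print(address, "--->", command)
```

Note that the remaining nodes include the slaves 10004~10006, not only the masters.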

The full output of the run is as follows

[root@JD redis_install]# python3 create_redis_cluster.py
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
################# cluster nodes info: #################
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575968426000 76 connected
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575968422619 75 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 myself,master - 0 1575968426000 75 connected 0-5460
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575968425000 77 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575968427626 77 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575968426000 76 connected 5461-10922

[root@JD redis_install]#

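That is not quite the end of the story: after scale-in, every node in the cluster must successfully execute cluster forget master_node_id (and slave_node_id).
Otherwise the other nodes still hold heartbeat information about node 10007, and after more than one minute the evicted node 10007 (and its slave 10008) is added back to the cluster.
This is exactly the odd problem I ran into at first: because cluster forget had not been run on the slave nodes of the shrunken cluster, the removed nodes kept being added back.
See: http://www.redis.cn/commands/cluster-forget.html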
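
One way to guard against this is to check, after the forget round, that no remaining node still lists a removed node. The sketch below works on the raw text of CLUSTER NODES, where the first field of each line is the node ID; nodes_still_listing is a hypothetical helper, and collecting each node's output is left to the caller.

```python
def nodes_still_listing(removed_ids, views):
    """Return the addresses whose CLUSTER NODES text still has an entry for a removed id.

    views maps a node address to that node's raw CLUSTER NODES output.
    """
    removed = set(removed_ids)
    stale = []
    for address, text in views.items():
        # The first whitespace-separated field of every line is a node id.
        known_ids = {line.split()[0] for line in text.splitlines() if line.strip()}
        if known_ids & removed:
            stale.append(address)
    return sorted(stale)


views = {
    "127.0.0.1:10001": "aaa 127.0.0.1:10001@20001 myself,master - 0 0 1 connected 0-5460\n",
    "127.0.0.1:10004": "ccc 127.0.0.1:10004@20004 myself,slave aaa 0 0 1 connected\n"
                       "bbb 127.0.0.1:10007@20007 master - 0 0 2 connected\n",
}
print(nodes_still_listing(["bbb"], views))
```

An empty result means every node has dropped the removed IDs; any address in the list still needs cluster forget before the removed node spreads back through gossip.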

The complete implementation is as follows

import os
import time
import redis


def create_redis_cluster(list_master_node, list_slave_node):
    # Assumes all nodes share the same password
    password = list_master_node[0]["password"]

    print('################# flush master/slave slots #################')
    for node in list_master_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')

    for node in list_slave_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        # current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')

    print('################# create cluster #################')
    master_nodes = ' '.join(node["host"] + ':' + str(node["port"]) for node in list_master_node)
    command = "redis-cli --cluster create {0} -a {1} --cluster-yes".format(master_nodes, password)
    print(command)
    msg = os.system(command)
    print(msg)
    time.sleep(5)

    print('################# add slave nodes #################')
    # Masters and slaves are paired by position in the two lists
    for node, slave in zip(list_master_node, list_slave_node):
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_slave_node = slave["host"] + ':' + str(slave["port"])
        myid = current_conn.cluster('myid')
        # The slave node comes first, the master node second
        command = "redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a {3}".format(current_slave_node, current_master_node, myid, password)
        print(command)
        msg = os.system(command)
        print(msg)
    # show cluster nodes info
    time.sleep(10)
    print("################# cluster nodes info: #################")
    cluster_nodes = current_conn.execute_command('cluster nodes')
    print(cluster_nodes)


# Return the number of slots each original master must migrate out after scale-out
def get_migrated_slot(list_master_node, n):
    migrated_slot_count = int(16384 / len(list_master_node)) - int(16384 / (len(list_master_node) + n))
    return migrated_slot_count


def redis_cluster_expansion(list_master_node, dict_master_node, dict_slave_node):
    new_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"])
    new_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])

    print("#########################cleanup instance#################################")
    new_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    new_master_conn.execute_command('flushall')
    new_master_conn.execute_command('cluster reset')
    new_master_id = new_master_conn.cluster('myid')

    new_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    new_slave_conn.execute_command('cluster reset')
    new_slave_id = new_slave_conn.cluster('myid')
    # new_slave_conn.execute_command('slaveof no one')

    # Check whether the nodes to be added already belong to the current cluster.
    # If one does and holds no slots, either kick it out first (cluster forget nodeid) or abort with a warning, whichever you prefer.
    # Connect to any node in the cluster
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"], decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')
    '''dict_node_info format example :
{
'127.0.0.1:10008@20008': {'node_id': '1d10c3ce3b9b7f956a26122980827fe6ce623d22', 'flags': 'master', 'master_id': '-','last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
'127.0.0.1:10002@20002': {'node_id': '64e634307bdc339b503574f5a77f1b156c021358', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '7', 'slots': [['5461', '10922']], 'connected': True},
'127.0.0.1:10001@20001': {'node_id': '6164025849a8ff9297664fc835bc851af5004f61', 'flags': 'myself,master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599438000', 'epoch': '6', 'slots': [['0', '5460']], 'connected': True},
'127.0.0.1:10007@20007': {'node_id': '307f589ec7b1eb7bd65c680527afef1e30ce2303', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599443599', 'epoch': '5', 'slots': [], 'connected': True},
'127.0.0.1:10005@20005': {'node_id': '23e1871c4e1dc1047ce567326e74a6194589146c', 'flags': 'slave', 'master_id': '64e634307bdc339b503574f5a77f1b156c021358', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599441000', 'epoch': '7', 'slots': [], 'connected': True},
'127.0.0.1:10004@20004': {'node_id': '026f0179631f50ca858d46c2b2829b3af71af2c8', 'flags': 'slave', 'master_id': '6164025849a8ff9297664fc835bc851af5004f61', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599440000', 'epoch': '6', 'slots': [], 'connected': True},
'127.0.0.1:10006@20006': {'node_id': '9f265545ebb799d2773cfc20c71705cff9d733ae', 'flags': 'slave', 'master_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
'127.0.0.1:10003@20003': {'node_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442599', 'epoch': '8', 'slots': [['10923', '16383']], 'connected': True}
}
'''
    dict_master_node_in_cluster = 0
    dict_slave_node_in_cluster = 0

    for key_node in dict_node_info:
        if new_master_node in key_node:
            dict_master_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' + new_master_node + ' already exists in the cluster and has slots allocated, aborting......')
                return
        if new_slave_node in key_node:
            dict_slave_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' + new_slave_node + ' already exists in the cluster and has slots allocated, aborting......')
                return

    if dict_master_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"], password=master_node["password"], decode_responses=True)
            print('warning: ' + new_master_node + ' already exists in the cluster, cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_master_id)
            key_node_conn.execute_command(forget_command)
    if dict_slave_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"], password=master_node["password"], decode_responses=True)
            print('warning: ' + new_slave_node + ' already exists in the cluster, forget it......')
            forget_command = 'cluster forget {0}'.format(new_slave_id)
            key_node_conn.execute_command(forget_command)

    print("#########################add node into cluster#################################")
    # Assumes all nodes share the same password
    password = dict_master_node["password"]
    try:
        cluster_node = list_master_node[0]["host"] + ':' + str(list_master_node[0]["port"])
        # The node to add comes first; the second address is any node already in the cluster
        add_node_command = "redis-cli --cluster add-node {0} {1} -a {2}".format(new_master_node, cluster_node, password)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
        # The slave node comes first, the master node second
        add_node_command = "redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a {3}".format(new_slave_node, new_master_node, new_master_id, password)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
    except Exception as e:
        print('add new node error,the reason is:')
        print(e)

    print("#########################reshard slots#################################")
    migrated_slot_count = get_migrated_slot(list_master_node, 1)
    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        # Example: scaling out from 3 masters to 4 migrates 1365 slots from each original master
        try:
            command = r'''redis-cli -a {4} --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots {3} --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1'''.format(current_master_node, current_master_node_id, new_master_id, migrated_slot_count, password)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            time.sleep(20)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)

    print("################# cluster nodes info: #################")
    cluster_nodes = new_master_conn.execute_command('cluster nodes')
    print(cluster_nodes)


def redis_cluster_shrinkage(list_master_node, list_slave_node, dict_master_node, dict_slave_node):
# 判断新增的节点是否归属于当前集群,
# 如果不归属当前集群,则退出
cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
dict_node_info = cluster_node_conn.cluster('nodes') removed_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"])+'@'+str(dict_master_node["port"]+10000)
removed_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])+'@'+str(dict_slave_node["port"]+10000) if not removed_master_node in dict_node_info.keys():
print('Error:'+ str(removed_master_node) +' not in cluster,exiting')
return
if not removed_slave_node in dict_node_info.keys():
print('Error:' + str(removed_slave_node) + ' not in cluster,exiting')
return removed_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
removed_master_id = removed_master_conn.cluster('myid')
removed_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
removed_slave_id = removed_slave_conn.cluster('myid') for node in list_master_node:
current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
current_master_node = node["host"] + ':' + str(node["port"])
current_master_node_id = current_master_conn.cluster('myid')
'''
4节点-->缩容3节点,平均将slot归还到三个master节点
'''
try:
command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1 '''.\
format(current_master_node, removed_master_id, current_master_node_id)
print('############################ execute reshard #########################################')
print(command)
msg = os.system(command)
time.sleep(10)
except Exception as e:
print('reshard slots error,the reason is:')
print(e) removed_master_conn.execute_command('cluster reset')
    removed_slave_conn.execute_command('cluster reset')

    # Make every remaining master forget the removed master and slave.
    for master_node in list_master_node:
        master_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"], password=master_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(master_node) + '--->' + forget_master_command)
        print(str(master_node) + '--->' + forget_slave_command)
        master_node_conn.execute_command(forget_master_command)
        master_node_conn.execute_command(forget_slave_command)

    # Do the same on every remaining slave.
    for slave_node in list_slave_node:
        slave_node_conn = redis.StrictRedis(host=slave_node["host"], port=slave_node["port"], password=slave_node["password"], decode_responses=True)
        forget_master_command = 'cluster forget {0}'.format(removed_master_id)
        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)
        print(str(slave_node) + '--->' + forget_master_command)
        print(str(slave_node) + '--->' + forget_slave_command)
        slave_node_conn.execute_command(forget_master_command)
        slave_node_conn.execute_command(forget_slave_command)

    print("################# cluster nodes info: #################")
    cluster_nodes = cluster_node_conn.execute_command('cluster nodes')
    print(cluster_nodes)


if __name__ == '__main__':
    # master
    node_1 = {'host': '127.0.0.1', 'port': 10001, 'password': '******'}
    node_2 = {'host': '127.0.0.1', 'port': 10002, 'password': '******'}
    node_3 = {'host': '127.0.0.1', 'port': 10003, 'password': '******'}
    # slave
    node_4 = {'host': '127.0.0.1', 'port': 10004, 'password': '******'}
    node_5 = {'host': '127.0.0.1', 'port': 10005, 'password': '******'}
    node_6 = {'host': '127.0.0.1', 'port': 10006, 'password': '******'}
    # The number of master and slave nodes must be equal
    list_master_node = [node_1, node_2, node_3]
    list_slave_node = [node_4, node_5, node_6]

    # Automated cluster creation
    # create_redis_cluster(list_master_node, list_slave_node)

    # Automated expansion: add 10007 as a new master and 10008 as its slave
    node_7 = {'host': '127.0.0.1', 'port': 10007, 'password': '******'}
    node_8 = {'host': '127.0.0.1', 'port': 10008, 'password': '******'}
    redis_cluster_expansion(list_master_node, node_7, node_8)

    # Automated shrinkage: remove 10007 and its slave 10008 again
    # redis_cluster_shrinkage(list_master_node, list_slave_node, node_7, node_8)
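The shrinkage code hard-codes `--cluster-slots 1365`, which only works out if the removed master holds about 4096 slots, and even then 3 × 1365 leaves one slot behind. As a minimal sketch (not part of the original script; `split_slots` is a hypothetical helper), the per-target counts could be derived from the number of slots the node actually owns, so that the shares always add up to the total:

```python
def split_slots(total_slots, target_count):
    """Split total_slots into target_count shares that differ by at most one."""
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    base, remainder = divmod(total_slots, target_count)
    # The first `remainder` targets each receive one extra slot.
    return [base + 1 if i < remainder else base for i in range(target_count)]


if __name__ == "__main__":
    # Assumed example: the removed master owns 4096 of the 16384 slots.
    print(split_slots(4096, 3))  # prints [1366, 1365, 1365]
```

Each value could then be passed to `--cluster-slots` for the corresponding remaining master instead of the fixed 1365.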

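After the `cluster forget` calls, it can be useful to confirm that no remaining node still lists the removed IDs. The following is only a sketch under assumptions: it works on the raw text that `redis-cli cluster nodes` prints (one node per line, node ID in the first field), and `parse_node_ids` and `still_known` are hypothetical helper names. The sample IDs are the ones shown earlier in this post.

```python
def parse_node_ids(cluster_nodes_text):
    """Return the set of node IDs found in raw CLUSTER NODES text output."""
    node_ids = set()
    for line in cluster_nodes_text.splitlines():
        fields = line.split()
        if fields:
            # The first field of every line is the node ID.
            node_ids.add(fields[0])
    return node_ids


def still_known(cluster_nodes_text, removed_ids):
    """Return the removed IDs that still appear in the output, sorted."""
    return sorted(parse_node_ids(cluster_nodes_text) & set(removed_ids))


if __name__ == "__main__":
    # Assumed sample output with two masters.
    sample = (
        "6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 "
        "myself,master - 0 0 1 connected 0-5460\n"
        "64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 "
        "master - 0 0 2 connected 5461-10922\n"
    )
    leftover = still_known(sample, ["8b75325c59a7242344d0ebe5ee1e0068c66ffa2a"])
    print(leftover)  # prints []
```

An empty list means the node that produced the output no longer knows any of the removed IDs.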
 

Reference: https://www.cnblogs.com/zhoujinyi/p/11606935.html
