Redis Cluster: Automated Installation, Scale-Out, and Scale-In

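I previously wrote a post on automating Redis cluster installation with Python. Building a cluster from raw commands alone is quite tedious, which is why the official redis-trib.rb tool exists.
Although redis-trib.rb provides command-line tools for creating, checking, fixing, and rebalancing a cluster, I cannot accept it, because it does not let you define the master-slave relationships between the nodes yourself.
Take six nodes A, B, C, D, E, and F: when creating the cluster you need to state exactly which nodes are masters, which are slaves, and which slave follows which master. Unfortunately, redis-trib.rb offers no way to control this.
More often than not you have to name explicitly which machines act as masters and which as slaves, and redis-trib.rb cannot be told which machines (instances) in the cluster take which role.
Using redis-trib.rb also means dealing with its Ruby dependency, so I prefer not to build clusters with it.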

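To quote the book 《Redis开发与运维》 directly:
If the nodes are deployed on different IP addresses, redis-trib.rb tries its best to keep a master and its slave off the same machine, and therefore reorders the node list.
The order of the node list determines the master and slave roles: masters first, then slaves.
In other words, redis-trib.rb does not let you fully control by hand how masters and slaves are assigned.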

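Since Redis 5.0, redis-cli --cluster can create a cluster by itself and implements the other cluster management functions as well, so neither redis-trib.rb nor a Ruby environment is needed.
Even so, working with the raw commands is still fairly involved, especially in a complex production environment: creating, scaling out, or scaling in a cluster by hand takes a series of manual steps and carries some risk.
Automating cluster creation, scale-out, and scale-in is therefore worthwhile.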

Test environment

This post uses Python 3 on top of the redis-cli --cluster command to automate Redis cluster creation, scale-out, and scale-in.

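The test environment runs multiple instances on a single machine, eight nodes in total:
1. Automated cluster creation: six nodes (10001~10006) form a cluster of three masters (10001~10003) and three slaves (10004~10006).
2. Automated scale-out: add the new node 10007 as a master, and add 10008 as the slave of 10007.
3. Automated scale-in: the reverse of step 2, removing 10007 and its slave 10008 from the cluster.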

Creating the Redis cluster

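Creating the cluster boils down to running two groups of commands: one creates the cluster from the master nodes, and the other adds a slave to each master in turn.
Along the way you have to look up the node ID of each master, which makes doing it by hand tedious.
The main commands are as follows: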

################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003 -a ****** --cluster-yes
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
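
The --cluster-master-id values above have to be looked up for each master. As a rough sketch of that lookup, the helper below maps each host:port to its node ID from the raw text of CLUSTER NODES. The name parse_node_ids is hypothetical and not part of the original script, and the sample lines are copied from output shown later in this post.

```python
def parse_node_ids(cluster_nodes_text):
    """Map 'host:port' to node ID from raw CLUSTER NODES output."""
    node_ids = {}
    for line in cluster_nodes_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        node_id, address = fields[0], fields[1]
        # The address field looks like 'host:port@cport'; drop the cluster bus port.
        node_ids[address.split("@")[0]] = node_id
    return node_ids


sample = (
    "6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 "
    "myself,master - 0 1575968426000 75 connected 0-5460\n"
    "026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 "
    "slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575968422619 75 connected\n"
)
ids = parse_node_ids(sample)
print(ids["127.0.0.1:10001"])
```

The script in this post takes a different route and asks each master for its own ID with cluster myid; parsing the output of one node is merely an alternative when only a single connection is at hand.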

Below is the log output of the redis-cli --cluster commands, printed while the Python script creates the cluster.

[root@JD redis_install]# python3 create_redis_cluster.py
################# flush master/slave slots #################
################# create cluster #################
redis-cli --cluster create 127.0.0.1:10001 127.0.0.1:10002 127.0.0.1:10003   -a ****** --cluster-yes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 3 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
################# add slave nodes #################
redis-cli --cluster add-node 127.0.0.1:10004 127.0.0.1:10001 --cluster-slave --cluster-master-id 6164025849a8ff9297664fc835bc851af5004f61 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10004 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10004 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10001.
[OK] New node added correctly.
redis-cli --cluster add-node 127.0.0.1:10005 127.0.0.1:10002 --cluster-slave --cluster-master-id 64e634307bdc339b503574f5a77f1b156c021358 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10005 to cluster 127.0.0.1:10002
>>> Performing Cluster Check (using node 127.0.0.1:10002)
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10005 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10002.
[OK] New node added correctly.
redis-cli --cluster add-node 127.0.0.1:10006 127.0.0.1:10003 --cluster-slave --cluster-master-id 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10006 to cluster 127.0.0.1:10003
>>> Performing Cluster Check (using node 127.0.0.1:10003)
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10006 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10003.
[OK] New node added correctly.
################# cluster nodes info: #################
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 myself,master -    connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master -    connected 5461-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358    connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master -    connected 0-5460
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61    connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a    connected
[root@JD redis_install]#
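
The script that produced this log runs redis-cli through os.system, which hands back an exit status instead of raising an exception, so a failed step is easy to miss. A minimal sketch of a stricter runner follows. The name run_checked is hypothetical, and it assumes redis-cli signals failure through a non-zero exit status, which is worth verifying for the version in use.

```python
import subprocess
import sys


def run_checked(args):
    """Run a command given as an argument list; raise if it exits non-zero."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        raise RuntimeError(
            "command failed ({0}): {1}".format(result.returncode, result.stderr.strip())
        )
    return result.stdout


# Demonstrated with the Python interpreter so the sketch runs without Redis.
print(run_checked([sys.executable, "-c", "print('ok')"]).strip())
```

Passing the command as a list also avoids building one shell string by concatenation, so a redis-cli call would be written as a list of separate arguments.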

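Scaling out the Redis cluster

Scaling out takes two main steps:
1. Add the new master node, then add a slave for it.
2. Reshard slots onto the newly added master.

The main commands are as follows:

Add the master node to the cluster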
redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001 -a ******
Add a slave to the new master
redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******

Reshard the slots
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1

################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
As the output shows, the new node has been assigned its share of slots, so the scale-out succeeded.
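
Each of the three original masters gives up 1365 slots here (int(16384/3) - int(16384/4) = 5461 - 4096), so the new master should end up with 4095. A quick way to confirm that from the output is to count the slots per master. The helper below is a sketch with a hypothetical name, count_slots, and the sample is the four master lines from the output above.

```python
def count_slots(cluster_nodes_text):
    """Return {'host:port': slot_count} for every master in raw CLUSTER NODES output."""
    counts = {}
    for line in cluster_nodes_text.splitlines():
        fields = line.split()
        if len(fields) < 8 or "master" not in fields[2].split(","):
            continue  # skip slaves and malformed lines
        total = 0
        for slot_range in fields[8:]:
            if slot_range.startswith("["):
                continue  # entries in brackets describe slots that are mid-migration
            start, _, end = slot_range.partition("-")
            total += int(end or start) - int(start) + 1
        counts[fields[1].split("@")[0]] = total
    return counts


sample = "\n".join([
    "64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922",
    "8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383",
    "3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287",
    "6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460",
])
slot_counts = count_slots(sample)
print(slot_counts)
print(sum(slot_counts.values()))  # all 16384 slots should be accounted for
```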

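Two things need attention when this is automated:
1. After each add-node (master or slave), sleep long enough (20 seconds here) for every node in the cluster to meet the new node; otherwise the scale-out fails.
2. After each reshard onto the new node, sleep long enough (20 seconds here) as well; otherwise starting the next reshard from another master causes the previous reshard to fail.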
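
A fixed 20-second sleep is simple but either wastes time or is too short. One possible alternative for the add-node case, sketched below under my own assumptions, is to poll until every node's CLUSTER NODES output mentions the new node ID. The names wait_until and all_nodes_know are hypothetical, and the fetch callables stand in for real connections that return the raw CLUSTER NODES text.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def all_nodes_know(node_id, fetchers):
    """True when every node's CLUSTER NODES text mentions node_id."""
    return all(node_id in fetch() for fetch in fetchers)


# Stand-ins for real connections: the second node has not met the new node yet.
views = [
    lambda: "abc 127.0.0.1:10007@20007 master - 0 0 0 connected",
    lambda: "",
]
converged = wait_until(lambda: all_nodes_know("abc", views), timeout=0.2, interval=0.05)
print(converged)
```

This does not cover the wait after a reshard, and a substring match is a loose test, so treat it as a starting point and keep a generous timeout.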

The whole process looks like this:

[root@JD redis_install]# python3 create_redis_cluster.py
#########################cleanup instance#################################
#########################add node into cluster#################################
 redis-cli --cluster add-node 127.0.0.1:10007 127.0.0.1:10001  -a redis@password
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10007 to cluster 127.0.0.1:10001
>>> Performing Cluster Check (using node 127.0.0.1:10001)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10007 to make it join the cluster.
[OK] New node added correctly.
 redis-cli --cluster add-node 127.0.0.1:10008 127.0.0.1:10007 --cluster-slave --cluster-master-id 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 -a ******
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 127.0.0.1:10008 to cluster 127.0.0.1:10007
>>> Performing Cluster Check (using node 127.0.0.1:10007)
M: 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007
   slots: (0 slots) master
S: 026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004
   slots: (0 slots) slave
   replicates 6164025849a8ff9297664fc835bc851af5004f61
S: 9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006
   slots: (0 slots) slave
   replicates 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a
M: 64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005
   slots: (0 slots) slave
   replicates 64e634307bdc339b503574f5a77f1b156c021358
M: 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
M: 6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 127.0.0.1:10008 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 127.0.0.1:10007.
[OK] New node added correctly.
#########################reshard slots#################################
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10001 --cluster-from 6164025849a8ff9297664fc835bc851af5004f61 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10002 --cluster-from 64e634307bdc339b503574f5a77f1b156c021358 --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a redis@password --cluster reshard 127.0.0.1:10003 --cluster-from 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-to 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1
################# cluster nodes info: #################
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575960493000 64 connected
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575960493849 66 connected
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575960494852 65 connected 6826-10922
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575960492000 65 connected
4854375c501c3dbfb4e2d94d50e62a47520c4f12 127.0.0.1:10008@20008 slave 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 0 1575960493000 67 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575960493000 66 connected 12288-16383
3645e00a8ec3a902bd6effb4fc20c56a00f2c982 127.0.0.1:10007@20007 myself,master - 0 1575960493000 67 connected 0-1364 5461-6825 10923-12287
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 master - 0 1575960492848 64 connected 1365-5460
[root@JD redis_install]#

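Scaling in the Redis cluster

In principle, scaling in is the reverse of scaling out.
The description of this command makes that clear: del-node host:port node_id # removes the given node and shuts down that node's service on success.
But scale-in only needs to remove a master from the cluster (cluster forget nodeid), so why shut the instance down as well? For that reason this post does not scale in with redis-cli --cluster del-node and uses plain commands instead.

The custom scale-in here really has two steps:
1. Move the slots of the master being removed back to the other masters in the cluster. This test shrinks four masters to three, and the commands actually executed are shown below.
2. Run cluster forget master_node_id (and slave_node_id) on every node in the cluster, one by one.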

############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1

{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
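
The 1365 in the reshard commands above is hard-coded and happens to match this test, where node 10007 holds 4095 slots. If the master being removed owned a different number of slots, the shares could be computed instead. The helper below is a small sketch with a hypothetical name, split_slots, and is not part of the original script.

```python
def split_slots(total_slots, target_count):
    """Split total_slots into target_count near-equal shares that sum to total_slots."""
    base, remainder = divmod(total_slots, target_count)
    # The first `remainder` targets take one extra slot each.
    return [base + 1 if i < remainder else base for i in range(target_count)]


print(split_slots(4095, 3))  # [1365, 1365, 1365]
print(split_slots(4096, 3))  # [1366, 1365, 1365]
```

Each share would then be passed as the --cluster-slots value of one reshard command.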

The full output of the run is as follows:

[root@JD redis_install]# python3 create_redis_cluster.py
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10001 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 6164025849a8ff9297664fc835bc851af5004f61 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10002 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 64e634307bdc339b503574f5a77f1b156c021358 --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
############################ execute reshard #########################################
redis-cli -a ****** --cluster reshard 127.0.0.1:10003 --cluster-from 3645e00a8ec3a902bd6effb4fc20c56a00f2c982 --cluster-to 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000 --cluster-replace >/dev/null 2>&1
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10001, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10002, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10003, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10004, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10005, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 3645e00a8ec3a902bd6effb4fc20c56a00f2c982
{'host': '127.0.0.1', 'port': 10006, 'password': '******'}--->cluster forget 4854375c501c3dbfb4e2d94d50e62a47520c4f12
################# cluster nodes info: #################
23e1871c4e1dc1047ce567326e74a6194589146c 127.0.0.1:10005@20005 slave 64e634307bdc339b503574f5a77f1b156c021358 0 1575968426000 76 connected
026f0179631f50ca858d46c2b2829b3af71af2c8 127.0.0.1:10004@20004 slave 6164025849a8ff9297664fc835bc851af5004f61 0 1575968422619 75 connected
6164025849a8ff9297664fc835bc851af5004f61 127.0.0.1:10001@20001 myself,master - 0 1575968426000 75 connected 0-5460
9f265545ebb799d2773cfc20c71705cff9d733ae 127.0.0.1:10006@20006 slave 8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 0 1575968425000 77 connected
8b75325c59a7242344d0ebe5ee1e0068c66ffa2a 127.0.0.1:10003@20003 master - 0 1575968427626 77 connected 10923-16383
64e634307bdc339b503574f5a77f1b156c021358 127.0.0.1:10002@20002 master - 0 1575968426000 76 connected 5461-10922

[root@JD redis_install]#

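That is not quite the end of the story: after the scale-in, every node in the cluster must successfully run cluster forget master_node_id (and slave_node_id).
Otherwise the other nodes still hold heartbeat information about node 10007, and after about a minute the evicted node 10007 (and its slave 10008) is added back to the cluster.
This is exactly the odd problem I ran into at first: because I had not run cluster forget on the slave nodes of the shrunken cluster, the removed nodes kept getting added back.
Reference: http://www.redis.cn/commands/cluster-forget.html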
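
Since a single node that skipped cluster forget is enough to bring the removed nodes back, it can help to check every remaining node afterwards. The sketch below takes the raw CLUSTER NODES text collected from each node and reports which nodes still list a removed ID. The function name and the shortened node IDs in the sample are made up for illustration.

```python
def nodes_still_remembering(views, removed_ids):
    """Return the addresses whose CLUSTER NODES text still lists a removed node ID."""
    removed = set(removed_ids)
    stale = []
    for address, text in views.items():
        # The first field of every CLUSTER NODES line is the node ID.
        known_ids = {line.split()[0] for line in text.splitlines() if line.strip()}
        if known_ids & removed:
            stale.append(address)
    return sorted(stale)


views = {
    "127.0.0.1:10001": "aaa 127.0.0.1:10001@20001 myself,master - 0 0 1 connected 0-5460",
    "127.0.0.1:10004": (
        "bbb 127.0.0.1:10004@20004 myself,slave aaa 0 0 1 connected\n"
        "xxx 127.0.0.1:10007@20007 master - 0 0 9 connected"
    ),
}
print(nodes_still_remembering(views, ["xxx", "yyy"]))
```

In this made-up sample only the slave on port 10004 still knows the removed master, which mirrors the situation described above.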

The complete implementation is as follows:

import os
import time

import redis

def create_redis_cluster(list_master_node,list_slave_node):
    print('################# flush master/slave slots #################')
    # reset every instance so that it starts outside any cluster; only the masters are flushed
    for node in list_master_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')
    for node in list_slave_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        #current_conn.execute_command('flushall')
        current_conn.execute_command('cluster reset')
    print('################# create cluster #################')
    master_nodes = ''
    for node in list_master_node:
        master_nodes = master_nodes + node["host"] + ':' + str(node["port"]) + ' '
    command = "redis-cli --cluster create {0}  -a ****** --cluster-yes".format(master_nodes)
    print(command)
    # os.system returns the exit status of the command; it does not raise on failure
    msg = os.system(command)
    print(msg)
    time.sleep(5)
    print('################# add slave nodes #################')
    counter = 0
    # the i-th slave in list_slave_node becomes the slave of the i-th master
    for node in list_master_node:
        current_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_slave_node = list_slave_node[counter]["host"] + ':' + str(list_slave_node[counter]["port"])
        myid = current_conn.cluster('myid')
        # the slave node comes first, the master node second
        command = "redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ". format(current_slave_node,current_master_node,myid)
        print(command)
        msg = os.system(command)
        counter = counter + 1
        print(msg)
    # show cluster nodes info
    time.sleep(10)
    print("################# cluster nodes info: #################")
    cluster_nodes = current_conn.execute_command('cluster nodes')
    print(cluster_nodes)

# Return the number of slots each original master has to migrate out
# when the cluster is scaled out by n masters
def get_migrated_slot(list_master_node,n):
    migrated_slot_count = 16384 // len(list_master_node) - 16384 // (len(list_master_node) + n)
    return migrated_slot_count

def redis_cluster_expansion(list_master_node,dict_master_node,dict_slave_node):
    new_master_node =  dict_master_node["host"] + ':' + str(dict_master_node["port"])
    new_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])
    print("#########################cleanup instance#################################")
    new_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)
    new_master_conn.execute_command('flushall')
    new_master_conn.execute_command('cluster reset')
    new_master_id = new_master_conn.cluster('myid')
    new_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)
    new_slave_conn.execute_command('cluster reset')
    new_slave_id = new_slave_conn.cluster('myid')
    #new_slave_conn.execute_command('slaveof no one')
    # Check whether the nodes to be added already belong to the current cluster.
    # If a node already belongs to the cluster and owns no slots, either evict it first
    # with cluster forget nodeid, or abort with a warning, whichever you prefer.
    # Connect to any node of the cluster
    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)
    dict_node_info = cluster_node_conn.cluster('nodes')
    '''dict_node_info format example :
    {
    '127.0.0.1:10008@20008': {'node_id': '1d10c3ce3b9b7f956a26122980827fe6ce623d22', 'flags': 'master', 'master_id': '-','last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
    '127.0.0.1:10002@20002': {'node_id': '64e634307bdc339b503574f5a77f1b156c021358', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '7', 'slots': [['5461', '10922']], 'connected': True},
    '127.0.0.1:10001@20001': {'node_id': '6164025849a8ff9297664fc835bc851af5004f61', 'flags': 'myself,master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599438000', 'epoch': '6', 'slots': [['0', '5460']], 'connected': True},
    '127.0.0.1:10007@20007': {'node_id': '307f589ec7b1eb7bd65c680527afef1e30ce2303', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599443599', 'epoch': '5', 'slots': [], 'connected': True},
    '127.0.0.1:10005@20005': {'node_id': '23e1871c4e1dc1047ce567326e74a6194589146c', 'flags': 'slave', 'master_id': '64e634307bdc339b503574f5a77f1b156c021358', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599441000', 'epoch': '7', 'slots': [], 'connected': True},
    '127.0.0.1:10004@20004': {'node_id': '026f0179631f50ca858d46c2b2829b3af71af2c8', 'flags': 'slave', 'master_id': '6164025849a8ff9297664fc835bc851af5004f61', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599440000', 'epoch': '6', 'slots': [], 'connected': True},
    '127.0.0.1:10006@20006': {'node_id': '9f265545ebb799d2773cfc20c71705cff9d733ae', 'flags': 'slave', 'master_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442000', 'epoch': '8', 'slots': [], 'connected': True},
    '127.0.0.1:10003@20003': {'node_id': '8b75325c59a7242344d0ebe5ee1e0068c66ffa2a', 'flags': 'master', 'master_id': '-', 'last_ping_sent': '0', 'last_pong_rcvd': '1575599442599', 'epoch': '8', 'slots': [['10923', '16383']], 'connected': True}
    }
    '''
    dict_master_node_in_cluster = 0
    dict_slave_node_in_cluster = 0
    for key_node in dict_node_info:
        if new_master_node in key_node:
            dict_master_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' +new_master_node + ' already exists in the cluster and owns slots, aborting......')
                return
        if new_slave_node in key_node:
            dict_slave_node_in_cluster = 1
            if len(dict_node_info[key_node]['slots']) > 0:
                print('error: ' +new_slave_node + ' already exists in the cluster and owns slots, aborting......')
                return
    if dict_master_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_master_node + ' already exists in the cluster, cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_master_id)
            key_node_conn.execute_command(forget_command)
    if dict_slave_node_in_cluster == 1:
        for master_node in list_master_node:
            key_node_conn = redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)
            print('warning: ' + new_slave_node + ' already exists in the cluster, cluster forget it......')
            forget_command = 'cluster forget {0}'.format(new_slave_id)
            key_node_conn.execute_command(forget_command)
    print("#########################add node into cluster#################################")
    try:
        cluster_node = list_master_node[0]["host"] + ':' + str(list_master_node[0]["port"])
        # the node to be added comes first; the second node is any node already in the cluster
        add_node_command = " redis-cli --cluster add-node {0} {1}  -a ****** ".format(new_master_node,cluster_node)
        print(add_node_command)
        print(os.system(add_node_command))
        # give every node in the cluster time to meet the new master
        time.sleep(20)
        # the slave node comes first, the master node second
        add_node_command = " redis-cli --cluster add-node {0} {1} --cluster-slave --cluster-master-id {2} -a ****** ". format(new_slave_node,new_master_node,new_master_id)
        print(add_node_command)
        print(os.system(add_node_command))
        time.sleep(20)
    except Exception as e:
        print('add new node error,the reason is:')
        print(e)
    print("#########################reshard slots#################################")
    migrated_slot_count = get_migrated_slot(list_master_node,1)
    for node in list_master_node:
        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)
        current_master_node = node["host"] + ':' + str(node["port"])
        current_master_node_id = current_master_conn.cluster('myid')
        # example: scaling out from 3 masters to 4, each original master migrates 1365 slots
        try:
            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots {3} --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1 '''. format(current_master_node,current_master_node_id,new_master_id,migrated_slot_count)
            print('############################ execute reshard #########################################')
            print(command)
            msg = os.system(command)
            # wait for this reshard to settle before starting the next one
            time.sleep(20)
        except Exception as e:
            print('reshard slots error,the reason is:')
            print(e)
    print("################# cluster nodes info: #################")
    cluster_nodes = new_master_conn.execute_command('cluster nodes')
    print(cluster_nodes)

def redis_cluster_shrinkage(list_master_node,list_slave_node,dict_master_node,dict_slave_node):

    # 判断新增的节点是否归属于当前集群，

    # 如果不归属当前集群，则退出

    cluster_node_conn = redis.StrictRedis(host=list_master_node[0]["host"], port=list_master_node[0]["port"], password=list_master_node[0]["password"],decode_responses=True)

    dict_node_info = cluster_node_conn.cluster('nodes')

    removed_master_node = dict_master_node["host"] + ':' + str(dict_master_node["port"])+'@'+str(dict_master_node["port"]+10000)

    removed_slave_node = dict_slave_node["host"] + ':' + str(dict_slave_node["port"])+'@'+str(dict_slave_node["port"]+10000)

    if not removed_master_node in dict_node_info.keys():

        print('Error:'+ str(removed_master_node) +' not in cluster,exiting')

        return

    if not removed_slave_node in dict_node_info.keys():

        print('Error:' + str(removed_slave_node) + ' not in cluster,exiting')

        return

    removed_master_conn = redis.StrictRedis(host=dict_master_node["host"], port=dict_master_node["port"], password=dict_master_node["password"], decode_responses=True)

    removed_master_id = removed_master_conn.cluster('myid')

    removed_slave_conn = redis.StrictRedis(host=dict_slave_node["host"], port=dict_slave_node["port"], password=dict_slave_node["password"], decode_responses=True)

    removed_slave_id = removed_slave_conn.cluster('myid')

    for node in list_master_node:

        current_master_conn = redis.StrictRedis(host=node["host"], port=node["port"], password=node["password"], decode_responses=True)

        current_master_node = node["host"] + ':' + str(node["port"])

        current_master_node_id = current_master_conn.cluster('myid')

        '''

        Shrinking from 4 masters to 3: return the slots evenly to the three remaining masters

        '''

        try:

            command = r'''redis-cli -a ****** --cluster reshard {0} --cluster-from {1} --cluster-to {2} --cluster-slots 1365 --cluster-yes --cluster-timeout 50000 --cluster-pipeline 10000   --cluster-replace  >/dev/null 2>&1 '''.\

                format(current_master_node, removed_master_id, current_master_node_id)

            print('############################ execute reshard #########################################')

            print(command)

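            msg = os.system(command)

            # os.system does not raise when the command fails, so check the exit status

            if msg != 0:

                print('reshard slots failed, exit status: ' + str(msg))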

            time.sleep(10)

        except Exception as e:

            print('reshard slots error,the reason is:')

            print(e)

    removed_master_conn.execute_command('cluster reset')

    removed_slave_conn.execute_command('cluster reset')

    for master_node in list_master_node:

        master_node_conn =  redis.StrictRedis(host=master_node["host"], port=master_node["port"],password=master_node["password"], decode_responses=True)

        forget_master_command = 'cluster forget {0}'.format(removed_master_id)

        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)

        print(str(master_node)+ '--->' + forget_master_command)

        print(str(master_node)+ '--->' + forget_slave_command)

        master_node_conn.execute_command(forget_master_command)

        master_node_conn.execute_command(forget_slave_command)

    for slave_node in list_slave_node:

        slave_node_conn = redis.StrictRedis(host=slave_node["host"], port=slave_node["port"], password=slave_node["password"], decode_responses=True)

        forget_master_command = 'cluster forget {0}'.format(removed_master_id)

        forget_slave_command = 'cluster forget {0}'.format(removed_slave_id)

        print(str(slave_node)+ '--->' + forget_master_command)

        print(str(slave_node)+ '--->' + forget_slave_command)

        slave_node_conn.execute_command(forget_master_command)

        slave_node_conn.execute_command(forget_slave_command)

    print("################# cluster nodes info: #################")

    cluster_nodes = cluster_node_conn.execute_command('cluster nodes')

    print(cluster_nodes)

if __name__ == '__main__':

    # master

    node_1 = {'host': '127.0.0.1', 'port': 10001, 'password': '******'}

    node_2 = {'host': '127.0.0.1', 'port': 10002, 'password': '******'}

    node_3 = {'host': '127.0.0.1', 'port': 10003, 'password': '******'}

    # slave

    node_4 = {'host': '127.0.0.1', 'port': 10004, 'password': '******'}

    node_5 = {'host': '127.0.0.1', 'port': 10005, 'password': '******'}

    node_6 = {'host': '127.0.0.1', 'port': 10006, 'password': '******'}

    # The number of master nodes and slave nodes must be the same

    list_master_node = [node_1, node_2, node_3]

    list_slave_node = [node_4, node_5, node_6]

    # Automated cluster creation

    #create_redis_cluster(list_master_node,list_slave_node)

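    # Automated expansion: 10007 joins as a new master, 10008 as its slave

    new_master_node = {'host': '127.0.0.1', 'port': 10007, 'password': '******'}

    new_slave_node = {'host': '127.0.0.1', 'port': 10008, 'password': '******'}

    redis_cluster_expansion(list_master_node,new_master_node,new_slave_node)

    # Automated shrinkage: remove 10007 and its slave 10008

    #redis_cluster_shrinkage(list_master_node,list_slave_node,new_master_node,new_slave_node)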
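The reshard steps above rely on two details that are easy to miss: where the 1365 slots per master come from, and the fact that os.system only returns an exit status rather than raising. The sketch below is one possible way to make both explicit. The helper names (slots_per_existing_master, build_reshard_argv, run_reshard) are hypothetical and not part of the original script; the redis-cli flags are the same ones used in the commands above.

```python
import subprocess

TOTAL_SLOTS = 16384  # a Redis Cluster always has 16384 hash slots


def slots_per_existing_master(master_count_before):
    """Slots each existing master gives to ONE new master (rounded down)."""
    if master_count_before < 1:
        raise ValueError("need at least one existing master")
    # Share the new master should own once the cluster has one more master.
    target_for_new_master = TOTAL_SLOTS // (master_count_before + 1)
    # Split that share evenly across the existing masters.
    return target_for_new_master // master_count_before


def build_reshard_argv(entry_node, from_id, to_id, slot_count, password):
    """Build the reshard command as an argument list, so no shell quoting is needed."""
    return [
        "redis-cli", "-a", password,
        "--cluster", "reshard", entry_node,
        "--cluster-from", from_id,
        "--cluster-to", to_id,
        "--cluster-slots", str(slot_count),
        "--cluster-yes",
        "--cluster-timeout", "50000",
        "--cluster-pipeline", "10000",
        "--cluster-replace",
    ]


def run_reshard(argv):
    """Run redis-cli and return True only when it exits with status 0."""
    try:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        # redis-cli is not on PATH
        return False
    return result.returncode == 0
```

For the 3-master cluster in this article, slots_per_existing_master(3) gives 1365, which matches the value used in both the expansion and the shrinkage commands. Because of the rounding, the new master ends up with 3 * 1365 = 4095 slots, so moving 1365 slots back to each of the three masters empties it completely.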

Reference: https://www.cnblogs.com/zhoujinyi/p/11606935.html
