redis analyst (3)- auto deploy redis cluster

Before adopting Redis Cluster, the first thing to do is to automate its deployment. The basic flow is as follows:

General installation flow:

a. prepare the hardware for all nodes
b. delete all old configuration (such as nodes.conf) and persistence files (such as rdb/aof)
c. automatically install all redis packages, along with ruby (needed by redis-trib.rb)
d. change every node's configuration to enable cluster mode
e. start up all nodes
f. use redis-trib.rb to create the cluster

Cluster rebuild flow:

a. flush the data on all nodes
b. run cluster reset on all nodes
c. run redis-trib.rb to create the cluster

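Below are several key points for automating a Redis cluster deployment:

(1) Authentication:
The official cluster management tool, /opt/redis/bin/redis-trib.rb, does not support passwords. The approach most articles describe is therefore to prepare a batch of redis nodes without authentication, create the cluster, and then set the password on every node one by one and persist it: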

config set 
config rewrite
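
A minimal sketch of that per-node step, assuming the password is applied through the requirepass and masterauth settings (the text above only shows config set and config rewrite, so which settings to change is an assumption). The function name and the sample node list are hypothetical; the code only builds the command lines so they can be reviewed before anything is run.

```python
# Sketch: build the redis-cli commands that set a password on each node
# after the cluster has been created. Which settings to change
# (requirepass, masterauth) is an assumption, not taken from the article.
REDIS_CLI = "/opt/redis/bin/redis-cli"


def password_commands(hosts, port, password):
    """Return one argv list per command, node by node."""
    commands = []
    for host in hosts:
        base = [REDIS_CLI, "-h", host, "-p", str(port)]
        commands.append(base + ["config", "set", "masterauth", password])
        commands.append(base + ["config", "set", "requirepass", password])
        # once requirepass is set, later commands on this node need -a
        commands.append(base + ["-a", password, "config", "rewrite"])
    return commands


if __name__ == "__main__":
    for argv in password_commands(["10.224.2.141", "10.224.2.142"], 8690, "P@ss123"):
        print(" ".join(argv))
```

Returning argv lists rather than shell strings avoids quoting problems when the password contains shell metacharacters.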

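The problem with automating it this way is that on a redeploy, when redis-trib.rb has to be run again, the passwords must first be removed; the same is true for its other functions such as fix and check, which is tedious. So for this automation step:
we can keep the conventional order: set the password first, then call redis-trib.rb to create the cluster. There are two ways to do this:
a) Modify redis-trib.rb directly and add password support:
Reference:
https://trodzen.wordpress.com/2017/02/09/redis-cluster-with-passwords/

b) Modify the ruby lib that redis-trib.rb calls and add the password there, which solves the problem once and for all: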

/usr/lib/ruby/gems/1.8/gems/redis-3.3.3/lib/redis/client.rb

    DEFAULTS = {
      :url => lambda { ENV["REDIS_URL"] },
      :scheme => "redis",
      :host => "127.0.0.1",
      :port => 6379,
      :path => nil,
      :timeout => 5.0,
      :password => "{{password}}",
      :db => 0,
      :driver => nil,
      :id => nil,
      :tcp_keepalive => 0,
      :reconnect_attempts => 1,
      :inherit_socket => false
    }

(2) Rebuilding the cluster
Creating the cluster again directly on top of an existing cluster fails:

echo yes | /opt/redis/bin/redis-trib.rb create --replicas 1 10.224.2.141:8690 10.224.2.142:8690 10.224.2.143:8690 10.224.2.144:8690 10.224.2.145:8690 10.224.2.146:8690
[ERR] Node 10.224.2.141:8690 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0.

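As the message says, there are two possible cases:
a) The node already holds data. In this case, clear the data: /opt/redis/bin/redis-cli -p 8690 -a P@ss123 flushall
b) The node is already part of a cluster. In this case, reset the cluster state: /opt/redis/bin/redis-cli -p 8690 -a P@ss123 cluster reset

You may also run into this error:

/opt/redis/bin/redis-cli -h 10.224.2.146 -p 8690 -a P@ss123 cluster reset
["ERR CLUSTER RESET can't be called with master nodes containing keys\n", '\n']

So the two commands above must be run in that order: flushall first, then cluster reset.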
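
The ordering can be sketched as a small helper. This is an illustration under assumptions: the function names are made up, and the command runner is passed in as a parameter so the logic can be tried without a live node.

```python
# Sketch: empty a node, then reset its cluster state, in that order.
# `run` is a hypothetical callable that executes an argv list and returns
# the command's text output; it is injectable for dry runs and tests.
import subprocess

REDIS_CLI = "/opt/redis/bin/redis-cli"


def run_command(argv):
    """Default runner: execute argv and return stdout plus stderr as text."""
    done = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return done.stdout


def reset_node(host, port, password, run=run_command):
    """Run flushall, then cluster reset; return True if the reset was accepted."""
    base = [REDIS_CLI, "-h", host, "-p", str(port), "-a", password]
    run(base + ["flushall"])
    output = run(base + ["cluster", "reset"])
    # this text appears in the error when the master still holds keys
    return "containing keys" not in output
```

A caller can loop over all nodes and retry any node for which reset_node returns False.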

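(3) Cleaning up data on redeploy:
If you reinstall the package and start the nodes right away, some old data will still be there: redis cluster may have left rdb/aof files on disk, and they are read at startup. Reinstalling into the same directory with an unchanged configuration therefore loads the old data, so those files need to be removed. And since this is a fresh deployment, the nodes.conf file that stores the cluster state has to be removed as well: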

rm -rf /etc/redis/nodes.conf
rm -rf /opt/redis/dump.rdb
rm -rf /opt/redis/appendonly.aof
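
The same cleanup can be done from a deployment script. The sketch below assumes the three paths shown above; in another installation they depend on the dir, dbfilename and cluster-config-file settings. The function name is hypothetical.

```python
# Sketch: remove leftover cluster state and persistence files before a
# redeploy. The default paths are the ones used in this article.
import os

STALE_FILES = (
    "/etc/redis/nodes.conf",
    "/opt/redis/dump.rdb",
    "/opt/redis/appendonly.aof",
)


def remove_stale_files(paths=STALE_FILES):
    """Delete each path that exists as a file; return the removed paths."""
    removed = []
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Returning the removed paths makes it easy to log what a redeploy actually deleted. The servers should be stopped first, otherwise a running node can write the files again.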

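(4) Log rotation
An automated deployment is meant to run for a long time, so the logs need to be rotated to keep them from growing without bound.

1) Specify the log file in the redis configuration file: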

# Do not make the level too verbose or the log grows too fast; the default (notice) is enough:
loglevel notice
logfile "/var/redis/log/redis.log"

2) Create the rotate configuration:
Create a file under /etc/logrotate.d/, for example redis_log_rotate

# Rotate daily, keep 15 days.
/var/redis/log/redis*.log {
    daily
    rotate 15  
    copytruncate
    delaycompress
    compress
    notifempty
    missingok
}

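(5) Scripts to start, stop and check the redis service, and automatic restart

You need a single management script to handle starting, stopping and checking redis, for example: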


#!/bin/sh
# Manage the local Redis server: start | stop | status

for ARG in "$@" $ARGS
do
    case $ARG in
    start)
        echo "##################begin to start redis server##################"

        # set OS parameters first if needed
        #sh /opt/redis/bin/set_os_parms.sh

        # start redis server
        /opt/redis/bin/redis-server /etc/redis/wbx-redis.conf
        echo "##################complete to start redis server##################"
        ;;
    stop)
        echo "##################begin to stop redis server##################"
        # collect the PIDs of all redis-server processes on one line
        dtpid=$(ps -efw --width 1024 | grep redis-server | grep -v grep | awk '{print $2}')
        dtpid=$(echo $dtpid)
        if [ -z "$dtpid" ]
        then
            echo "INFO: Redis Server is not running."
            echo "##################complete to stop redis server##################"
            exit 0
        else
            /opt/redis/bin/redis-shutdown wbx-redis
            echo "##################complete to stop redis server##################"
        fi
        ;;
    status)
        echo "##################begin to check redis server status##################"
        dtpid=$(ps -efw --width 1024 | grep redis-server | grep -v grep | awk '{print $2}')
        dtpid=$(echo $dtpid)
        if [ -n "$dtpid" ]
        then
            echo "[INFO] Redis Server($dtpid) is started."
            echo "##################complete to check redis server status##################"
        else
            echo "[INFO] Redis Server is not running."
            echo "##################complete to check redis server status##################"
            exit 1
        fi
        ;;
    *)
        echo "Usage: $0 (start|stop|status)"
        cat <<EOF

start       - start Redis Server
stop        - stop  Redis Server
status      - check Redis Server status

EOF
        ;;
    esac
done

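Once the script is written, it can be hooked up to a supervisor daemon so that the redis service is brought back up automatically after it dies. This is most practical for a redis cluster that is used purely as a cache.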
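
As a rough illustration of such a supervisor, here is a tiny watchdog loop. Everything in it is hypothetical: is_running and start stand for wrappers around the management script's status and start actions, and a real deployment would more likely rely on an existing process supervisor.

```python
# Sketch: a minimal watchdog loop. `is_running` and `start` are
# hypothetical callables, for example wrappers around the management
# script's "status" and "start" actions.
import time


def watchdog(is_running, start, interval=5.0, rounds=None, sleep=time.sleep):
    """Check the service every `interval` seconds and restart it when down.

    `rounds` limits the number of checks (None means run forever).
    Returns how many times the service was restarted.
    """
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        if not is_running():
            start()
            restarts += 1
        done += 1
        sleep(interval)
    return restarts
```

The rounds and sleep parameters exist only so the loop can be exercised in a test without waiting.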

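(6) Building the create cluster command:

In the end we need a cluster create command, but because the deployment is automated, the command has to be assembled dynamically, for example:

/opt/redis/bin/redis-trib.rb create --replicas 1 10.224.2.141:8690 10.224.2.142:8690 10.224.2.143:8690 10.224.2.144:8690 10.224.2.145:8690 10.224.2.146:8690

We do not know in advance how many machines there are, or rather, it is best not to care how many nodes there are: it is enough that the node count divides evenly by the replica ratio (for example, with 1 master and 1 slave, the number of machines must be a multiple of 2). A script like the following can assemble the create cluster command dynamically: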

#!/usr/bin/python
# The {{...}} placeholders are presumably substituted by the deployment
# tool before this script runs; they are not valid Python by themselves.
import os
import sys

print({{system.boxList}})


def check_cluster(host_port):
    check_command = "/opt/redis/bin/redis-trib.rb check " + host_port
    result = os.popen(check_command).readlines()
    print(result)
    return result[-1] == "[OK] All 16384 slots covered.\n"


def destroy_cluster():
    box_list = {{system.boxList}}
    for box in box_list:
        i = 0
        # retry: new keys may be written between flushall and cluster reset
        while i < 100:
            flush_command = "/opt/redis/bin/redis-cli -h " + box["ip"] + " -p {{port}} -a {{password}} flushall"
            print(flush_command)
            result = os.popen(flush_command).readlines()
            print(result)
            cluster_reset_command = "/opt/redis/bin/redis-cli -h " + box["ip"] + " -p {{port}} -a {{password}} cluster reset"
            print(cluster_reset_command)
            result = os.popen(cluster_reset_command).readlines()
            print(result)
            if "containing keys" not in " ".join(str(x) for x in result):
                break
            print("##########try again....times: " + str(i))
            i = i + 1


def stop_servers():
    print("##########stop_servers....")
    box_list = {{system.boxList}}
    for box in box_list:
        stop_command = ""  # stop command needs to be filled in here
        print(stop_command)
        result = os.popen(stop_command).readlines()
        print(result)


def start_servers():
    print("##########start_servers....")
    box_list = {{system.boxList}}
    for box in box_list:
        start_command = ""  # start command needs to be filled in here
        print(start_command)
        result = os.popen(start_command).readlines()
        print(result)


def clean_servers():
    print("##########clean servers' dump file....")
    box_list = {{system.boxList}}
    for box in box_list:
        # as written this runs on the local machine for every box
        clean_command = "rm -rf /opt/redis/*.rdb"
        print(clean_command)
        result = os.popen(clean_command).readlines()
        print(result)


def create_cluster():
    box_list = {{system.boxList}}
    new_box_list = []
    for box in box_list:
        # nothing to do if any node already reports a healthy cluster
        if check_cluster(box["ip"] + ":{{port}}"):
            return True
        new_box_list.append(box["ip"] + ":{{port}}")

    print("##########check complete....")
    print("##########begin to execute create cluster command....")

    create_command = "echo yes | /opt/redis/bin/redis-trib.rb create --replicas 1 " + " ".join(new_box_list)
    print(create_command)
    result = os.popen(create_command).readlines()[-1]
    print(result)
    return "ERR" not in result


print("##########clean all servers...")
stop_servers()
clean_servers()
start_servers()
print("##########destroy old cluster...")
destroy_cluster()
print("##########create new cluster....")
if create_cluster():
    print("##########success to complete create cluster....")
else:
    print("##########fail to complete create cluster....")
    sys.exit(1)
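
The command assembly on its own, with the node-count rule made explicit, might look like the sketch below. It is an illustration, not part of the deployment script: the function name is made up, and the check of at least 3 masters mirrors what redis-trib.rb itself enforces.

```python
# Sketch: assemble the redis-trib.rb create command from a host list and
# reject node counts that do not fit the replica ratio.
REDIS_TRIB = "/opt/redis/bin/redis-trib.rb"


def build_create_command(hosts, port, replicas=1):
    """Return the shell command string, or raise ValueError on a bad node count."""
    group = replicas + 1                      # one master plus its replicas
    if len(hosts) % group != 0:
        raise ValueError("node count must be a multiple of %d" % group)
    if len(hosts) // group < 3:
        raise ValueError("at least %d nodes are required" % (3 * group))
    nodes = " ".join("%s:%d" % (host, port) for host in hosts)
    # "echo yes |" answers the interactive confirmation prompt
    return "echo yes | %s create --replicas %d %s" % (REDIS_TRIB, replicas, nodes)


if __name__ == "__main__":
    hosts = ["10.224.2.14%d" % i for i in range(1, 7)]
    print(build_create_command(hosts, 8690))
```

Failing early on a bad node count gives a clearer message than letting the create step fail halfway through a deployment.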


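For interactive commands, you can pipe in echo yes, for example: create_command = "echo yes | /opt/redis/bin/redis-trib.rb create --replicas 1 " + " ".join(new_box_list)

Also, flushall blocks when there is too much data, which makes the cluster fail over to the slave; once the slave becomes master, the same thing happens again. So the script first stops all machines and deletes the rdb files to make sure all data is gone, and then starts them again. This not only guarantees the data is cleared but also that every machine is up.

In addition, because a client may insert new data right after the flush, the script retries flushall and cluster reset up to 100 times so that the reset does not end with: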


ERR CLUSTER RESET can't be called with master nodes containing keys

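(7) Settings that should be configurable:
redis has a great many settings; some of them are best exposed as configurable, for example:

a) port and password: for security
b) loglevel: production and test environments can use different levels
c) metric content: monitoring is usually implemented through the info command, so the monitored items should either all be provided or be configurable
d) maxmemory: machines have different amounts of memory, so this needs a different value per machine.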
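
One way to expose these settings is to render that part of redis.conf from parameters. The sketch below is only an illustration: the directive names are standard redis settings, while the function name, the defaults and the choice to reuse one password for requirepass and masterauth are assumptions.

```python
# Sketch: render the operator-facing part of redis.conf from parameters.
# Function name, defaults and sample values are illustrative only.
def render_config(port, password, loglevel="notice", maxmemory="4gb"):
    """Return redis.conf lines for the settings exposed to operators."""
    lines = [
        "port %d" % port,
        "cluster-enabled yes",
        "requirepass %s" % password,
        "masterauth %s" % password,
        "loglevel %s" % loglevel,
        "maxmemory %s" % maxmemory,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_config(8690, "P@ss123", loglevel="verbose", maxmemory="8gb"))
```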

After everything finally succeeds:

/opt/redis/bin/redis-trib.rb create --replicas 1 10.224.2.141:8690 10.224.2.142:8690 10.224.2.143:8690 10.224.2.144:8690 10.224.2.145:8690 10.224.2.146:8690
[OK] All 16384 slots covered.


redis analyst (2)- use cluster

1 At least 6 nodes are required (3 masters and 3 slaves):

If you use the following configuration:

# Settings
PORT=30000
TIMEOUT=2000
NODES=4   # total number of nodes, masters included
REPLICAS=1

the following error is reported:

[root@wbxperf001 create-cluster]# ./create-cluster start
Starting 30001
Starting 30002
Starting 30003
Starting 30004
[root@wbxperf001 create-cluster]# ./create-cluster create
*** ERROR: Invalid configuration for cluster creation.
*** Redis Cluster requires at least 3 master nodes.
*** This is not possible with 4 nodes and 1 replicas per node.
*** At least 6 nodes are required.

root cause:
redis-trib.rb

    def check_create_parameters
        masters = @nodes.length/(@replicas+1)
        if masters < 3
            puts "*** ERROR: Invalid configuration for cluster creation."
            puts "*** Redis Cluster requires at least 3 master nodes."
            puts "*** This is not possible with #{@nodes.length} nodes and #{@replicas} replicas per node."
            puts "*** At least #{3*(@replicas+1)} nodes are required."
            exit 1
        end
    end
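
The arithmetic behind that check, restated in Python as a small sketch (the function names are made up for this example):

```python
# Sketch: the same arithmetic as check_create_parameters, in Python.
def masters_possible(nodes, replicas):
    """Number of masters that can be formed (integer division)."""
    return nodes // (replicas + 1)


def minimum_nodes(replicas):
    """Smallest node count that still yields 3 masters."""
    return 3 * (replicas + 1)


if __name__ == "__main__":
    print(masters_possible(4, 1))   # 2, fewer than the 3 masters required
    print(minimum_nodes(1))         # 6
```

With NODES=4 and REPLICAS=1 only 2 masters can be formed, which is why the create step is rejected.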

2 Kill a master and its slave takes over; when the old master is started again, it comes back as a slave

[root@wbxperf001 ~]# cluster nodes
1fbaae0a5e98f2f4bb299139ffd126811d68fbbe 127.0.0.1:30005 slave b067d0613418238a688f18ebd6a8e3612c1bb54b 0 1496327104008 5 connected
d8fd8928d10e11c9eddb0e80f40a5345f092d40c 127.0.0.1:30004 slave 7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 0 1496327104008 4 connected
c5a97527cdaa550d89e7282c0e888ea3b0d53a29 127.0.0.1:30006 slave b934ab47ef37a374498067f864a067a36674debc 0 1496327103909 6 connected
b934ab47ef37a374498067f864a067a36674debc 127.0.0.1:30003 myself,master - 0 0 3 connected 10923-16383
7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 127.0.0.1:30001 master - 0 1496327104009 1 connected 0-5460
b067d0613418238a688f18ebd6a8e3612c1bb54b 127.0.0.1:30002 master - 0 1496327104008 2 connected 5461-10922

[root@wbxperf001 ~]# ps -ef|grep redis
root 9590 1 0 06:22 ? 00:01:33 ../../src/redis-server *:30001 [cluster]
root 9592 1 0 06:22 ? 00:01:31 ../../src/redis-server *:30002 [cluster]
root 9598 1 0 06:22 ? 00:01:32 ../../src/redis-server *:30003 [cluster]
root 9602 1 0 06:22 ? 00:01:32 ../../src/redis-server *:30004 [cluster]
root 9606 1 0 06:22 ? 00:01:27 ../../src/redis-server *:30005 [cluster]
root 9610 1 0 06:22 ? 00:01:31 ../../src/redis-server *:30006 [cluster]

[root@wbxperf001 ~]# kill -1 9592

[root@wbxperf001 ~]# cluster nodes
1fbaae0a5e98f2f4bb299139ffd126811d68fbbe 127.0.0.1:30005 master - 0 1496327180237 7 connected 5461-10922
d8fd8928d10e11c9eddb0e80f40a5345f092d40c 127.0.0.1:30004 slave 7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 0 1496327180237 4 connected
c5a97527cdaa550d89e7282c0e888ea3b0d53a29 127.0.0.1:30006 slave b934ab47ef37a374498067f864a067a36674debc 0 1496327180237 6 connected
b934ab47ef37a374498067f864a067a36674debc 127.0.0.1:30003 myself,master - 0 0 3 connected 10923-16383
7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 127.0.0.1:30001 master - 0 1496327180237 1 connected 0-5460
b067d0613418238a688f18ebd6a8e3612c1bb54b 127.0.0.1:30002 master,fail - 1496327177426 1496327177227 2 disconnected

After 30002 is started again, it stays a slave.

So if you want it to be master again rather than remain a slave, use cluster failover:


[root@wbxperf001 src]# ./redis-cli  -p 30002
127.0.0.1:30002> cluster failover

127.0.0.1:30002> cluster nodes
c5a97527cdaa550d89e7282c0e888ea3b0d53a29 127.0.0.1:30006 slave b934ab47ef37a374498067f864a067a36674debc 0 1496328987578 6 connected
b934ab47ef37a374498067f864a067a36674debc 127.0.0.1:30003 master - 0 1496328987578 3 connected 10923-16383
7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 127.0.0.1:30001 master - 0 1496328987578 1 connected 0-5460
b067d0613418238a688f18ebd6a8e3612c1bb54b 127.0.0.1:30002 myself,master - 0 0 8 connected 5461-10922
d8fd8928d10e11c9eddb0e80f40a5345f092d40c 127.0.0.1:30004 slave 7cf0fd6e2dce793161f0764ff7ef6e7f83b54c05 0 1496328987578 4 connected
1fbaae0a5e98f2f4bb299139ffd126811d68fbbe 127.0.0.1:30005 slave b067d0613418238a688f18ebd6a8e3612c1bb54b 0 1496328987578 8 connected
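
To check roles from a script instead of reading the listing by eye, the cluster nodes output can be parsed. The sketch below assumes the field layout seen in the listings above (node id, address, flags, ...); the function name is hypothetical.

```python
# Sketch: read the role of every node from CLUSTER NODES output.
# Field layout assumed: id, address, flags, master id, ...
def node_roles(cluster_nodes_text):
    """Map "ip:port" to "master" or "slave", skipping failed nodes."""
    roles = {}
    for line in cluster_nodes_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        address = fields[1].split("@")[0]    # newer versions append @cport
        flags = fields[2].split(",")
        if "fail" in flags:
            continue
        if "master" in flags:
            roles[address] = "master"
        elif "slave" in flags:
            roles[address] = "slave"
    return roles
```

A deployment script could call this before and after cluster failover to confirm that 127.0.0.1:30002 is reported as master again.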

3 When one of the masters is down, whether the whole cluster goes down depends on the parameter cluster-require-full-coverage

cluster-require-full-coverage <yes/no>: If this is set to yes, as it is by default, the cluster stops accepting writes if some percentage of the key space is not covered by any node. If the option is set to no, the cluster will still serve queries even if only requests about a subset of keys can be processed.

4 What does Cluster not support?
(1) multi-key operations on keys that belong to different hash slots.
(2) select db; only db 0 can be used:

Redis Cluster does not support multiple databases like the stand alone version of Redis. There is just database 0 and the SELECT command is not allowed.
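
Whether a multi-key operation can be served comes down to which hash slot each key maps to. As a rough sketch (the function names are made up for this example), the slot can be computed the way the Redis Cluster specification describes it: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the part inside the first {...} when the key contains a non-empty hash tag.

```python
# Sketch: compute the hash slot of a key as described in the Redis
# Cluster specification (CRC16/XMODEM modulo 16384, with hash tags).
def crc16(data):
    """CRC16, polynomial 0x1021, initial value 0, most significant bit first."""
    crc = 0
    for byte in bytearray(data):
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Return the slot (0..16383) for a key given as text."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # ignore an empty tag "{}"
            data = data[start + 1:end]
    return crc16(data) % 16384


if __name__ == "__main__":
    for key in ("fujian1234", "{user1000}.following", "{user1000}.followers"):
        print(key, key_slot(key))
```

Keys that share a hash tag, such as the two {user1000} keys, land in the same slot, so a multi-key command on them can be handled by a single node.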

5 Some commands cannot be run on just any node:

127.0.0.1:7001> cluster failover
(error) ERR You should send CLUSTER FAILOVER to a slave

6 Does redis-benchmark support cluster testing?

No. But with the default parameters, that is, without -e, it does not print any error, so you may believe it supports cluster testing.

" -e                 If server replies with errors, show them on stdout.\n"
"                    (no more than 1 error per second is displayed)\n"

The way the number of printed errors is limited is interesting:

                if (config.showerrors) {
                    static time_t lasterr_time = 0;
                    time_t now = time(NULL);
                    redisReply *r = reply;
                    if (r->type == REDIS_REPLY_ERROR && lasterr_time != now) {
                        lasterr_time = now;
                        printf("Error from server: %s\n", r->str);
                    }
                }
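
The same idea, restated in Python as a small sketch (the class name and the injectable clock are additions for illustration). Like the C code, it compares whole-second timestamps, so a second error is dropped only when it arrives within the same clock second.

```python
# Sketch: print at most one error per clock second, as the C snippet does.
import time


class ErrorPrinter:
    def __init__(self, clock=time.time):
        self._clock = clock          # injectable for testing
        self._last_second = None

    def report(self, message):
        """Print the message unless one was already printed this second."""
        now = int(self._clock())
        if now == self._last_second:
            return False
        self._last_second = now
        print("Error from server: %s" % message)
        return True
```

The limiter needs no timer or queue: remembering the second of the last printed error is enough.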

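7 Running keys * on any node of the cluster

It returns only the keys in the slots that the current node is responsible for, not the data of the whole cluster.

Also, on set key value, if the key is not supposed to be stored by the current node, a MOVED error is returned: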

127.0.0.1:7001> keys *
1) "fujian1234"
127.0.0.1:7001> 
127.0.0.1:7001> 
127.0.0.1:7001> set fujian1234 value
(error) MOVED 15336 10.224.2.144:7001
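
A cluster-aware client reacts to this reply by retrying the command against the node it names. Extracting the target from the error text can be sketched as follows (the function name is made up for this example):

```python
# Sketch: pull slot, host and port out of a MOVED error reply.
def parse_moved(error_text):
    """Return (slot, host, port) from a MOVED error, or None otherwise."""
    parts = error_text.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


if __name__ == "__main__":
    print(parse_moved("MOVED 15336 10.224.2.144:7001"))
```

Using rpartition splits on the last colon, so the port is separated correctly even if the host part itself contains colons.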

With -c, redis-cli follows the redirect automatically:
The redis-cli utility in the unstable branch of the Redis repository at GitHub implements a very basic cluster support when started with the -c switch.

[root@redis001 ~]# redis-cli  -p 7001 -a P@ss123 -c
127.0.0.1:7001> get fujian1234
-> Redirected to slot [15336] located at 10.224.2.144:7001
"value"

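In addition, a read request sent to a cluster slave may get a MOVED error even when the key is in the slots that slave holds. Adding -c works here too; alternatively, run READONLY on the connection.

MOVED error on a slave: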
[root@redis004 bin]# ./redis-cli  -p 7001 -a P@ss123 
127.0.0.1:7001> get fujian1234
(error) MOVED 15336 10.224.2.141:7001
127.0.0.1:7001> readonly
OK
127.0.0.1:7001> get fujian1234
"xinxiu"

Read queries against a Redis Cluster slave node are disabled by default, but you can use the READONLY command to change this behavior on a per- connection basis. The READWRITE command resets the readonly mode flag of a connection back to readwrite.
READWRITE Disables read queries for a connection to a Redis Cluster slave node.

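This leads to another question: what are the relationship and the difference between the slave-read-only setting in the configuration and READONLY?

The answer, quoting a passage: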


Please take note that slave-read-only config refers to replication and READONLY refers to the redis-cluster command.

If you are not using redis-cluster, you can safely ignore the READONLY command documentation. Refer to https://raw.githubusercontent.com/antirez/redis/2.8/redis.conf instead. Writes should not replicate nor require lookups to the master. My wireshark dumps on redis with slave-read-only no shows no indication of any communication with master as a consequence of writes to the slave itself.

If you are using redis-cluster on the other hand, and referring to the READWRITE behavior: Cluster nodes' communication with each other for hash slot updates and other cluster specific messages are optimized to use minimal bandwidth and the least processing time. Communicating hash slot updates most likely do not happen for every write on the slave.

In actual testing, slave-read-only has no effect on redis cluster.