Summary: notes from recently setting up a Redis cluster at work.

Redis cluster setup

Versions
OS: CentOS 7.4
Redis: redis-4.0.2
Ruby: 2.4.2
Install gcc

    rpm -ivh gcc-c++-4.8.5-16.el7.x86_64.rpm --nodeps
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:gcc-c++-4.8.5-16.el7             ################################# [100%]
Install Redis

    cd /opt
    tar xzf redis-4.0.2.tar.gz
    cd redis-4.0.2
    make

If the build fails, run make distclean to clear the leftover build artifacts, then run make again.
Create the nodes

    mkdir /opt/redis-4.0.2/redis-cluster
    cd /opt/redis-4.0.2/redis-cluster
    mkdir 7100 7101 7102
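Edit the three configuration files in turn and paste the redis.conf settings below into each one, changing the port-specific values (port, pidfile, cluster-config-file) to match the node.

    vi 7100/redis.conf
    vi 7101/redis.conf
    vi 7102/redis.conf

The settings:

    port 7100
    bind 192.168.103.14
    daemonize yes
    pidfile /var/run/redis_7100.pid
    cluster-enabled yes
    cluster-config-file nodes_7100.conf
    cluster-node-timeout 20100
    appendonly yes

The same settings, annotated:

    # Port: 7100, 7101, 7102
    port 7100

    # The default IP is 127.0.0.1. Change it to an IP the other machines can reach;
    # otherwise the ports are unreachable during cluster creation and it fails.
    bind 192.168.103.14

    # Run Redis in the background
    daemonize yes

    # The pidfile matches the port: 7100, 7101, 7102
    pidfile /var/run/redis_7100.pid

    # Enable cluster mode (uncomment this line by removing the leading #)
    cluster-enabled yes

    # Cluster state file, generated automatically on first start: 7100, 7101, 7102
    cluster-config-file nodes_7100.conf

    # Node timeout in milliseconds; the default is 15000 (15 seconds), adjust as needed
    cluster-node-timeout 20100

    # Enable the AOF log if you need it; it records every write operation
    appendonly yes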
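Repeat the steps above on the second machine, using 7103, 7104, and 7105 for the directories and ports, and setting bind to that machine's IP (192.168.103.28 in these notes).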
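Since the three files on each machine differ only in the port number, they can be generated rather than edited by hand. The following is a minimal sketch, not part of the original notes: gen_redis_confs and its arguments are hypothetical names, and the example writes into a scratch directory so that the result can be inspected before anything is copied into /opt/redis-4.0.2/redis-cluster.

```shell
#!/usr/bin/env bash
# Sketch: write one redis.conf per port, using the settings from these notes.
# gen_redis_confs is a hypothetical helper, not something shipped with Redis.
gen_redis_confs() {
  local base_dir=$1 bind_ip=$2 port
  shift 2
  for port in "$@"; do
    mkdir -p "$base_dir/$port"
    cat > "$base_dir/$port/redis.conf" <<EOF
port $port
bind $bind_ip
daemonize yes
pidfile /var/run/redis_$port.pid
cluster-enabled yes
cluster-config-file nodes_$port.conf
cluster-node-timeout 20100
appendonly yes
EOF
  done
}

# Example: generate the first machine's files in a scratch directory.
out_dir=$(mktemp -d)
gen_redis_confs "$out_dir" 192.168.103.14 7100 7101 7102
cat "$out_dir/7101/redis.conf"
```

On the second machine the same call would take 192.168.103.28 and ports 7103 7104 7105.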
Start the cluster

    # On the first machine: start its 3 nodes
    for((i=0;i<=2;i++)); do /opt/redis-4.0.2/src/redis-server /opt/redis-4.0.2/redis-cluster/710$i/redis.conf; done

    # On the second machine: start its 3 nodes
    for((i=3;i<=5;i++)); do /opt/redis-4.0.2/src/redis-server /opt/redis-4.0.2/redis-cluster/710$i/redis.conf; done
Check the services

    ps -ef | grep redis           # did the redis-server processes start?
    netstat -tnlp | grep redis    # are the Redis ports being listened on?
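Build the cluster

We now have six running Redis instances. With the Redis cluster command-line tool redis-trib, the work of writing the node configuration becomes very easy. redis-trib lives in the src directory of the Redis source tree. It is a Ruby program that creates new clusters, checks clusters, and reshards existing clusters by sending special commands to the instances. So we install Ruby first.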
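Install Ruby

The Ruby installed through yum is often too old, so here we install from the source package.

Download ruby-2.4.2.tar.gz, then build and install it:

    tar -xvzf ruby-2.4.2.tar.gz
    cd ruby-2.4.2
    ./configure
    make
    sudo make install

After installation, check that it succeeded by printing the version (for example with ruby -v). If no version is printed, try opening a new terminal window.

Next, install the Redis client library for Ruby that redis-trib depends on (the redis gem, typically installed with gem install redis).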
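Create the cluster

Next, use the Redis cluster command-line tool redis-trib. Run the following command on one of the machines. The option --replicas 1 asks for one slave per master, so the six nodes become 3 masters and 3 slaves:

    /opt/redis-4.0.2/src/redis-trib.rb create --replicas 1 192.168.103.14:7100 192.168.103.14:7101 192.168.103.14:7102 192.168.103.28:7103 192.168.103.28:7104 192.168.103.28:7105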
The following output appears:

    >>> Creating cluster
    >>> Performing hash slots allocation on 6 nodes...
    Using 3 masters:
    192.168.103.14:7100
    192.168.103.28:7103
    192.168.103.14:7101
    Adding replica 192.168.103.28:7104 to 192.168.103.14:7100
    Adding replica 192.168.103.14:7102 to 192.168.103.28:7103
    Adding replica 192.168.103.28:7105 to 192.168.103.14:7101
    M: c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100
       slots:0-5460 (5461 slots) master
    M: 77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101
       slots:10923-16383 (5461 slots) master
    S: b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102
       replicates 76d59f4caaf766bea9122b1e6327e13721c8ca3b
    M: 76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103
       slots:5461-10922 (5462 slots) master
    S: 77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104
       replicates c190d12629fd227c909caa96f5e978ff996364ed
    S: 82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105
       replicates 77ea96b2eb31b0dd44acc986fe8484358cd9863f
    Can I set the above configuration? (type 'yes' to accept): yes
Type yes to continue:

    >>> Nodes configuration updated
    >>> Assign a different config epoch to each node
    >>> Sending CLUSTER MEET messages to join the cluster
    Waiting for the cluster to join....
    >>> Performing Cluster Check (using node 192.168.103.14:7100)
    M: c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100
       slots:0-5460 (5461 slots) master
       1 additional replica(s)
    M: 77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101
       slots:10923-16383 (5461 slots) master
       1 additional replica(s)
    S: 82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105
       slots: (0 slots) slave
       replicates 77ea96b2eb31b0dd44acc986fe8484358cd9863f
    S: b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102
       slots: (0 slots) slave
       replicates 76d59f4caaf766bea9122b1e6327e13721c8ca3b
    M: 76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103
       slots:5461-10922 (5462 slots) master
       1 additional replica(s)
    S: 77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104
       slots: (0 slots) slave
       replicates c190d12629fd227c909caa96f5e978ff996364ed
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
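The master nodes are 14:7100 (slots 0-5460), 14:7101 (slots 10923-16383), and 28:7103 (slots 5461-10922). The slave nodes are 28:7105, 14:7102, and 28:7104. As the output above shows, this gives 3 masters and 3 slaves: 28:7104 replicates 14:7100, 28:7105 replicates 14:7101, and 14:7102 replicates 28:7103.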
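The three slot ranges in the output are what an even split of the 16384 slots looks like. The sketch below reproduces them with integer arithmetic. It is an illustration only: slot_ranges is a hypothetical helper, and the rounding was chosen to match the ranges printed above, not taken from the redis-trib source.

```shell
#!/usr/bin/env bash
# Sketch: split the 16384 hash slots evenly across N masters.
# slot_ranges is a hypothetical helper; the rounding is an assumption chosen
# so that N=3 reproduces the ranges shown in the redis-trib output.
slot_ranges() {
  local masters=$1 total=16384 first=0 last i
  for ((i = 0; i < masters; i++)); do
    if ((i == masters - 1)); then
      last=$((total - 1))          # the last master takes whatever remains
    else
      # round((i + 1) * total / masters - 1), done in integer arithmetic
      last=$(( (2 * (i + 1) * total - masters) / (2 * masters) ))
    fi
    echo "$first-$last ($((last - first + 1)) slots)"
    first=$((last + 1))
  done
}

slot_ranges 3
# 0-5460 (5461 slots)
# 5461-10922 (5462 slots)
# 10923-16383 (5461 slots)
```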
If the following error appears after typing yes:

    Can I set the above configuration? (type 'yes' to accept): yes
    /usr/local/lib/ruby/gems/2.4.0/gems/redis-4.0.1/lib/redis/client.rb:119:in `call': ERR Slot 5798 is already busy (Redis::CommandError)
        from /usr/local/lib/ruby/gems/2.4.0/gems/redis-4.0.1/lib/redis.rb:2764:in `block in method_missing'
        from /usr/local/lib/ruby/gems/2.4.0/gems/redis-4.0.1/lib/redis.rb:45:in `block in synchronize'
        from /usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
        from /usr/local/lib/ruby/gems/2.4.0/gems/redis-4.0.1/lib/redis.rb:45:in `synchronize'
        from /usr/local/lib/ruby/gems/2.4.0/gems/redis-4.0.1/lib/redis.rb:2763:in `method_missing'
        from /opt/redis-4.0.2/src/redis-trib.rb:212:in `flush_node_config'
        from /opt/redis-4.0.2/src/redis-trib.rb:776:in `block in flush_nodes_config'
        from /opt/redis-4.0.2/src/redis-trib.rb:775:in `each'
        from /opt/redis-4.0.2/src/redis-trib.rb:775:in `flush_nodes_config'
        from /opt/redis-4.0.2/src/redis-trib.rb:1296:in `create_cluster_cmd'
        from /opt/redis-4.0.2/src/redis-trib.rb:1700:in `<main>'

Solution

    # Run the following on every node, then re-run the cluster creation command
    /opt/redis-4.0.2/src/redis-cli -h 192.168.103.14 -p 7100
    192.168.103.14:7100> flushall
    OK
    192.168.103.14:7100> cluster reset soft
    OK
    192.168.103.14:7100> exit
    ...
    /opt/redis-4.0.2/src/redis-cli -h 192.168.103.28 -p 7103
    192.168.103.28:7103> flushall
    OK
    192.168.103.28:7103> cluster reset soft
    OK
    192.168.103.28:7103> exit
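Repeating the reset by hand on six nodes is tedious, so it can be scripted. redis-cli runs a command non-interactively when the command is given as arguments. The sketch below is a hedged example: reset_nodes and DRY_RUN are hypothetical names, and by default it only prints the commands it would run. Keep in mind that flushall deletes every key on the node, so this is only appropriate for a cluster that holds no data yet.

```shell
#!/usr/bin/env bash
# Sketch: reset every node before retrying cluster creation.
# reset_nodes and DRY_RUN are hypothetical names. With DRY_RUN=1 (the default
# here) the commands are printed instead of executed.
REDIS_CLI=/opt/redis-4.0.2/src/redis-cli
NODES=(192.168.103.14:7100 192.168.103.14:7101 192.168.103.14:7102
       192.168.103.28:7103 192.168.103.28:7104 192.168.103.28:7105)

reset_nodes() {
  local node host port cmd
  for node in "${NODES[@]}"; do
    host=${node%:*}
    port=${node#*:}
    for cmd in "flushall" "cluster reset soft"; do
      if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$REDIS_CLI -h $host -p $port $cmd"
      else
        # $cmd is deliberately unquoted so "cluster reset soft" splits into words
        "$REDIS_CLI" -h "$host" -p "$port" $cmd
      fi
    done
  done
}

reset_nodes   # prints 12 commands: 2 per node
```

Running it with DRY_RUN=0 executes the commands for real.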
Verify the cluster

Connect to the cluster and test it:

    # Pick a node and set a value
    /opt/redis-4.0.2/src/redis-cli -c -h 192.168.103.28 -p 7104
    192.168.103.28:7104> set name admin
    -> Redirected to slot [5798] located at 192.168.103.28:7103
    OK
    192.168.103.28:7103> get name
    "admin"
    192.168.103.28:7103> exit

    # Test from a different node
    [root@server28 /]# /opt/redis-4.0.2/src/redis-cli -c -h 192.168.103.14 -p 7101
    192.168.103.14:7101> get name
    -> Redirected to slot [5798] located at 192.168.103.28:7103
    "admin"
    192.168.103.28:7103> exit
The key name with value "admin" was placed on master 28:7103, which owns slots 5461-10922; slot 5798 falls inside that range.
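The redirect can be checked against the slot table by hand. As a small illustration, the sketch below hard-codes the three ranges from this particular cluster; slot_owner is a hypothetical helper, and the ranges would differ in any other cluster.

```shell
#!/usr/bin/env bash
# Sketch: look up which master owns a slot, using the ranges of the cluster
# created in these notes. slot_owner is a hypothetical helper name.
slot_owner() {
  local slot=$1
  if   ((slot >= 0     && slot <= 5460));  then echo 192.168.103.14:7100
  elif ((slot >= 5461  && slot <= 10922)); then echo 192.168.103.28:7103
  elif ((slot >= 10923 && slot <= 16383)); then echo 192.168.103.14:7101
  else
    echo "slot out of range: $slot" >&2
    return 1
  fi
}

slot_owner 5798   # prints 192.168.103.28:7103
```

On a live node, the command cluster keyslot name reports the slot for a key, which avoids guessing the input to this lookup.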
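Note: Redis Cluster does not use consistent hashing. It uses the concept of hash slots instead.

A Redis cluster has 16384 hash slots. To decide which slot a key belongs to, Redis computes the CRC16 of the key and takes it modulo 16384. Each node in the cluster is responsible for a subset of the hash slots. For example, in a cluster with 3 nodes:

Node A contains hash slots 0 to 5500.
Node B contains hash slots 5501 to 11000.
Node C contains hash slots 11001 to 16383.

This structure makes it easy to add or remove nodes. To add a new node D, I move some slots from nodes A, B, and C to D. To remove node A, I move the slots held by A to B and C, and once A holds no slots I remove it from the cluster. Moving hash slots from one node to another does not stop the service, so adding nodes, removing nodes, or changing the number of slots a node holds never makes the cluster unavailable.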
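The key-to-slot mapping can be reproduced outside Redis. The following is a minimal sketch in pure bash: crc16 and key_slot are hypothetical helper names, the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0) described in the Redis Cluster specification, and the sketch assumes ASCII keys and ignores hash tags ({...}).

```shell
#!/usr/bin/env bash
# Sketch: compute a key's hash slot as CRC16(key) mod 16384.
# crc16 and key_slot are hypothetical helpers. Assumes ASCII keys and does
# not implement hash tags.
crc16() {
  local key=$1 crc=0 i j byte
  for ((i = 0; i < ${#key}; i++)); do
    printf -v byte '%d' "'${key:i:1}"      # character code of the i-th character
    crc=$(( crc ^ (byte << 8) ))
    for ((j = 0; j < 8; j++)); do
      if ((crc & 0x8000)); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

key_slot() {
  echo $(( $(crc16 "$1") % 16384 ))
}

key_slot name        # prints 5798, the slot in the redirect shown earlier
key_slot 123456789   # prints 12739 (CRC16 of "123456789" is 0x31C3)
```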
Check the cluster state

    /opt/redis-4.0.2/src/redis-trib.rb check 192.168.103.14:7100
    >>> Performing Cluster Check (using node 192.168.103.14:7100)
    M: c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100
       slots:0-5460 (5461 slots) master
       1 additional replica(s)
    M: 77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101
       slots:10923-16383 (5461 slots) master
       1 additional replica(s)
    S: 82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105
       slots: (0 slots) slave
       replicates 77ea96b2eb31b0dd44acc986fe8484358cd9863f
    S: b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102
       slots: (0 slots) slave
       replicates 76d59f4caaf766bea9122b1e6327e13721c8ca3b
    M: 76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103
       slots:5461-10922 (5462 slots) master
       1 additional replica(s)
    S: 77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104
       slots: (0 slots) slave
       replicates c190d12629fd227c909caa96f5e978ff996364ed
List the cluster nodes

    /opt/redis-4.0.2/src/redis-cli -c -h 192.168.103.14 -p 7101
    192.168.103.14:7101> cluster nodes
    77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104@17104 slave c190d12629fd227c909caa96f5e978ff996364ed 0 1523435804915 5 connected
    76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103@17103 master - 0 1523435804000 4 connected 5461-10922
    82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105@17105 slave 77ea96b2eb31b0dd44acc986fe8484358cd9863f 0 1523435805918 6 connected
    b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102@17102 slave 76d59f4caaf766bea9122b1e6327e13721c8ca3b 0 1523435803000 4 connected
    77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101@17101 myself,master - 0 1523435803000 2 connected 10923-16383
    c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100@17100 master - 0 1523435806920 1 connected 0-5460
Print cluster information

    192.168.103.14:7101> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:2
    cluster_stats_messages_ping_sent:1605
    cluster_stats_messages_pong_sent:1682
    cluster_stats_messages_meet_sent:5
    cluster_stats_messages_sent:3292
    cluster_stats_messages_ping_received:1681
    cluster_stats_messages_pong_received:1610
    cluster_stats_messages_meet_received:1
    cluster_stats_messages_received:3292
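For monitoring, the cluster info output can be reduced to a single yes/no answer. The sketch below is one possible check, not an official health definition: cluster_healthy is a hypothetical helper that reads cluster info text on standard input and succeeds only when the state is ok and all 16384 slots are assigned and ok. The sample input is a subset of the output shown above.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a cluster looks healthy from `cluster info` text on
# stdin. cluster_healthy is a hypothetical helper; the criteria are an
# assumption about what "healthy" should mean here.
cluster_healthy() {
  awk -F: '
    { value = $2; sub(/\r$/, "", value) }   # tolerate CRLF line endings
    $1 == "cluster_state"          { state = value }
    $1 == "cluster_slots_assigned" { assigned = value + 0 }
    $1 == "cluster_slots_ok"       { ok = value + 0 }
    END { exit !(state == "ok" && assigned == 16384 && ok == 16384) }
  '
}

sample_info='cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0'

if printf '%s\n' "$sample_info" | cluster_healthy; then
  echo "cluster is healthy"
else
  echo "cluster needs attention"
fi
```

Against a live node the input would come from redis-cli, for example by piping the output of redis-cli -h 192.168.103.14 -p 7101 cluster info into cluster_healthy.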
Nodes

cluster meet <ip> <port>
Adds the node at the given ip and port to the cluster.

cluster forget <node_id>
Removes the node identified by node_id from the cluster.

    192.168.103.14:7100> cluster meet 192.168.103.28 7106
    OK
    192.168.103.14:7100> cluster nodes
    047f60047efa74c6f597e935b8b5896c15057cf6 192.168.103.28:7106@17106 master - 0 1523448401000 0 connected
    77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101@17101 master - 0 1523448401961 2 connected 10923-16383
    82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105@17105 slave 77ea96b2eb31b0dd44acc986fe8484358cd9863f 0 1523448402964 6 connected
    b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102@17102 slave 76d59f4caaf766bea9122b1e6327e13721c8ca3b 0 1523448403965 4 connected
    76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103@17103 master - 0 1523448402000 4 connected 5461-10922
    77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104@17104 slave c190d12629fd227c909caa96f5e978ff996364ed 0 1523448404969 5 connected
    c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100@17100 myself,master - 0 1523448401000 1 connected 0-5460
    192.168.103.14:7100> cluster forget 047f60047efa74c6f597e935b8b5896c15057cf6
    OK
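Each line of the cluster nodes output contains the following fields, in order:

Node ID.
IP:port (in Redis 4.0 followed by @ and the cluster bus port, e.g. 192.168.103.28:7103@17103).
Flags: master, slave, myself, fail, ...
If the node is a slave, the node ID of its master; otherwise -.
The time at which the most recent PING still awaiting a reply was sent to the node (0 if no ping is pending).
The time at which the last PONG reply was received from the node.
The node's configuration epoch; see the Redis Cluster specification for details.
The state of the network link to the node, e.g. connected.
The slots the node currently holds; for example, 192.168.103.28:7103 currently holds hash slots 5461 to 10922.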
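Because the fields are positional, the output is easy to process with awk. As an illustration, the sketch below lists each slave with the address of its master. replica_map is a hypothetical helper name, and the sample input is the cluster nodes output captured earlier in these notes.

```shell
#!/usr/bin/env bash
# Sketch: print "slave -> master" address pairs from `cluster nodes` output
# read on stdin. replica_map is a hypothetical helper name.
replica_map() {
  awk '
    { addr = $2; sub(/@.*/, "", addr); addr_of[$1] = addr }   # drop @bus-port
    $3 ~ /slave/ { master_id[addr] = $4 }                     # field 4: master node ID
    END { for (a in master_id) print a " -> " addr_of[master_id[a]] }
  ' | sort
}

sample_nodes='77abb939328acb2198fe3e4a495c217d25b91cda 192.168.103.28:7104@17104 slave c190d12629fd227c909caa96f5e978ff996364ed 0 1523435804915 5 connected
76d59f4caaf766bea9122b1e6327e13721c8ca3b 192.168.103.28:7103@17103 master - 0 1523435804000 4 connected 5461-10922
82f48a3c8d0d684fe31254fd4115d8c3b5622f4e 192.168.103.28:7105@17105 slave 77ea96b2eb31b0dd44acc986fe8484358cd9863f 0 1523435805918 6 connected
b08c0a2e59ef7564eeda7f8d7ca08ec7f3766c1d 192.168.103.14:7102@17102 slave 76d59f4caaf766bea9122b1e6327e13721c8ca3b 0 1523435803000 4 connected
77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.14:7101@17101 myself,master - 0 1523435803000 2 connected 10923-16383
c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100@17100 master - 0 1523435806920 1 connected 0-5460'

printf '%s\n' "$sample_nodes" | replica_map
# 192.168.103.14:7102 -> 192.168.103.28:7103
# 192.168.103.28:7104 -> 192.168.103.14:7100
# 192.168.103.28:7105 -> 192.168.103.14:7101
```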
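Adding nodes with redis-trib.rb

Add a master node. The first address is the new node and the second is any node already in the cluster. The new master holds no slots until slots are moved to it:

    /opt/redis-4.0.2/src/redis-trib.rb add-node 192.168.103.28:7106 192.168.103.14:7100

Add a slave node, attached to a randomly chosen master. The new node must be empty:

    /opt/redis-4.0.2/src/redis-trib.rb add-node --slave 192.168.103.28:7106 192.168.103.14:7100

Add a slave node for a specific master (here 192.168.103.14:7101, identified by its node ID):

    /opt/redis-4.0.2/src/redis-trib.rb add-node --slave --master-id 77ea96b2eb31b0dd44acc986fe8484358cd9863f 192.168.103.28:7106 192.168.103.14:7100

Alternatively, run cluster replicate on the new node with the master's node ID:

    192.168.103.28:7106> cluster replicate 77ea96b2eb31b0dd44acc986fe8484358cd9863f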
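Removing nodes with redis-trib.rb

The first argument is any node in the cluster, and <node-id> is the ID of the node to remove. A master must hold no slots before it can be removed, so move its slots to other masters first:

    /opt/redis-4.0.2/src/redis-trib.rb del-node 192.168.103.14:7100 <node-id>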
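The node ID for del-node comes from the first field of the cluster nodes output. The sketch below looks it up by address. node_id_for is a hypothetical helper name, and the sample input is two lines taken from the cluster nodes output shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: find a node's ID by its ip:port in `cluster nodes` output on stdin.
# node_id_for is a hypothetical helper; it fails if the address is not found.
node_id_for() {
  local target=$1
  awk -v target="$target" '
    { addr = $2; sub(/@.*/, "", addr) }    # drop the @cluster-bus-port suffix
    addr == target { print $1; found = 1 }
    END { exit !found }
  '
}

sample_lines='047f60047efa74c6f597e935b8b5896c15057cf6 192.168.103.28:7106@17106 master - 0 1523448401000 0 connected
c190d12629fd227c909caa96f5e978ff996364ed 192.168.103.14:7100@17100 myself,master - 0 1523448401000 1 connected 0-5460'

printf '%s\n' "$sample_lines" | node_id_for 192.168.103.28:7106
# 047f60047efa74c6f597e935b8b5896c15057cf6
```

The printed ID can then be passed as the second argument to del-node.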
References
Setting up a Redis-4.0.1 Cluster service on CentOS 7.3
Redis cluster tutorial