NoSQL Database Redis (Part 4): Sentinel Cluster

张开发
2026/4/17 16:41:47, 15 min read


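Redis Sentinel Cluster Configuration

1. Core Principles of the Sentinel Cluster

Redis Sentinel is a distributed system that provides high availability for Redis. It keeps the service running through three functions: monitoring, notification, and automatic failover.

Monitoring

- Each Sentinel sends a PING command to the master and replica nodes once per second to check that they are alive.
- If a node does not return a valid reply within down-after-milliseconds, that Sentinel marks it as subjectively down (SDOWN).
- When at least quorum Sentinels agree that the master is down, the master is marked as objectively down (ODOWN). The quorum is a configured threshold and is not necessarily a majority.

Failover process

1. Leader election: the Sentinels elect a leader among themselves with a Raft-style voting algorithm.
2. New master selection: the leader picks the best replica according to rules such as replica priority and replication offset.
3. Configuration update: the new master's information is propagated to all nodes.
4. Client redirection: clients are notified through Pub/Sub (PUBLISH, for example the +switch-master event) so that they can update their connections.

State synchronization

Sentinels publish their own information on the __sentinel__:hello channel, which is how state is shared between Sentinel nodes.

2. OpenEuler Environment Preparation

    # Run on all nodes
    sudo dnf install -y gcc make tcl
    wget https://download.redis.io/releases/redis-7.0.12.tar.gz
    tar xzf redis-7.0.12.tar.gz
    cd redis-7.0.12
    make
    sudo make install

    # Create a dedicated user
    sudo groupadd redis
    sudo useradd -r -g redis -s /sbin/nologin redis

    # Create the directories referenced by the configuration files below
    sudo mkdir -p /etc/redis /var/log/redis /var/lib/redis
    sudo chown -R redis:redis /etc/redis /var/log/redis /var/lib/redis

3. Node Role Planning

| IP address     | Role               | Ports (Redis/Sentinel) |
| 192.168.64.128 | Master + Sentinel  | 6379/26379             |
| 192.168.64.129 | Replica + Sentinel | 6379/26379             |
| 192.168.64.130 | Replica + Sentinel | 6379/26379             |

4. Redis Master-Replica Configuration

Master configuration (128)

Comments sit on their own lines because Redis configuration files do not accept trailing comments after a directive.

    # /etc/redis/redis.conf
    bind 192.168.64.128
    port 6379
    daemonize yes
    pidfile /var/run/redis_6379.pid
    logfile /var/log/redis/redis.log
    dir /var/lib/redis
    # Password used for master-replica authentication
    masterauth S3cr3tPss
    # Password required from clients
    requirepass S3cr3tPss

Replica configuration (129/130)

The replicas add the following directives. They should also carry the same requirepass as the master, with bind set to the node's own address, so that a promoted replica still accepts the password configured in sentinel auth-pass.

    replicaof 192.168.64.128 6379
    masterauth S3cr3tPss
    replica-read-only yes

5. Sentinel Cluster Configuration

All nodes

    # /etc/redis/sentinel.conf
    port 26379
    daemonize yes
    logfile /var/log/redis/sentinel.log
    # Monitor the master; the trailing 2 is the quorum
    sentinel monitor mymaster 192.168.64.128 6379 2
    sentinel auth-pass mymaster S3cr3tPss
    # Mark the master as down after 5 seconds without a valid reply
    sentinel down-after-milliseconds mymaster 5000
    # Failover timeout in milliseconds
    sentinel failover-timeout mymaster 60000
    # Number of replicas that resynchronize with the new master at the same time
    sentinel parallel-syncs mymaster 1

The Sentinels must be able to reach one another. With no bind directive and no password, protected mode may restrict a Sentinel to loopback connections, so either bind each Sentinel to its node address or disable protected mode on a trusted network.

6. Key Parameters

- quorum: combines the independent SDOWN judgments of several Sentinels into one cluster-level ODOWN decision, which prevents a false positive caused by network jitter at a single node. The quorum only governs failure detection; the failover itself still has to be authorized by a majority of the Sentinels.
- failover-timeout: bounds the phases of a failover, covering leader election, replica promotion, and configuration propagation. A Sentinel also waits twice this value before retrying a failover against the same master.

Network partition handling

    graph LR
        A[Master] -->|Network interruption| B[Partition A]
        C[Sentinel cluster] -->|Network interruption| D[Partition B]
        B -->|Majority of Sentinels online| E[Trigger failover]
        D -->|Minority of Sentinels online| F[Keep current state]

7. Startup and Verification

The systemctl command below assumes that a systemd unit named redis-server exists. A source build does not install one, so without it start the server directly with redis-server /etc/redis/redis.conf.

    # Start the Redis service
    sudo systemctl start redis-server
    # Start the Sentinel service
    redis-sentinel /etc/redis/sentinel.conf
    # Verify replication status (query a replica)
    redis-cli -h 192.168.64.129 -a S3cr3tPss INFO replication
    # Example output
    role:slave
    master_host:192.168.64.128
    master_link_status:up
    # Check the Sentinel topology
    redis-cli -p 26379 SENTINEL MASTER mymaster

The expected output includes key information such as the master's IP address, the number of replicas, and the number of Sentinels.

8. Failover Test

Simulate a master outage

    # Run on the master node
    sudo systemctl stop redis-server

Watch the log

    # sentinel.log (abridged)
    +sdown master mymaster 192.168.64.128 6379
    +odown master mymaster #quorum 2/2
    +try-failover master mymaster
    +vote-for-leader 5a3b4c6d 1                          # leader election
    +elected-leader master mymaster
    +selected-slave 192.168.64.129                       # new master
    +failover-state-send-slaveof-noone 192.168.64.129
    +failover-state-wait-promotion 192.168.64.129
    +promoted-slave 192.168.64.129                       # promotion succeeded

Verify the new topology

    redis-cli -p 26379 SENTINEL GET-MASTER-ADDR-BY-NAME mymaster
    1) "192.168.64.129"
    2) "6379"

9. Production Optimization Recommendations

Kernel parameter tuning

    # /etc/sysctl.conf
    net.core.somaxconn = 2048
    vm.overcommit_memory = 1

Sentinel deployment strategy

- Avoid running Sentinel on the same host as a Redis process; keep them physically isolated. The three-node layout above co-locates them for simplicity.
- For cross-datacenter deployments, set sentinel announce-ip to declare the public IP address.

Client reconnection strategy

    # Python example
    from redis.sentinel import Sentinel

    sentinel = Sentinel(
        [('192.168.64.128', 26379),
         ('192.168.64.129', 26379),
         ('192.168.64.130', 26379)],
        socket_timeout=0.5,
    )
    master = sentinel.master_for('mymaster', password='S3cr3tPss')
    master.set('key', 'value')  # routed automatically to the current master

10. Troubleshooting Common Failures

Split brain

- Symptom: clients write to two masters at the same time.
- Fix: set min-replicas-to-write 1 (named min-slaves-to-write in older releases) so that a master accepts writes only while at least one replica is in sync, within the lag allowed by min-replicas-max-lag.

Configuration out of sync

- Check network connectivity between the Sentinel nodes.
- Verify that the sentinel current-epoch value is the same on every Sentinel.

Log analysis

    grep -E 'sdown|odown|failover' /var/log/redis/sentinel.log

11. Monitoring and Alerting

Prometheus metrics

    # redis_exporter scrape targets
    - targets: ['192.168.64.128:9121', '192.168.64.129:9121', '192.168.64.130:9121']
      labels:
        group: 'redis-sentinel'

Key alert rule

Check the metric name against what the deployed redis_exporter version actually exposes before relying on this rule.

    - alert: RedisSentinelQuorumLost
      expr: redis_sentinel_master_quorum_status == 0
      for: 5m
      labels:
        severity: critical

12. Security Hardening

Firewall configuration

    # OpenEuler firewall rules
    sudo firewall-cmd --permanent --add-port=6379/tcp
    sudo firewall-cmd --permanent --add-port=26379/tcp
    sudo firewall-cmd --reload

ACL access control

    # redis.conf
    aclfile /etc/redis/users.acl

Contents of /etc/redis/users.acl:

    user default on >S3cr3tPss ~* +@all
    user monitor on >MonitorPss resetchannels -@all +info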
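
As a closing illustration of the quorum and network-partition discussion earlier in this article, the following is a minimal sketch of the decision rule rather than Sentinel's actual implementation. It assumes a simplified model in which each partition is just a list of Sentinels that can talk to each other. The names SentinelView, is_objectively_down, can_authorize_failover, and failover_possible are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentinelView:
    """One Sentinel's local opinion, as seen from inside a network partition."""
    name: str
    sees_master_down: bool  # True if this Sentinel has flagged the master SDOWN


def is_objectively_down(partition: List[SentinelView], quorum: int) -> bool:
    """ODOWN: at least `quorum` Sentinels in the partition report SDOWN."""
    sdown_votes = sum(1 for s in partition if s.sees_master_down)
    return sdown_votes >= quorum


def can_authorize_failover(partition: List[SentinelView],
                           total_sentinels: int, quorum: int) -> bool:
    """A leader needs votes from a majority of all known Sentinels,
    and never fewer than the quorum (simplifying assumption: every
    Sentinel in the partition votes for the same candidate)."""
    majority = total_sentinels // 2 + 1
    return len(partition) >= max(majority, quorum)


def failover_possible(partition: List[SentinelView],
                      total_sentinels: int, quorum: int) -> bool:
    """Failover starts only if the master is ODOWN and a leader can be elected."""
    return (is_objectively_down(partition, quorum)
            and can_authorize_failover(partition, total_sentinels, quorum))


if __name__ == "__main__":
    # Three Sentinels in total, quorum 2, as in the sentinel.conf above.
    majority_side = [SentinelView("s2", True), SentinelView("s3", True)]
    minority_side = [SentinelView("s1", False)]
    print(failover_possible(majority_side, total_sentinels=3, quorum=2))  # True
    print(failover_possible(minority_side, total_sentinels=3, quorum=2))  # False
```

The sketch separates the two thresholds on purpose: lowering the quorum makes failure detection more sensitive, but a partition holding only a minority of the Sentinels still cannot start a failover.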
