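This afternoon a colleague reported that an exception occurred while performance-testing one of our business scenarios. The log they provided showed: Redis command timed out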


1. Look at the logs first

org.springframework.dao.QueryTimeoutException: Redis command timed out; nested exception is io.lettuce.core.RedisCommandTimeoutException: Command timed out after 1 minute(s)
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:70) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:271) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1062) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceConnection.lambda$doInvoke$4(LettuceConnection.java:919) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceInvoker$Synchronizer.invoke(LettuceInvoker.java:665) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceInvoker.just(LettuceInvoker.java:94) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.lettuce.LettuceSetCommands.sMembers(LettuceSetCommands.java:149) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.connection.DefaultedRedisConnection.sMembers(DefaultedRedisConnection.java:805) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.DefaultSetOperations.lambda$members$10(DefaultSetOperations.java:214) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:222) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:189) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:96) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.DefaultSetOperations.members(DefaultSetOperations.java:214) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.data.redis.core.DefaultBoundSetOperations.members(DefaultBoundSetOperations.java:152) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
at org.springframework.session.data.redis.RedisSessionExpirationPolicy.cleanExpiredSessions(RedisSessionExpirationPolicy.java:132) ~[spring-session-data-redis-2.5.6.jar!/:2.5.6]
at org.springframework.session.data.redis.RedisIndexedSessionRepository.cleanupExpiredSessions(RedisIndexedSessionRepository.java:424) ~[spring-session-data-redis-2.5.6.jar!/:2.5.6]
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-5.3.20.jar!/:5.3.20]
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:95) [spring-context-5.3.20.jar!/:5.3.20]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 1 minute(s)
at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59) ~[lettuce-core-6.1.8.RELEASE.jar!/:6.1.8.RELEASE]
at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246) ~[lettuce-core-6.1.8.RELEASE.jar!/:6.1.8.RELEASE]
at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:74) ~[lettuce-core-6.1.8.RELEASE.jar!/:6.1.8.RELEASE]
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1060) ~[spring-data-redis-2.5.11.jar!/:2.5.11]
... 22 common frames omitted
2022-09-15 15:31:00.427 INFO 16828 --- [xecutorLoop-1-3] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:00.427 INFO 16828 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:00.428 WARN 16828 --- [ioEventLoop-8-4] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:31:00.429 WARN 16828 --- [ioEventLoop-8-5] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:31:00.606 INFO 16828 --- [xecutorLoop-4-7] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:00.608 WARN 16828 --- [ioEventLoop-9-3] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:31:30.527 INFO 16828 --- [xecutorLoop-1-6] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:30.527 INFO 16828 --- [xecutorLoop-1-7] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:30.530 WARN 16828 --- [ioEventLoop-8-7] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:31:30.532 WARN 16828 --- [ioEventLoop-8-6] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:31:30.706 INFO 16828 --- [xecutorLoop-4-8] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:31:30.714 WARN 16828 --- [ioEventLoop-9-4] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692
2022-09-15 15:32:00.627 INFO 16828 --- [xecutorLoop-1-2] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692
2022-09-15 15:32:00.627 INFO 16828 --- [xecutorLoop-1-4] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692

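The key line in the log is: Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692. The client keeps trying to reconnect to the node on port 6692 and is refused every time.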
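
To see at a glance which destinations the client fails to reach, the ConnectionWatchdog warnings can be tallied per host:port. The following is a minimal Python sketch, assuming the log format shown above; the function name `count_refused` and the `sample` list are illustrative and not part of the original investigation.

```python
import re
from collections import Counter

# Matches Lettuce ConnectionWatchdog warnings such as:
#   Cannot reconnect to [192.168.0.163:6692]: Connection refused: ...
REFUSED = re.compile(r"Cannot reconnect to \[([^\]]+)\]: Connection refused")

def count_refused(lines):
    """Count 'Connection refused' warnings per host:port destination."""
    counts = Counter()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Three lines copied from the log above: one INFO, two WARN.
sample = [
    "2022-09-15 15:31:00.427 INFO 16828 --- [xecutorLoop-1-3] i.l.core.protocol.ConnectionWatchdog : Reconnecting, last destination was 192.168.0.163:6692",
    "2022-09-15 15:31:00.428 WARN 16828 --- [ioEventLoop-8-4] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692",
    "2022-09-15 15:31:00.429 WARN 16828 --- [ioEventLoop-8-5] i.l.core.protocol.ConnectionWatchdog : Cannot reconnect to [192.168.0.163:6692]: Connection refused: /192.168.0.163:6692",
]

for destination, count in count_refused(sample).items():
    print(destination, count)
```

If every refusal points at the same destination, as it does here, the problem is most likely a single node rather than the network or the client.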

*********************************************************************************************************************************************************************************************

2. Log in to the Redis cluster server and check the Redis service status

[root@iZ2ze0gm3scdypc0i15r8yZ redis-cluster]# ps -ef|grep redis
root 1718 1 78 09:36 ? 04:49:50 ./redis-server *:6695 [cluster]
root 1723 1 36 09:36 ? 02:12:52 ./redis-server *:6693 [cluster]
root 1724 1 94 09:36 ? 05:49:34 ./redis-server *:6694 [cluster]
root 1726 1 94 09:36 ? 05:48:41 ./redis-server *:6696 [cluster]
root 1727 1 35 09:36 ? 02:12:15 ./redis-server *:6691 [cluster]
root 2426 1 0 09:48 ? 00:01:16 ./redis-server *:6379
root 22212 22111 0 15:44 pts/4 00:00:00 grep --color=auto redis

There is no redis-server process listening on port 6692, so that node is down. Why did it go down?
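
Spotting the missing port by eye works for six nodes, but the same check can be scripted. The sketch below is a hedged Python example that assumes the cluster is meant to serve ports 6691 to 6696 (as in this post) and that `ps` prints processes in the format shown above; the names `EXPECTED_PORTS`, `running_cluster_ports`, and `missing_ports` are made up for illustration.

```python
import re

# Ports the cluster is expected to serve; taken from this post (6691-6696).
EXPECTED_PORTS = set(range(6691, 6697))

def running_cluster_ports(ps_output):
    """Extract the ports of redis-server processes running in cluster mode."""
    pattern = re.compile(r"redis-server \S*:(\d+) \[cluster\]")
    return {int(port) for port in pattern.findall(ps_output)}

def missing_ports(ps_output, expected=EXPECTED_PORTS):
    """Return the expected ports that have no running process, sorted."""
    return sorted(expected - running_cluster_ports(ps_output))

# The ps output from the session above.
ps_output = """\
root 1718 1 78 09:36 ? 04:49:50 ./redis-server *:6695 [cluster]
root 1723 1 36 09:36 ? 02:12:52 ./redis-server *:6693 [cluster]
root 1724 1 94 09:36 ? 05:49:34 ./redis-server *:6694 [cluster]
root 1726 1 94 09:36 ? 05:48:41 ./redis-server *:6696 [cluster]
root 1727 1 35 09:36 ? 02:12:15 ./redis-server *:6691 [cluster]
root 2426 1 0 09:48 ? 00:01:16 ./redis-server *:6379
root 22212 22111 0 15:44 pts/4 00:00:00 grep --color=auto redis
"""

print(missing_ports(ps_output))
```

The standalone instance on 6379 is ignored because its process title has no `[cluster]` suffix.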

*********************************************************************************************************************************************************************************************

3. Start the Redis node on port 6692 again and check its memory

[root@iZ2ze0gm3scdypc0i15r8yZ redis-cluster]# ./redis-cli -h 192.168.0.163 -p 6692 -c -a Tiye@123!
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.0.163:6692> info memory
# Memory
used_memory:5255993312
used_memory_human:4.90G
used_memory_rss:6011609088
used_memory_rss_human:5.60G
used_memory_peak:5255993312
used_memory_peak_human:4.90G
used_memory_peak_perc:100.00%

used_memory_overhead:23434632
used_memory_startup:1483784
used_memory_dataset:5232558680
used_memory_dataset_perc:99.58%
allocator_allocated:5225666392
allocator_active:6178586624
allocator_resident:6227939328
total_system_memory:33020043264
total_system_memory_human:30.75G
used_memory_lua:32768
used_memory_lua_human:32.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.18
allocator_frag_bytes:952920232
allocator_rss_ratio:1.01
allocator_rss_bytes:49352704
rss_overhead_ratio:0.97
rss_overhead_bytes:-216330240
mem_fragmentation_ratio:1.15
mem_fragmentation_bytes:786011688
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

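From the memory report of node 6692, used_memory is 4.90G and equals the recorded peak (used_memory_peak_perc is 100.00%). Note that this percentage is used_memory relative to used_memory_peak, not relative to a limit: maxmemory is 0, so no cap is configured on this node. With nothing to stop it from growing, I believe this memory usage is why the node went down.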
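
To make the meaning of these fields concrete, here is a small Python sketch that parses a few lines of INFO memory output and recomputes the peak percentage from used_memory and used_memory_peak. It assumes the `key:value` format shown above; `parse_info`, `peak_percent`, and `has_memory_limit` are hypothetical helper names, not Redis APIs.

```python
def parse_info(text):
    """Parse 'key:value' lines of INFO output into a dict, skipping comments and blanks."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields

def peak_percent(fields):
    """used_memory as a percentage of used_memory_peak."""
    return 100.0 * int(fields["used_memory"]) / int(fields["used_memory_peak"])

def has_memory_limit(fields):
    """maxmemory of 0 means that no limit is configured."""
    return int(fields["maxmemory"]) > 0

# Selected fields from the 6692 output above.
info_6692 = """\
# Memory
used_memory:5255993312
used_memory_peak:5255993312
maxmemory:0
maxmemory_policy:noeviction
"""

fields = parse_info(info_6692)
print(f"peak percent: {peak_percent(fields):.2f}%")
print("limit configured:", has_memory_limit(fields))
```

The percentage only compares the node with its own history since startup, so it says nothing about how close the node is to exhausting the host.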

*********************************************************************************************************************************************************************************************

4. Now look at the memory usage of nodes 6691 and 6693

[root@iZ2ze0gm3scdypc0i15r8yZ redis-cluster]# ./redis-cli -h 192.168.0.163 -p 6691 -c -a Tiye@123!
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.0.163:6691> info memory
# Memory
used_memory:312651792
used_memory_human:298.17M

used_memory_rss:317714432
used_memory_rss_human:303.00M
used_memory_peak:313941360
used_memory_peak_human:299.40M
used_memory_peak_perc:99.59%

used_memory_overhead:38853632
used_memory_startup:1483784
used_memory_dataset:273798160
used_memory_dataset_perc:87.99%
allocator_allocated:312643776
allocator_active:313520128
allocator_resident:322052096
total_system_memory:33020043264
total_system_memory_human:30.75G
used_memory_lua:32768
used_memory_lua_human:32.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:876352
allocator_rss_ratio:1.03
allocator_rss_bytes:8531968
rss_overhead_ratio:0.99
rss_overhead_bytes:-4337664
mem_fragmentation_ratio:1.02
mem_fragmentation_bytes:5103624
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:841560
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
[root@iZ2ze0gm3scdypc0i15r8yZ redis-cluster]# ./redis-cli -h 192.168.0.163 -p 6693 -c -a Tiye@123!
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.0.163:6693> info memory
# Memory
used_memory:313492000
used_memory_human:298.97M

used_memory_rss:318951424
used_memory_rss_human:304.18M
used_memory_peak:314344552
used_memory_peak_human:299.78M
used_memory_peak_perc:99.73%

used_memory_overhead:38814608
used_memory_startup:1483784
used_memory_dataset:274677392
used_memory_dataset_perc:88.04%
allocator_allocated:313483968
allocator_active:314449920
allocator_resident:322850816
total_system_memory:33020043264
total_system_memory_human:30.75G
used_memory_lua:32768
used_memory_lua_human:32.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:965952
allocator_rss_ratio:1.03
allocator_rss_bytes:8400896
rss_overhead_ratio:0.99
rss_overhead_bytes:-3899392
mem_fragmentation_ratio:1.02
mem_fragmentation_bytes:5500440
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:821056
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

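On 6691 and 6693, used_memory_peak_perc is also above 99%, but that only says each node is close to its own peak; actual usage is under 300M per node. Compared with 4.90G on 6692, memory is clearly not balanced across the cluster.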
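
The size of the imbalance can be quantified directly from the three used_memory values above. This is a short Python sketch; the dictionary is copied from the INFO output in this post, and the function names `imbalance_ratio` and `share_percent` are illustrative.

```python
# used_memory in bytes per master port, copied from the INFO memory output above.
used_memory = {6691: 312651792, 6692: 5255993312, 6693: 313492000}

def imbalance_ratio(used):
    """Ratio between the fullest and the emptiest node."""
    return max(used.values()) / min(used.values())

def share_percent(used):
    """Each node's share of the total, in percent, rounded to one decimal."""
    total = sum(used.values())
    return {port: round(100.0 * value / total, 1) for port, value in used.items()}

print(f"fullest/emptiest ratio: {imbalance_ratio(used_memory):.1f}")
for port, share in share_percent(used_memory).items():
    print(port, f"{share}%")
```

Redis Cluster assigns keys to nodes by hash slot, so a skew like this usually comes from an uneven slot distribution or from a few very large keys landing on one node. Neither was verified in this investigation, so treat both as possibilities to check rather than conclusions.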

*********************************************************************************************************************************************************************************************

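5. Fixing the Redis timeout

The fix for the timeout: clear the cache on the three master nodes 6691, 6692, and 6693. This is only a temporary workaround, and because flushall deletes every key on the node it is only acceptable in a test environment like this one. The session below shows 6691; the same command is run on the other two masters.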

[root@iZ2ze0gm3scdypc0i15r8yZ redis-cluster]# ./redis-cli -h 192.168.0.163 -p 6691 -c -a Tiye@123!
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.0.163:6691> flushall
OK
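
Running the same command on three nodes by hand is error-prone, and the `-a` option triggers the password warning seen above. The Python sketch below builds the redis-cli invocation for each master and passes the password through the `REDISCLI_AUTH` environment variable, which redis-cli reads as an alternative to `-a`. The helper names and the `./redis-cli` path are assumptions based on the sessions in this post, and nothing is executed unless `run_flush` is called explicitly.

```python
import os
import subprocess

MASTER_PORTS = [6691, 6692, 6693]  # master ports named in this post

def flush_command(host, port, cli="./redis-cli"):
    """Build the argv that flushes one node, without a password on the command line."""
    return [cli, "-h", host, "-p", str(port), "flushall"]

def auth_env(password):
    """Copy the current environment and add REDISCLI_AUTH for redis-cli."""
    env = dict(os.environ)
    env["REDISCLI_AUTH"] = password
    return env

def run_flush(host, port, password):
    """Destructive: deletes every key on the given node. Not called in this sketch."""
    return subprocess.run(flush_command(host, port), env=auth_env(password), check=True)

# Dry run: only print what would be executed.
for port in MASTER_PORTS:
    print(" ".join(flush_command("192.168.0.163", port)))
```

Keeping the dry run as the default makes it harder to wipe a node by accident.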

*********************************************************************************************************************************************************************************************

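6. Open problems and what I consider the best solution

6.1 Two problems remain

  (1) Redis memory consumption is not balanced across the cluster; (2) restarting the Redis service does not clear the cache (presumably because the data is reloaded from the persistence files on startup).

6.2 What I consider the best solution

  In redis.conf, the max memory setting is still the commented-out default: # maxmemory <bytes>

  The Redis server has 32G of RAM in total (30.75G as reported by total_system_memory). Based on experience (on an earlier project in Shanghai, maxmemory was set to a fixed value and memory consumption was balanced across all nodes in the cluster), each of the 6 Redis nodes can be given 4G of memory, that is: maxmemory 4096M (redis.conf treats m as 10^6 bytes, so write 4096mb or 4gb if exactly 4 GiB is intended).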
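
Before applying the limit it is worth checking that six nodes at 4G each actually fit on the host. The Python sketch below does that arithmetic. It assumes the size units documented in the redis.conf header (for example 1m is 1,000,000 bytes and 1mb is 1024*1024 bytes, case-insensitive); `parse_size` and `headroom` are hypothetical helpers written for this example, not part of Redis.

```python
TOTAL_SYSTEM_MEMORY = 33020043264  # total_system_memory from INFO memory above
NODE_COUNT = 6                     # cluster nodes on ports 6691-6696

# Size units as documented in the redis.conf header (case-insensitive).
UNITS = {"k": 1000, "kb": 1024, "m": 1000**2, "mb": 1024**2, "g": 1000**3, "gb": 1024**3}

def parse_size(value):
    """Convert a redis.conf size such as '4096mb' to bytes."""
    text = value.strip().lower()
    # Try two-letter suffixes first so that 'mb' is not mistaken for 'm'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * UNITS[suffix]
    return int(text)

def headroom(maxmemory, nodes=NODE_COUNT, total=TOTAL_SYSTEM_MEMORY):
    """Bytes left for the OS and other processes if every node reaches its limit."""
    return total - nodes * parse_size(maxmemory)

for setting in ("4096M", "4096mb"):
    print(setting, parse_size(setting), "bytes per node,", headroom(setting), "bytes headroom")
```

Two caveats apply. maxmemory limits the dataset as Redis accounts for it, and the resident size can be higher (used_memory_rss was 5.60G against 4.90G used on 6692), so the real headroom is smaller than this figure. Also, with maxmemory_policy set to noeviction, a node that reaches the limit rejects writes instead of evicting keys.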

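Restart the Redis cluster.

***************To be observed: whether memory usage becomes balanced across the Redis cluster, and whether the Redis service still goes down*************************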
