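When I got in this morning, I found that I could not connect to Oracle.

On the host, only a single oracleora11g process was left; all the other processes were gone.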

Nov 14 23:33:30 hs-test-10-20-30-15 kernel: INFO: task sadc:14833 blocked for more than 120 seconds.
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: sadc D 0000000000000000 0 14833 14832 0x00000084
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff88061533bdc8 0000000000000086 0000000000000000 ffff88061533bde8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff88061533bd88 ffffffff8111f3e0 ffff880528dab9d0 ffff88061533bde8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: ffff880614125af8 ffff88061533bfd8 000000000000fbc8 ffff880614125af8
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: Call Trace:
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8111f3e0>] ? find_get_pages_tag+0x40/0x130
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff811baba3>] sys_fdatasync+0x13/0x20
Nov 14 23:33:30 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: NetworkManage D 0000000000000001 0 2081 1 0x00000080
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: ffff880614185dc8 0000000000000082 0000000000000000 ffff880613b13e80
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: 0000000000000000 ffff880612e5e0d0 0000000000000000 0000000000000000
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: ffff88061464bab8 ffff880614185fd8 000000000000fbc8 ffff88061464bab8
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff811babc0>] sys_fsync+0x10/0x20
Nov 15 00:01:29 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: NetworkManage D 0000000000000001 0 2081 1 0x00000080
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff880614185dc8 0000000000000082 0000000000000000 ffff880613b13e80
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: 0000000000000000 ffff880612e5e0d0 0000000000000000 0000000000000000
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88061464bab8 ffff880614185fd8 000000000000fbc8 ffff88061464bab8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff81134c91>] ? do_writepages+0x21/0x40
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b6938>] jbd2_complete_transaction+0x68/0xb0 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02d2231>] ext4_sync_file+0x121/0x1d0 [ext4]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811baa61>] vfs_fsync_range+0xa1/0x100
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab2d>] vfs_fsync+0x1d/0x20
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811bab6e>] do_fsync+0x3e/0x60
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff811babc0>] sys_fsync+0x10/0x20
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task sadc:15210 blocked for more than 120 seconds.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Not tainted 2.6.32-431.el6.x86_64 #1
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: sadc D 0000000000000000 0 15210 15209 0x00000084
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88091ed9bdc8 0000000000000082 0000000000000000 ffff88091ed9bde8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88091ed9bd88 ffffffff8111f3e0 ffff88008f60a9d0 ffff88091ed9bde8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: ffff88061439bab8 ffff88091ed9bfd8 000000000000fbc8 ffff88061439bab8
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: Call Trace:
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8111f3e0>] ? find_get_pages_tag+0x40/0x130
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffffa02b65a5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
Nov 15 00:03:29 hs-test-10-20-30-15 kernel: [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
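
With a log this repetitive, it helps to first see which tasks were reported and how often. The following is a minimal sketch, not a standard tool: the function name `summarize_hung_tasks` and the regular expression are illustrative and assume the headline format shown in the log above.

```python
import re
from collections import Counter

# Matches the hung-task headline, e.g.
# "... kernel: INFO: task sadc:14833 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per (command, pid) pair."""
    counts = Counter()
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            counts[(match.group("comm"), int(match.group("pid")))] += 1
    return counts


if __name__ == "__main__":
    # Headlines copied from the log above; in practice, iterate over the
    # syslog file instead of a hard-coded list.
    sample = [
        "Nov 14 23:33:30 hs-test-10-20-30-15 kernel: INFO: task sadc:14833 blocked for more than 120 seconds.",
        "Nov 15 00:01:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.",
        "Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task NetworkManager:2081 blocked for more than 120 seconds.",
        "Nov 15 00:03:29 hs-test-10-20-30-15 kernel: INFO: task sadc:15210 blocked for more than 120 seconds.",
    ]
    for (comm, pid), n in summarize_hung_tasks(sample).most_common():
        print(f"{comm} (pid {pid}): {n} report(s)")
```

Every trace in this log goes through ext4_sync_file and jbd2_log_wait_commit, so the reported tasks (sadc, NetworkManager) were waiting on an ext4 journal commit. As the explanation below notes, they are more likely victims than the source of the IO load.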

Cause and troubleshooting approach:

Under heavy IO load on servers you may see something like:

INFO: task nfsd:2252 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

...probably followed by a call trace that mentions your filesystem, and probably io_schedule and sync_buffer.

This message is not an error.

It indicates that a program has had to wait for a very long time, and shows what it was doing at that point. That is not very informative about the cause: the real IO load often comes from another process.

The code behind this sits in hung_task.c and was added somewhere around 2.6.30. It is a kernel thread that detects tasks that stay in the D state (uninterruptible sleep) for a while, which typically means they are waiting for IO.

It complains when it sees a process has been waiting on IO so long that the whole process has not been scheduled for any CPU-time for 120 seconds (default).
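
To see which processes are in the D state right now, one option is to scan /proc. This is a minimal sketch under some assumptions: the helper names `parse_stat` and `uninterruptible_tasks` are made up for illustration, and it only inspects each process's main thread (individual threads live under /proc/<pid>/task).

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in uninterruptible_tasks():
            print(f"{pid}\t{comm}")
```

A single snapshot only shows a moment in time. A task that appears in the D state across many consecutive samples is the kind the hung-task detector reports.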

Notes:

  • if it happens constantly your IO system is slower than your IO use
  • most likely to happen to a process that was ioniced into the idle class. In that case ionice is working as intended: the idle class is meant as extreme politeness, and the message just indicates that something else has been doing a lot of IO for at least 120 seconds
e.g. updatedb (the victim if it was ioniced, possibly the cause if not)
  • if it happens only nightly, look at your cron jobs
  • a thrashing (heavily swapping) system can cause this, in which case it is purely a side effect of some program using too much RAM
  • being blocked by a desktop-class drive with bad sectors (because such drives retry for a long while)
  • NFS seems to be a common culprit, probably because it is good at filling the writeback cache, which implies blocking while writeback happens and is likely to block other things using the same filesystem (unverified)
  • if it happens on a fileserver, you may want to consider spreading to more fileservers, or using a parallel filesystem
if your load is fairly sequential, you may get some relief from using the noop io scheduler instead of cfq (though note that this disables ionice)
if your load is relatively random, upping the queue depth may help
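
Before changing the scheduler or queue depth, it is worth checking the current settings. The sketch below reads them from sysfs. The helper names are illustrative, and it assumes that "queue depth" refers to the block layer's nr_requests setting; SCSI devices also expose a separate device/queue_depth file.

```python
import os


def parse_scheduler(text):
    """Parse /sys/block/<dev>/queue/scheduler, e.g. 'noop deadline [cfq]'.

    Returns (active, available); the active scheduler is the bracketed one.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_queue_settings(dev, sys_root="/sys/block"):
    """Return the active scheduler, the available ones, and nr_requests."""
    queue = os.path.join(sys_root, dev, "queue")
    with open(os.path.join(queue, "scheduler")) as f:
        active, available = parse_scheduler(f.read())
    with open(os.path.join(queue, "nr_requests")) as f:
        nr_requests = int(f.read().strip())
    return {"active": active, "available": available, "nr_requests": nr_requests}


if __name__ == "__main__":
    if os.path.isdir("/sys/block"):
        for dev in sorted(os.listdir("/sys/block")):
            try:
                print(dev, read_queue_settings(dev))
            except (OSError, ValueError):
                continue  # not every block device exposes these files
```

Changing either value means writing to the same files as root. The set of available schedulers depends on the kernel version, so check the `available` list before picking one.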
