inux and UNIX® like operating systems commonly use signals to communicate between processes. The use of the command line kill is widely known. WebSphere Application Servers on Linux and UNIX by default respond to kill -3 by producing a javacore, and to kill -11 by creating s system core and exiting. There are in fact a lot of signals that may be sent and acted on.

In some cases, we determine that a signal has unexpectedly come to a WebSphere Application Server and we need to determine which process/user sent the signal. This is possible in most cases with strace command for kill -3, but kill -9 and kill -11 are not usually reported.

The strace utility is fairly universal and starting it with this line will generally find the source of kill -3 and so on:

strace -tt -o /tmp/traceit -p <pid> &

This results in volumes of output that do include the source of most signals:

strace -tt -o /tmp/traceit -p <pid> &
16:08:45.388961 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
16:08:45.389113 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=21398, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---

16:09:01.210200 --- SIGTTOU {si_signo=SIGTTOU, si_code=SI_USER, si_pid=829, si_uid=1000} ---

In case you do not recognize SIGTTOU   use kill -l to list signals on your environment:

kill -l

1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP

6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1

11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM

16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP

21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ

26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR

31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3

38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8

43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13

48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12

53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7

58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2

63) SIGRTMAX-1 64) SIGRTMAX

which may be a surprise if you never used anything but 3, 9, and 11.   kill -22 is SIGTHOU and the process id and userid of the sender are listed. Unfortunately, most of the time strace does not show kill -9 and kill -11 as they are not trapped and all you get is this line:

++++  killed by SIGKILL  +++

There are 2 available tools that are not usually installed and/or active on Linux but have so much functionality, they should be. These tools are included in the Linux repositories for the RHEL, SUSE, and Fedora distributions and are installed as any other software package would be using the usual Linux install tools. Since they are very functional at the system level, root or elevated access rights are needed. However, the install process is quite simple and the functionality is worthwhile.

AUDIT

Auditd is a daemon process or service that does as the name implies and produces audit logs of System level activities. It is installed from the usual repository as the audit package and then is configured in /etc/audit/auditd.conf and the rules are in /etc/audit/audit.rules.

Example entry for kill signal logging:

-a entry,always -F arch=b64 -S kill -k kill_signals

then the command: sevice auditd start

will log all signals in /ver/audit/audit.log with a key of kill_signals for searching by your favorite editor or you may use ausearch -k kill_signals

Of course, this example captures all signals and is quite verbose. The usual output will look like this:

time->Wed Jun  3 16:34:08 2015
type=SYSCALL msg=audit(1433363648.091:6342): arch=c000003e syscall=62 success=no exit=-3 a0=1e06 a1=0 a2=1e06 a3=fffffffffffffff0 items=0 ppid=10044 pid=10140 auid=500 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=2 comm=4174746163682041504920696E6974 exe="/opt/ibm/WebSphere/AppServer/java/jre/bin/java" subj=unconfined_u:unconfined_r:unconfined_java_t:s0-s0:c0.c1023 key="kill_signals"
----
time->Wed Jun  3 16:34:08 2015
type=OBJ_PID msg=audit(1433363648.130:6343): opid=27307 oauid=-1 ouid=0 oses=-1 obj=system_u:system_r:initrc_t:s0 ocomm="symcfgd"
type=SYSCALL msg=audit(1433363648.130:6343): arch=c000003e syscall=62 success=yes exit=0 a0=6aab a1=12 a2=f a3=50d items=0 ppid=1 pid=27214 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sav-limitcpu" exe="/usr/bin/sav-limitcpu" subj=system_u:system_r:initrc_t:s0 key="kill_signals"
----
time->Wed Jun  3 16:34:08 2015

Stop the logging with service auditd stop command and see this link from RedHat for more information: How to use audit to monitor a specific SYSCALL

System Tap

This tool is relatively more complex and flexible than the audit tool. The tool provide probe and taps that are written in a script that is remarkably C like. It is similar to Dtrace on Solaris in that regard. It is also similar to Dtrace in that it offers a lot of probes to look at performance and memory as well as network activity. It too is easily installed (for example on RHEL yum install systemtap does it). Root access does seem to be required. Good news, it comes with a set of taps that will perform a comprehesive set of tracing. These live in /usr/share/systemtap. Root access is required or you may be a member of a group with the privileges.

The basic command:

stap sigkill.stp gets very verbose

even on lab systems while the same script can be filtered. An example to trace kill commands for a specific pid and a specific command:

stap sigkill.stp -x <pid> SIGKILL

which logs:

SIGKILL was sent to java (pid:<pid>) by bash uid:0

on testing on a command sent from the command line.

So you do need the script sigkill.stp which is created by RedHat and looks like this:

#! /usr/bin/env stap
# sigkill.stp
# Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
# /usr/share/systemtap/tapset/signal.stp:
# [...]
# probe signal.send = _signal.send.*
# {
#     sig=$sig
#     sig_name = _signal_name($sig)
#     sig_pid = task_pid(task)
#     pid_name = task_execname(task)
# [...]
probe signal.send {
  if (sig_name == "SIGKILL")
    printf("%s was sent to %s (pid:%d) by %s uid:%d\n",
           sig_name, pid_name, sig_pid, execname(), uid())
}

Here is a very useful link for System Tap. It shows some useful tools for tracking down most signals (strace) or all of them (audit and system tap):
Red Hat Enterprise Linux 6 SystemTap Beginners Guide Introduction to SystemTap

https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/Finding_the_source_of_signals_on_Linux_with_strace_auditd_or_Systemtap?lang=en

Finding the source of signals on Linux with strace, auditd, or systemtap的更多相关文章

  1. Linux利器 strace [看出process呼叫哪個system call]

    Linux利器 strace strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必 ...

  2. linux神器strace

    man strace: strace - trace system calls and signals DESCRIPTION In the simplest case strace runs the ...

  3. linux神器 strace解析

    除了人格以外,人最大的损失,莫过于失掉自信心了. 前言 strace可以说是神器一般的存在了,对于研究代码调用,内核级调用.系统级调用有非常重要的作用.打算了一周了,只有原文,一直没有梳理,拖延症犯了 ...

  4. linux申请strace ,lstrace, ptrace, dtrace

    ltrace命令是用来跟踪进程调用库函数的情况. ltrace -hUsage: ltrace [option ...] [command [arg ...]]Trace library calls ...

  5. Linux 的 strace 命令

    https://linux.cn/article-3935-1.html http://www.cnblogs.com/ggjucheng/archive/2012/01/08/2316692.htm ...

  6. Linux调试工具strace和gdb常用命令小结

    strace和gdb是Linux环境下的两个常用调试工具,这里是个人在使用过程中对这两个工具常用参数的总结,留作日后查看使用. strace调试工具 strace工具用于跟踪进程执行时的系统调用和所接 ...

  7. linux下strace命令详解

    简介 strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核 ...

  8. 使用 Linux 的 strace 命令跟踪/调试程序的常用选项

    原文:http://linoxide.com/linux-command/linux-strace-command-examples/作者: Raghu 在调试的时候,strace能帮助你追踪到一个程 ...

  9. Linux利器strace

    strace常用来跟踪进程执行时的系统调用和所接收的信号. 在Linux世界,进程不能直接访问硬件设备,当进程需要访问硬件设备(比如读取磁盘文件,接收网络数据等等)时,必须由用户态模式切换至内核态模式 ...

随机推荐

  1. 面试题----gcc的编译流程

    gcc编译流程 一.    编译与处理指令: gcc -E hello.c -o a.c 如果不使用-o指定输出的文件,会默认输出到终端.所以建议使用同时使用-o选项. 还要注意:编译时会保留#pra ...

  2. ELK常用命令

    1.查询当前所有的索引 #curl 'localhost:9200/_cat/indices?v' 2.查看集群健康状态 #curl 'localhost:9200/_cat/health?v' 绿色 ...

  3. 基于opencv将视频转化为字符串Java版

    基于opencv将视频转化为字符串Java版 opencv java  先上一个效果图吧 首先,弄清一下原理 我们要将视频转化为字符画,那么就需要获取画面的每一帧,也就是每一张图片,然后将图片进行转化 ...

  4. 【angular5项目积累总结】侧栏菜单 navmenu

    View Code import { Component, OnInit } from '@angular/core'; import { HttpClient } from '@angular/co ...

  5. 深入出不来nodejs源码-V8引擎初探

    原本打算是把node源码看得差不多了再去深入V8的,但是这两者基本上没办法分开讲. 与express是基于node的封装不同,node是基于V8的一个应用,源码内容已经渗透到V8层面,因此这章简述一下 ...

  6. ASP.NET MVC显示异常信息

    开发ASP.NET多了,它的异常信息显示也习惯了.但在ASP.NET MVC中,却是另外一番情形. 以前只习惯使用IE浏览器,现在开发ASP.NET MVC程序,为了捕获到异常信息,Firefox的f ...

  7. JavaScript学习总结(五)——jQuery插件开发与发布

    jQuery插件就是以jQuery库为基础衍生出来的库,jQuery插件的好处是封装功能,提高了代码的复用性,加快了开发速度,现在网络上开源的jQuery插件非常多,随着版本的不停迭代越来越稳定好用, ...

  8. SqlServer 使用sys.dm_tran_locks处理死锁问题

    1.模拟资源锁定 --开始事务BEGIN TRANSACTION--更新数据update Table_1 set FuncName=FuncName--等待1分钟WAITFOR DELAY '01:0 ...

  9. 深入理解Java线程池:ThreadPoolExecutor

    线程池介绍 在web开发中,服务器需要接受并处理请求,所以会为一个请求来分配一个线程来进行处理.如果每次请求都新创建一个线程的话实现起来非常简便,但是存在一个问题: 如果并发的请求数量非常多,但每个线 ...

  10. eclipse .properties插件

    资源文件 即 .properties 文件是常用于国际化: eclipse默认的 .properties 文件编辑器有几个问题: 编码问题 多种语言同步问题 下面介绍2种eclipse的 .prope ...