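SystemTap, a Powerful Kernel Debugging Tool: More Features and Internals (Part 3)
A Linux trace/probe tool.
Official site: https://sourceware.org/systemtap/
User space
Probing user-space programs with SystemTap needs kernel support. Kernels 3.5 and later provide it by default: utrace itself was never merged into mainline, and on those kernels SystemTap builds the equivalent functionality on the in-kernel uprobes and tracepoints.
Kernels older than 3.5 need the out-of-tree utrace patches applied by hand.
More information: http://sourceware.org/systemtap/wiki/utrace
Requirements:
debugging information for the named program
utrace support in the kernel (or, on 3.5 and later, the built-in uprobes support)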
(1) Begin/end
Probe points:
When a process/thread is created
When a process/thread exits
process.begin
process("PATH").begin
process(PID).begin
process.thread.begin
process("PATH").thread.begin
process(PID).thread.begin
process.end
process("PATH").end
process(PID).end
process.thread.end
process("PATH").thread.end
process(PID).thread.end
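The probe points above can be combined into a small lifecycle logger. This is a minimal sketch, assuming a SystemTap build with user-space probing support; the file name lifecycle.stp is hypothetical, and execname(), pid() and tid() are functions from the standard context tapset.

```systemtap
# lifecycle.stp (hypothetical file name)
# Log process and thread creation and exit.

probe process.begin {
    printf("process begin: %s pid=%d\n", execname(), pid())
}

probe process.thread.begin {
    printf("thread begin:  %s pid=%d tid=%d\n", execname(), pid(), tid())
}

probe process.thread.end {
    printf("thread end:    %s pid=%d tid=%d\n", execname(), pid(), tid())
}

probe process.end {
    printf("process end:   %s pid=%d\n", execname(), pid())
}
```

Run as stap lifecycle.stp -c CMD to restrict the probes to CMD and its descendants. Without -c or -x, a process probe that names no PATH or PID applies to all user processes.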
(2) Syscall
Probe points:
System call entry
System call return
process.syscall
process("PATH").syscall
process(PID).syscall
process.syscall.return
process("PATH").syscall.return
process(PID).syscall.return
Available context variables:
$syscall // system call number
$argN ($arg1~$arg6) // system call arguments
$return // system call return value (.return probes only)
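As an illustration of these context variables, the following sketch prints every system call made by a target command together with its return value. The file name syscalls.stp is hypothetical, and only the first two arguments are printed to keep the output short.

```systemtap
# syscalls.stp (hypothetical file name)
# Run as: stap syscalls.stp -c /bin/ls

probe process.syscall {
    # $syscall is the system call number; $arg1..$arg6 are its raw arguments.
    printf("%s[%d] syscall %d (arg1=0x%x, arg2=0x%x)\n",
           execname(), tid(), $syscall, $arg1, $arg2)
}

probe process.syscall.return {
    # $return holds the value the system call handed back to user space.
    printf("%s[%d] syscall %d = %d\n",
           execname(), tid(), $syscall, $return)
}
```

The arguments are raw register values, so pointers show up as addresses rather than as the strings or structures they point to.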
(3) Function/statement
Probe points:
Function entry
Function return
A specific line in a source file
A label inside a function
process("PATH").function("NAME")
process("PATH").statement("*@FILE.c:123")
process("PATH").function("*").return
process("PATH").function("myfunc").label("foo")
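A short sketch of the entry and return forms, assuming a binary ./test built with debug info (gcc -g) that defines a function called myfunc. Both names are placeholders, not something the probe syntax requires.

```systemtap
# Trace calls to one function in a user-space program.
# Run as: stap trace-func.stp -c ./test   (trace-func.stp is a hypothetical name)

probe process("./test").function("myfunc") {
    # pp() is the full probe point string; $$parms expands to all
    # parameters of the probed function with their current values.
    printf("-> %s %s\n", pp(), $$parms)
}

probe process("./test").function("myfunc").return {
    # $$return expands to "return=<value>" if the function returns a value.
    printf("<- %s %s\n", pp(), $$return)
}
```

$$parms and $$return rely on DWARF debug information, which is why the binary has to be built with -g.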
(4) Absolute variant
Probe points:
A virtual address in the process
process(PID).statement(ADDRESS).absolute
A non-symbolic probe point uses raw, unverified virtual addresses and provides no $variables.
The target PID parameter must identify a running process and ADDRESS must identify a valid instruction address.
This is a guru mode probe.
(5) Target process
Probe points:
Functions in shared libraries (e.g. glibc)
Target process mode (invoked with stap -c CMD or -x PID) implicitly restricts all process.* probes to the given child process.
If PATH names a shared library, all processes that map that shared library can be probed.
If dwarf debugging information is installed, try using a command with this syntax:
probe process("/lib64/libc-2.8.so").function("...") { ... }
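For instance, a library probe combined with target process mode might look like the sketch below. The library path /lib64/libc.so.6 is an assumption (check the output of ldd for the real path on a given system), and probing malloc this way assumes the symbol can be resolved from the library's symbol table or debug info.

```systemtap
# Count malloc() calls made by the target command.
# Run as: stap malloc-count.stp -c /bin/ls   (malloc-count.stp is a hypothetical name)

global calls

probe process("/lib64/libc.so.6").function("malloc") {
    calls[execname()]++
}

probe end {
    # Iterate in descending order of call count.
    foreach (name in calls-)
        printf("%-20s %d\n", name, calls[name])
}
```

Because of -c, only the child process is counted even though the library is mapped by nearly every process on the system.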
(6) Instruction probes
Probe points:
A single instruction
A block of instructions
process("PATH").insn
process(PID).insn
process("PATH").insn.block
process(PID).insn.block
The .insn probe is called for every single-stepped instruction of the process described by PID or PATH.
The .insn.block probe is called for every block-stepped instruction of the process described by PID or PATH.
Using this feature will significantly slow process execution.
Count how many instructions a process executes:
stap -e 'global steps; probe process("/bin/ls").insn {steps++}; probe end {printf("Total instruction: %d\n", steps)}' \
-c /bin/ls
(7) Usage
gcc -g3 -o test test.c
stap -L 'process("./test").function("*")' // list the functions and variables in the program
Debug levels (gcc -gLEVEL):
Request debugging information and also use level to specify how much information. The default level is 2.
Level 0 produces no debug information at all. Thus, -g0 negates -g.
Level 1 produces minimal information, enough for making backtraces in parts of the program that you don't
plan to debug. This includes descriptions of functions and external variables, but no information about local
variables and no line numbers.
Level 3 includes extra information, such as all the macro definitions present in the program.
Advanced features
(1) Building your own tapset library
A tapset is just a script that is designed for reuse by installation into a special directory.
Systemtap attempts to resolve references to global symbols (probes, functions, variables) that are not defined
within the script by a systematic search through the tapset library for scripts that define those symbols.
A user may give additional directories with the -I DIR option.
Build your own library:
1. Create a library directory mylib and add two library files
time-default.stp
function __time_value() {
    return gettimeofday_us()
}
time-common.stp
global __time_vars

function timer_begin(name) {
    __time_vars[name] = __time_value()
}

function timer_end(name) {
    return __time_value() - __time_vars[name]
}
2. Write the application script
tapset-time-user.stp
probe begin {
    timer_begin("bench")
    for (i = 0; i < 1000; i++) ;
    # __time_value() uses gettimeofday_us(), so the result is in microseconds
    printf("%d us\n", timer_end("bench"))
    exit()
}
3. Run
stap -I mylib/ tapset-time-user.stp
(2) Probe point aliases
Mainly used to provide a layer of abstraction on top of existing probe points.
Probe point aliases allow creation of new probe points from existing ones.
This is useful if the new probe points are named to provide a higher level of abstraction.
Syntax:
probe new_name = existing_name1, existing_name2[, ..., existing_nameN]
{
prepending behavior
}
Example:
probe syscallgroup.io = syscall.open, syscall.close,
                        syscall.read, syscall.write
{
    groupname = "io"
}

probe syscallgroup.process = syscall.fork, syscall.execve
{
    groupname = "process"
}

probe syscallgroup.*
{
    groups[execname() . "/" . groupname]++
}

global groups

probe end
{
    foreach (eg in groups+)
        printf("%s: %d\n", eg, groups[eg])
}
(3) Embedded C code
SystemTap provides an "escape hatch" to go beyond what the language can safely offer.
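Embedded C blocks are enclosed in %{ and %}, and the script must be run with the -g (guru mode) option.
A THIS macro gives access to function arguments and the return value (THIS->arg, THIS->__retvalue). SystemTap 1.8 and later prefer the STAP_ARG_arg and STAP_RETVALUE macros instead.
Example:
%{
#include <linux/sched.h>
#include <linux/list.h>
%}

function process_list()
%{
    struct task_struct *p;
    struct list_head *_p, *_n;

    printk("%-20s%-10s\n", "program", "pid");
    /* Walk the task list, using the current task's entry as the list head. */
    list_for_each_safe(_p, _n, &current->tasks) {
        p = list_entry(_p, struct task_struct, tasks);
        printk("%-20s%-10d\n", p->comm, p->pid);
    }
%}

probe begin {
    process_list()
    exit()
}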
stap -g embeded-c.stp
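The list of all processes then shows up in the dmesg output.
C code is enclosed in %{ ... %}. It can be a standalone block, the body of a function, or just an expression.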
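The three forms can be seen side by side in the sketch below. It assumes SystemTap 1.8 or later (for the STAP_ARG_ and STAP_RETVALUE macros); the function name add_c is hypothetical, and HZ is used only as an example of a kernel constant.

```systemtap
# Standalone block: headers needed by the embedded C further down.
%{
#include <linux/param.h>
%}

# Function body: arguments arrive as STAP_ARG_<name>,
# the result is stored in STAP_RETVALUE.
function add_c:long (a:long, b:long)
%{ /* pure */
    STAP_RETVALUE = STAP_ARG_a + STAP_ARG_b;
%}

probe begin {
    printf("3 + 4 = %d\n", add_c(3, 4))
    # Expression: the embedded C evaluates to a long in place.
    printf("HZ = %d\n", %{ /* pure */ HZ %})
    exit()
}
```

Like the previous example, this needs stap -g. The /* pure */ comment tells the translator the code has no side effects, so calls whose result is unused may be optimized away.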
(4) Existing tapset library
SystemTap ships with a very powerful tapset library by default. The main categories are:
Context Functions
Timestamp Functions
Time utility functions
Shell command functions
Memory Tapset
Task Time Tapset
Scheduler Tapset
IO Scheduler and block IO Tapset
SCSI Tapset
TTY Tapset
Interrupt Request (IRQ) Tapset
Networking Tapset
Socket Tapset
SNMP Information Tapset
Kernel Process Tapset
Signal Tapset
Errno Tapset
Device Tapset
Directory-entry (dentry) Tapset
Logging Tapset
Queue Statistics Tapset
Random functions Tapset
String and data retrieving functions Tapset
String and data writing functions Tapset
Guru tapsets
A collection of standard string functions
Utility functions for using ansi control chars in logs
SystemTap Translator Tapset
Network File Storage Tapsets
Speculation
How it works
(1) Execution flow of a SystemTap script
Pass 1 (parse)
During parsing, the script is turned into an internal parse tree.
Preprocessing is performed during this step, and the code is checked for syntax and semantic errors.
Pass 2 (elaborate)
During the elaboration step, the symbols and references in the SystemTap script are resolved.
Also, any tapsets that are referenced in the SystemTap script are imported.
Debug data read from the DWARF information (DWARF is a widely used, standardized debugging data format),
which is produced during kernel compilation, is used to find the addresses of functions and variables
referenced in the script, and allows probes to be placed inside functions.
Pass 3 (translate)
Takes the output from the elaboration phase and converts it into C source code.
Variables used by multiple probes are protected by locks. Safety checks, and any necessary locking, are
handled during the translation. The code is also converted to use the Kprobes API for inserting probe points
into the kernel.
Pass 4 (compile)
Once the SystemTap script has been translated into a C source file, the code is compiled into a module that
can be dynamically loaded and executed in the kernel.
Pass 5 (run)
Once the module is built, SystemTap loads the module into the kernel.
When the module loads, an init routine in the module starts running and begins inserting probes into their
proper locations. Hitting a probe causes execution to stop while the handler for that probe is called.
When the handler exits, normal execution continues. The module continues waiting for probes and executing
handler code until the script exits, or until the user presses Ctrl-C, at which time SystemTap removes the
probes, unloads the module, and exits.
Output from SystemTap is transferred from the kernel through a mechanism called relayfs, and sent to STDOUT.
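The passes can be observed one at a time with stap's -v and -p NUM options. The script below is only a small test subject for that; the file name passes.stp is hypothetical, and vfs_read is assumed to be a probe-able function on the running kernel with kernel debug info installed.

```systemtap
# passes.stp (hypothetical file name)
#
# stap -v  passes.stp    report each of the five passes as it completes
# stap -p2 passes.stp    stop after elaboration
# stap -p3 passes.stp    stop after translation and print the generated C code
# stap -p4 passes.stp    stop after the kernel module has been compiled

global reads

probe kernel.function("vfs_read") {
    # reads is a global shared between probes, so the translator
    # generates locking around this update in pass 3.
    reads++
}

probe timer.s(5) {
    printf("vfs_read calls in 5s: %d\n", reads)
    exit()
}
```

Reading the -p3 output next to the script is a quick way to see how handlers, locking and the probe registration code end up in the generated C.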
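(2) Script execution as seen from user space and kernel space
(3) kprobes
Breakpoint instruction: __asm INT 3, opcode 0xCC (x86).
The breakpoint interrupt (INT 3) is a software interrupt. When the CPU executes INT 3, it pushes the flags and the current program pointer (CS and EIP) onto the stack,
then calls the INT 3 handler through the interrupt vector table.
INT is the software interrupt instruction, and the interrupt vector table maps interrupt numbers to handler addresses.
INT 3 raises software interrupt 3. In real mode each vector is 4 bytes, so the handler's address is stored at: table base + 4 * 3. In protected mode the table is the IDT and each entry is 8 bytes (16 bytes in 64-bit mode).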
A Kprobe is a general purpose hook that can be inserted almost anywhere in the kernel code.
To allow it to probe an instruction, the first byte of the instruction is replaced with the breakpoint
instruction for the architecture being used. When this breakpoint is hit, Kprobe takes over execution,
executes its handler code for the probe, and then continues execution at the next instruction.
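From a script, the most direct route to this mechanism is the kprobe.function probe family, which places a kprobe by symbol name without needing DWARF debug info. The following is a minimal sketch; vfs_write is just an example symbol and is assumed to be probe-able on the running kernel.

```systemtap
# Count entries to and returns from one kernel function for 3 seconds.

global entered, returned

# Entry probe: a kprobe at the first instruction of the function.
probe kprobe.function("vfs_write") {
    entered++
}

# Return probe: implemented with a kernel return probe (kretprobe).
probe kprobe.function("vfs_write").return {
    returned++
}

probe timer.s(3) {
    printf("vfs_write: %d entries, %d returns\n", entered, returned)
    exit()
}
```

The two counts can differ slightly, since calls already in flight when the probes are inserted or removed are seen on only one side.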
(4) Kernel features it depends on
kprobes/jprobes
return probes
reentrancy
colocated (multiple)
relayfs
scalability (unlocked handlers)
user-space probes