Android ANR log trace日志文件分析
什么是ANR?
ANR:Application Not Responding,即应用无响应
ANR日志Trace文件获取
系统生成的Trace文件保存在data/anr,可以用过命令adb pull data/anr/取出
traces.txt只保留最后一次ANR的信息,Android系统有个DropBox功能功能,它能记录系统出现的crash错误.因此保留有发生过的ANR的信息.(log路径:/data/system/dropbox)
获取系统crash log: adb shell dumpsys dropbox --print >>log.txt
- Trace文件怎么生成的?
- 当APP(包括系统APP和用户APP)进程出现ANR、应用响应慢或WatchDog的监视没有得到回馈时,
- 系统会dump此时的top进程,进程中Thread的运行状态就都dump到这个Trace文件中了.
导致ANR的常见几种情况:
1:Input dispatching timed out(5 seconds) 按键或触摸事件处理超时(一般是UI主线程做了耗时的操作,这类ANR最常见)
2:BroadcastTimeout(10 seconds) 广播的分发和处理超时(一般是onReceiver执行时间过长)
3:ServiceTimeout(20 seconds) Service的启动和执行超时
另外还有ProviderTimeout和WatchDog等导致的ANR.还有当系统内存或CPU资源不足时容易出现ANR,一般这种情况会有lowmemorykill的log打印.
应用ANR产生的时候,ActivityManagerService的appNotResponding方法就会被调用,然后在/data/anr/traces.txt文件中写入ANR相关信息.
- final void appNotResponding(ProcessRecord app, ActivityRecord activity,
- ActivityRecord parent, boolean aboveSystem, final String annotation) {
- // ... ...
- if (MONITOR_CPU_USAGE) {
- updateCpuStatsNow(); // 更新CPU使用率
- }
- // ... ...
- final ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
- // dumpStackTraces是输出traces文件的函数
- File tracesFile = dumpStackTraces(true, firstPids, processCpuTracker, lastPids,
- NATIVE_STACKS_OF_INTEREST);
- String cpuInfo = null;
- if (MONITOR_CPU_USAGE) {
- updateCpuStatsNow(); // 再次更新CPU信息
- synchronized (mProcessCpuTracker) {
- // 输出ANR发生前一段时间内的CPU使用率
- cpuInfo = mProcessCpuTracker.printCurrentState(anrTime);
- }
- info.append(processCpuTracker.printCurrentLoad());
- info.append(cpuInfo);
- }
- // 输出ANR发生后一段时间内的CPU使用率
- info.append(processCpuTracker.printCurrentState(anrTime));
- Slog.e(TAG, info.toString());
- if (tracesFile == null) {
- // There is no trace file, so dump (only) the alleged culprit's threads to the log
- Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
- }
- // 将ANR信息同时输出到DropBox中
- addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
- cpuInfo, tracesFile, null);
- // ... ...
- synchronized (this) {
- // 显示ANR提示对话框
- // Bring up the infamous App Not Responding dialog
- Message msg = Message.obtain();
- HashMap<String, Object> map = new HashMap<String, Object>();
- msg.what = SHOW_NOT_RESPONDING_MSG;
- msg.obj = map;
- msg.arg1 = aboveSystem ? 1 : 0;
- map.put("app", app);
- if (activity != null) {
- map.put("activity", activity);
- }
- mUiHandler.sendMessage(msg);
- }
- }
避免ANR?
UI线程尽量只做跟UI相关的工作
耗时的工作(比如数据库操作,I/O,连接网络或者别的有可能阻碍UI线程的操作)把它放入单独的线程处理
尽量用Handler来处理UIthread和别的thread之间的交互
分析ANR的Log:
出现ANR的log如下:
- 06-22 10:37:46.204 3547 3604 E ActivityManager: ANR in org.code:MessengerService // ANR出现的进程包名
- E ActivityManager: PID: 17027 // ANR进程ID
- E ActivityManager: Reason: executing service org.code/.ipc.MessengerService //导致ANR的原因
- E ActivityManager: Load: 8.31 / 8.12 / 8.47
- E ActivityManager: CPU usage from 0ms to 6462ms later: //CPU在ANR发生后的使用情况
- E ActivityManager: 61% 3547/system_server: 21% user + 39% kernel / faults: 13302 minor 2 major
- E ActivityManager: 0.2% 475/debuggerd: 0% user + 0.1% kernel / faults: 6086 minor 1 major
- E ActivityManager: 10% 5742/com.android.phone: 5.1% user + 5.1% kernel / faults: 7597 minor
- E ActivityManager: 6.9% 5342/com.android.systemui: 2.1% user + 4.8% kernel / faults: 4158 minor
- E ActivityManager: 0.1% 477/debuggerd64: 0% user + 0.1% kernel / faults: 4013 minor
- E ActivityManager: 5.7% 16313/org.code: 3.2% user + 2.4% kernel / faults: 2412 minor
- E ActivityManager: 3.7% 17027/org.code:MessengerService: 1.7% user + 2% kernel / faults: 2571 minor 6 major
- E ActivityManager: 2.6% 306/cfinteractive: 0% user + 2.6% kernel
- ... ...
- E ActivityManager: +0% 17168/kworker/0:1: 0% user + 0% kernel
- E ActivityManager: 0% TOTAL: 0% user + 0% kernel + 0% softirq // CUP占用情况
- E ActivityManager: CPU usage from 5628ms to 6183ms later:
- E ActivityManager: 42% 3547/system_server: 17% user + 24% kernel / faults: 11 minor
- E ActivityManager: 12% 3604/ActivityManager: 1.7% user + 10% kernel
- E ActivityManager: 12% 3609/android.display: 8.7% user + 3.5% kernel
- E ActivityManager: 3.5% 5304/Binder_6: 1.7% user + 1.7% kernel
- E ActivityManager: 3.5% 5721/Binder_A: 1.7% user + 1.7% kernel
- E ActivityManager: 3.5% 5746/Binder_C: 3.5% user + 0% kernel
- E ActivityManager: 1.7% 3599/Binder_1: 0% user + 1.7% kernel
- E ActivityManager: 1.7% 3600/Binder_2: 0% user + 1.7% kernel
- I ActivityManager: Killing 17027:org.code:MessengerService/u0a85 (adj 1): bg anr
- I art : Wrote stack traces to '/data/anr/traces.txt' //art这个TAG打印对traces操作的信息
- D GraphicsStats: Buffer count: 9
- W ActivityManager: Scheduling restart of crashed service org.code/.ipc.MessengerService in 1000ms
log打印了ANR的基本信息,我们可以分析CPU使用率推测ANR发生的时候设备在做什么工作;如果CPU使用率很高,接近100%,可能是在进行大规模的计算更可能是陷入死循环;如果CUP使用率很低,说明主线程被阻塞了,并且当IOwait很高,可能是主线程在等待I/O操作的完成.
对于ANR只是分析Log很难知道问题所在,我们还需要通过Trace文件分析stack调用情况.
- ----- pid 17027 at 2017-06-22 10:37:39 ----- // ANR出现的进程pid和时间(和上述log中pid一致)
- Cmd line: org.code:MessengerService // ANR出现的进程名
- Build fingerprint: 'Homecare/qucii8976v3_64:6.0.1/pansen06141150:eng/test-keys' // 下面记录系统版本,内存等状态信息
- ABI: 'arm64'
- Build type: optimized
- Zygote loaded classes=6576 post zygote classes=13
- Intern table: 13780 strong; 17 weak
- JNI: CheckJNI is on; globals=283 (plus 158 weak)
- Libraries: /system/lib64/libandroid.so /system/lib64/libcompiler_rt.so /system/lib64/libjavacrypto.so /system/lib64/libjnigraphics.so /system/lib64/libmedia_jni.so /system/lib64/libwebviewchromium_loader.so libjavacore.so (7)
- Heap: 29% free, 8MB/12MB; 75731 objects
- Dumping cumulative Gc timings
- Total number of allocations 75731
- Total bytes allocated 8MB
- Total bytes freed 0B
- Free memory 3MB
- Free memory until GC 3MB
- Free memory until OOME 183MB
- Total memory 12MB
- Max memory 192MB
- Zygote space size 3MB
- Total mutator paused time: 0
- Total time waiting for GC to complete: 0
- Total GC count: 0
- Total GC time: 0
- Total blocking GC count: 0
- Total blocking GC time: 0
- suspend all histogram: Sum: 76us 99% C.I. 0.100us-28us Avg: 7.600us Max: 28us
- DALVIK THREADS (15):
- // Signal Catcher是记录traces信息的线程
- // Signal Catche(线程名)、(daemon)表示守护进程、prio(线程优先级,默认是5)、tid(线程唯一标识ID)、Runnable(线程当前状态)
- "Signal Catcher" daemon prio=5 tid=3 Runnable
- //线程组名称、suspendCount、debugSuspendCount、线程的Java对象地址、线程的Native对象地址
- | group="system" sCount=0 dsCount=0 obj=0x12d8f0a0 self=0x5598ae55d0
- //sysTid是线程号(主线程的线程号和进程号相同)
- | sysTid=17033 nice=0 cgrp=default sched=0/0 handle=0x7fb2350450
- | state=R schedstat=( 4348125 172343 3 ) utm=0 stm=0 core=1 HZ=100
- | stack=0x7fb2254000-0x7fb2256000 stackSize=1013KB
- | held mutexes= "mutator lock"(shared held)
- native: #00 pc 0000000000489e28 /system/lib64/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int, char const*, art::ArtMethod*, void*)+236)
- native: #01 pc 0000000000458fe8 /system/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+220)
- native: #02 pc 0000000000465bc8 /system/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+688)
- native: #03 pc 0000000000466ae0 /system/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*)+276)
- native: #04 pc 000000000046719c /system/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+188)
- native: #05 pc 0000000000467a84 /system/lib64/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+492)
- native: #06 pc 0000000000431194 /system/lib64/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+96)
- native: #07 pc 000000000043e604 /system/lib64/libart.so (art::SignalCatcher::HandleSigQuit()+1256)
- native: #08 pc 000000000043f214 /system/lib64/libart.so (art::SignalCatcher::Run(void*)+452)
- native: #09 pc 0000000000068714 /system/lib64/libc.so (__pthread_start(void*)+52)
- native: #10 pc 000000000001c604 /system/lib64/libc.so (__start_thread+16)
- (no managed stack frames)
- //main(线程名)、prio(线程优先级,默认是5)、tid(线程唯一标识ID)、Sleeping(线程当前状态)
- "main" prio=5 tid=1 Sleeping
- | group="main" sCount=1 dsCount=0 obj=0x73132d10 self=0x5598a5f5e0
- //sysTid是线程号(主线程的线程号和进程号相同)
- | sysTid=17027 nice=0 cgrp=default sched=0/0 handle=0x7fb6db6fe8
- | state=S schedstat=( 420582038 5862546 143 ) utm=24 stm=18 core=6 HZ=100
- | stack=0x7fefba3000-0x7fefba5000 stackSize=8MB
- | held mutexes=
- // java 堆栈调用信息(这里可查看导致ANR的代码调用流程)(分析ANR最重要的信息)
- at java.lang.Thread.sleep!(Native method)
- - sleeping on <0x0c60f3c7> (a java.lang.Object)
- at java.lang.Thread.sleep(Thread.java:1031)
- - locked <0x0c60f3c7> (a java.lang.Object) // 锁住对象0x0c60f3c7
- at java.lang.Thread.sleep(Thread.java:985)
- at android.os.SystemClock.sleep(SystemClock.java:120)
- at org.code.ipc.MessengerService.onCreate(MessengerService.java:63) //导致ANR的代码
- at android.app.ActivityThread.handleCreateService(ActivityThread.java:2877)
- at android.app.ActivityThread.access$1900(ActivityThread.java:150)
- at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1427)
- at android.os.Handler.dispatchMessage(Handler.java:102)
- at android.os.Looper.loop(Looper.java:148)
- at android.app.ActivityThread.main(ActivityThread.java:5417)
- at java.lang.reflect.Method.invoke!(Native method)
- at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726)
- at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
在log中显示的pid在traces文件中与之对应,trace log中会打印当前所有线程的堆栈信息,一般我们主要关注main线程的堆栈(也有分析其他线程的情况,通过分析ANR发生时系统状态推测出设备正在进行操作)
而这里然后通过查看堆栈调用信息分析ANR的代码.上述ANR实际上在org.code.ipc.MessengerService.onCreate中63行调用SystemClock.sleep(10000)代码导致的;这是比较简单ANR示例.
以上仅为解决ANR问题提供一个思路,具体问题还需要根据实际情况具体分析
线程状态的分类: java中定义的线程状态:
- // libcore/libart/src/main/java/java/lang/Thread.java
- /**
- * A representation of a thread's state. A given thread may only be in one
- * state at a time.
- */
- public enum State {
- /**
- * The thread has been created, but has never been started.
- */
- NEW,
- /**
- * The thread may be run.
- */
- RUNNABLE,
- /**
- * The thread is blocked and waiting for a lock.
- */
- BLOCKED,
- /**
- * The thread is waiting.
- */
- WAITING,
- /**
- * The thread is waiting for a specified amount of time.
- */
- TIMED_WAITING,
- /**
- * The thread has been terminated.
- */
- TERMINATED
- }
C代码中定义的线程状态:
- // /art/runtime/thread_state.h
- enum ThreadState {
- // Thread.State JDWP state
- kTerminated = 66, // TERMINATED TS_ZOMBIE Thread.run has returned, but Thread* still around
- kRunnable, // RUNNABLE TS_RUNNING runnable
- kTimedWaiting, // TIMED_WAITING TS_WAIT in Object.wait() with a timeout
- kSleeping, // TIMED_WAITING TS_SLEEPING in Thread.sleep()
- kBlocked, // BLOCKED TS_MONITOR blocked on a monitor
- kWaiting, // WAITING TS_WAIT in Object.wait()
- kWaitingForGcToComplete, // WAITING TS_WAIT blocked waiting for GC
- kWaitingForCheckPointsToRun, // WAITING TS_WAIT GC waiting for checkpoints to run
- kWaitingPerformingGc, // WAITING TS_WAIT performing GC
- kWaitingForDebuggerSend, // WAITING TS_WAIT blocked waiting for events to be sent
- kWaitingForDebuggerToAttach, // WAITING TS_WAIT blocked waiting for debugger to attach
- kWaitingInMainDebuggerLoop, // WAITING TS_WAIT blocking/reading/processing debugger events
- kWaitingForDebuggerSuspension, // WAITING TS_WAIT waiting for debugger suspend all
- kWaitingForJniOnLoad, // WAITING TS_WAIT waiting for execution of dlopen and JNI on load code
- kWaitingForSignalCatcherOutput, // WAITING TS_WAIT waiting for signal catcher IO to complete
- kWaitingInMainSignalCatcherLoop, // WAITING TS_WAIT blocking/reading/processing signals
- kWaitingForDeoptimization, // WAITING TS_WAIT waiting for deoptimization suspend all
- kWaitingForMethodTracingStart, // WAITING TS_WAIT waiting for method tracing to start
- kWaitingForVisitObjects, // WAITING TS_WAIT waiting for visiting objects
- kWaitingForGetObjectsAllocated, // WAITING TS_WAIT waiting for getting the number of allocated objects
- kStarting, // NEW TS_WAIT native thread started, not yet ready to run managed code
- kNative, // RUNNABLE TS_RUNNING running in a JNI native method
- kSuspended, // RUNNABLE TS_RUNNING suspended by GC or debugger
- };
两者可以看出在C代码中定义更为详细,Traces中显示的线程状态都是C代码定义的.我们可以通过查看线程状态对应的信息分析ANR问题.
如: TimedWaiting对应的线程状态是TIMED_WAITING
kTimedWaiting, // TIMED_WAITING TS_WAIT in Object.wait() with a timeout 执行了无超时参数的wait函数
kSleeping, // TIMED_WAITING TS_SLEEPING in Thread.sleep() 执行了带有超时参数的sleep函数
- ZOMBIE 线程死亡,终止运行
- RUNNING/RUNNABLE 线程可运行或正在运行
- TIMED_WAIT 执行了带有超时参数的wait、sleep或join函数
- MONITOR 线程阻塞,等待获取对象锁
- WAIT 执行了无超时参数的wait函数
- INITIALIZING 新建,正在初始化,为其分配资源
- STARTING 新建,正在启动
- NATIVE 正在执行JNI本地函数
- VMWAIT 正在等待VM资源
- SUSPENDED 线程暂停,通常是由于GC或debug被暂停
Android ANR log trace日志文件分析的更多相关文章
- /VAR/LOG/各个日志文件分析
/VAR/LOG/各个日志文件分析 author:headsen chen 2017-10-24 18:00:24 部分内容取自网上搜索,部分内容为自己整理的,特此声明. 1. /v ...
- [转载]mysql慢日志文件分析处理
原文地址:mysql慢日志文件分析处理作者:maxyicha mysql有一个功能就是可以log下来运行的比较慢的sql语句,默认是没有这个log的,为了开启这个功能,要修改my.cnf或者在mysq ...
- loadrunner脚本中写入脚本输出log到外部文件,分析参数取值方式
loadrunner脚本中写入脚本输出log到外部文件,分析参数取值方式 分类: 心得 loadrunner 我的测试 2012-04-01 12:52 2340人阅读 评论(0) 收藏 举报 脚本l ...
- Linux没有/var/log/messages日志文件
1.新安装的CentOS8没有/var/log/messages日志文件: 安装rsyslog: dnf install -y rsyslog 或 yum install -y rsys ...
- Android ANR(应用无响应)解决分析【转】
本文转载自:https://blog.csdn.net/u014630142/article/details/81709459 来自: http://blog.csdn.net/tjy1985/art ...
- Android APP测试的日志文件抓取
1 log文件分类简介 实时打印的主要有:logcat main,logcat radio,logcat events,tcpdump,还有高通平台的还会有QXDM日志 状态信息的有: ...
- adb 之日志文件分析(五)
一,logcat日志文件 1,addroid日志系统提供了记录和查看系统调试信息的功能,日志都是从各种软件和一些系统的缓冲区(内存)中记录下来的,缓冲区可以通过logcat命令来查看和使用 2,在开发 ...
- 数据库报ORA-00600: 内部错误代码, 参数: [17059],并产生大量trace日志文件
用户反馈数据库服务器磁盘空间使用耗尽. 登录服务器后查看,发现数据库产生大量的trace日志,并在alert日志中发现ora-600错误 alert日志信息: Fri Jul :: Errors ): ...
- Linux日志文件分析
---恢复内容开始--- 日志保存位置 默认 var/log目录下 主要日志文件 内核及公共消息日志:message 计划任务日志:cron 系统殷桃日志:demsg 邮件系统日志:maillog 用 ...
随机推荐
- vavr:让你像写Scala一样写Java
本文阅读时间大约7分钟. Hystrix是Netflix开源的限流.熔断降级组件,去年发现Hystrix已经不再更新了,而在github主页上将我引导到了另一个替代项目--resilience4j,这 ...
- Python之路(第四十三篇)线程的生命周期、全局解释器锁
一.线程的生命周期(新建.就绪.运行.阻塞和死亡) 当线程被创建并启动以后,它既不是一启动就进入执行状态的,也不是一直处于执行状态的,在线程的生命周期中,它要经过新建(new).就绪(Ready).运 ...
- springboot 单元测试 指定启动类
问题 在做单元测试时,写了一个工具类,用于注入spring的上下文. public class AppBeanUtil implements ApplicationContextAware { pri ...
- linux cgroups简介(下)Cgroups 与 Systemd
Cgroups 是 linux 内核提供的一种机制,如果你还不了解 cgroups,请参考前文<Linux cgroups 简介>先了解 cgroups.当 Linux 的 init 系统 ...
- C语言printf函数转换说明表及其修饰符表
请求printf()函数打印数据的指令要与打印数据的类型匹配. 例如, 打印整数时使用%d, 打印字符是使用%c. 这些符号被称为转换说明, 他们指定了数据转换称可显示的形式. 1. ANSI C标准 ...
- Win 8\10系统安装软件报错
一.问题 Win 8\10系统由于权限管理严格,右键管理员运行安装软件也会出现各种错误: 进度条走到20%闪退,报程序停止运行等等等等 二.解决思路 先确定软件路径安装在D盘(非系统盘). 开启超级管 ...
- Win10 家庭版 VMware 无法启动 解决办法
引发原因 最近更新了一个补丁 KB4524147 安装后会导致 VM 无法打开(如果你没有安装hyper-v的话) 解决方案 控制面板 -> 程序 -> 查看已安装的更新 -> 找到 ...
- Fiddler无法抓取HTTPS的问题,Fiddler证书无法安装终极解决方案,
win7下Fiddler证书安装之后,总是无法抓取https的包:网上搜了很多方法都没解决问题,最终摸索解决方法如下: 第一步: 安装证书:到Fiddler的Tools-options-https下, ...
- pip下载加速
安装pqi pip install pqi pqi回车 pqi ls pqi tuna pqi show pip install --upgrade pqi git链接 https://github. ...
- 临时加一条关于bootstrap的菜单栏方面的
**有些生疏,记住了**aria-expanded表示展开状态.默认为undefined, 表示当前展开状态未知.其它可选值:true表示元素是展开的:false表示元素不是展开的. aria-hid ...