http://blog.tanelpoder.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/

by TANEL PODER posted on SEPTEMBER 2, 2008

I have written two posts in one, about a performance issue with writes in system tablespace and introduction of a little DTrace stack sampling script.

Have you noticed that DML on tables residing in SYSTEM tablespace is slower than tables in other tablespaces?

Here’s an example, I’ll create two similar tables, one in USERS tablespace, other in SYSTEM, and inset into the first one (Oracle 10.2.0.3 on Solaris x64):

SQL> create table t1(a int) tablespace USERS;

Table created.

SQL> create table t2(a int) tablespace SYSTEM;

Table created.

SQL> exec for i in 1..100000 loop insert into t1 values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.09

Insert into table in USERS tablespace took 3 seconds.

Ok, let’s commit and flush dirty buffers that they wouldn’t get on the way of next insert.

SQL> commit;

Commit complete.

Elapsed: 00:00:00.02

SQL> alter system checkpoint; -- checkpointing to flush dirty buffers from previous inserts

System altered.

Elapsed: 00:00:01.34

SQL>

And now to the insert into the SYSTEM tablespace table:

SQL> exec for i in 1..100000 loop insert into t2 values(i); end loop;

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.98

SQL>

What?! The same insert took 3 times longer, almost 9 seconds?

Fine! Let’s troubleshoot it!

First (as usual) I start from my Snapper for getting a performance snapshot of the session. I ran both inserts again and used Snapper from another session to monitor the insert activity (for brevity I didn’t monitor all stat counters, but used the sinclude filter to leave only various “gets” in from V$SESSTAT).

This is the insert to USERS tablespace table:

SQL> @snapper out,gather=tsw,sinclude=gets 10 1 148

-- Session Snapper v1.07 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------

HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,  DELTA/SEC,     HDELTA, HDELTA/SEC

---------------------------------------------------------------------------------------------------------------------------------------------

DATA,     148, 20080901 17:20:53,       10, STAT, db block gets                           ,        103448,      10345,    103.45k,     10.34k

DATA,     148, 20080901 17:20:53,       10, STAT, db block gets from cache                ,        103448,      10345,    103.45k,     10.34k

DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets                         ,           495,         49,        495,       49.5

DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets from cache              ,           495,         49,        495,       49.5

DATA,     148, 20080901 17:20:53,       10, STAT, consistent gets - examination           ,           219,         22,        219,       21.9

DATA,     148, 20080901 17:20:53,       10, STAT, no work - consistent read gets          ,            89,          9,         89,        8.9

DATA,     148, 20080901 17:20:53,       10, STAT, rollbacks only - consistent read gets   ,            18,          2,         18,        1.8

DATA,     148, 20080901 17:20:53,       10, STAT, cluster key scan block gets             ,            65,          6,         65,        6.5

DATA,     148, 20080901 17:20:53,       10, TIME, hard parse elapsed time                 ,          4090,        409,     4.09ms,      409us

DATA,     148, 20080901 17:20:53,       10, TIME, PL/SQL compilation elapsed time         ,          3139,        314,     3.14ms,    313.9us

DATA,     148, 20080901 17:20:53,       10, TIME, parse time elapsed                      ,          5131,        513,     5.13ms,    513.1us

DATA,     148, 20080901 17:20:53,       10, TIME, PL/SQL execution elapsed time           ,        174288,      17429,   174.29ms,    17.43ms

DATA,     148, 20080901 17:20:53,       10, TIME, DB CPU                                  ,       3108026,     310803,      3.11s,    310.8ms

DATA,     148, 20080901 17:20:53,       10, TIME, sql execute elapsed time                ,       3104607,     310461,       3.1s,   310.46ms

DATA,     148, 20080901 17:20:53,       10, TIME, hard parse (sharing criteria) elapsed ti,          4110,        411,     4.11ms,      411us

DATA,     148, 20080901 17:20:53,       10, TIME, DB time                                 ,       3108026,     310803,      3.11s,    310.8ms

DATA,     148, 20080901 17:20:53,       10, WAIT, log file switch completion              ,         71104,       7110,     71.1ms,     7.11ms

DATA,     148, 20080901 17:20:53,       10, WAIT, SQL*Net message to client               ,             3,          0,        3us,       .3us

DATA,     148, 20080901 17:20:53,       10, WAIT, SQL*Net message from client             ,       6684030,     668403,      6.68s,    668.4ms

--  End of snap 1

PL/SQL procedure successfully completed.

You see, the DB CPU equals to DB time (in other words all the time spent servicing my request in database was spent on CPU, thus there were no waits, blocks involved). And the insert still took roughly 3 seconds.

This is the insert to SYSTEM tablespace table:

SQL> @snapper out,gather=tsw,sinclude=gets 10 1 148

-- Session Snapper v1.07 by Tanel Poder ( http://www.tanelpoder.com )

---------------------------------------------------------------------------------------------------------------------------------------------

HEAD,     SID, SNAPSHOT START   ,  SECONDS, TYPE, STATISTIC                               ,         DELTA,  DELTA/SEC,     HDELTA, HDELTA/SEC

---------------------------------------------------------------------------------------------------------------------------------------------

DATA,     148, 20080901 17:21:23,       11, STAT, db block gets                           ,        102616,       9329,    102.62k,      9.33k

DATA,     148, 20080901 17:21:23,       11, STAT, db block gets from cache                ,        102616,       9329,    102.62k,      9.33k

DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets                         ,           384,         35,        384,      34.91

DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets from cache              ,           384,         35,        384,      34.91

DATA,     148, 20080901 17:21:23,       11, STAT, consistent gets - examination           ,           218,         20,        218,      19.82

DATA,     148, 20080901 17:21:23,       11, STAT, no work - consistent read gets          ,            94,          9,         94,       8.55

DATA,     148, 20080901 17:21:23,       11, STAT, rollbacks only - consistent read gets   ,            17,          2,         17,       1.55

DATA,     148, 20080901 17:21:23,       11, STAT, cluster key scan block gets             ,            65,          6,         65,       5.91

DATA,     148, 20080901 17:21:23,       11, TIME, hard parse elapsed time                 ,          2350,        214,     2.35ms,   213.64us

DATA,     148, 20080901 17:21:23,       11, TIME, PL/SQL compilation elapsed time         ,          1751,        159,     1.75ms,   159.18us

DATA,     148, 20080901 17:21:23,       11, TIME, parse time elapsed                      ,          3359,        305,     3.36ms,   305.36us

DATA,     148, 20080901 17:21:23,       11, TIME, PL/SQL execution elapsed time           ,        172153,      15650,   172.15ms,    15.65ms

DATA,     148, 20080901 17:21:23,       11, TIME, DB CPU                                  ,       8751163,     795560,      8.75s,   795.56ms

DATA,     148, 20080901 17:21:23,       11, TIME, sql execute elapsed time                ,       8749139,     795376,      8.75s,   795.38ms

DATA,     148, 20080901 17:21:23,       11, TIME, hard parse (sharing criteria) elapsed ti,          2367,        215,     2.37ms,   215.18us

DATA,     148, 20080901 17:21:23,       11, TIME, DB time                                 ,       8751163,     795560,      8.75s,   795.56ms

DATA,     148, 20080901 17:21:23,       11, WAIT, SQL*Net message to client               ,             3,          0,        3us,      .27us

DATA,     148, 20080901 17:21:23,       11, WAIT, SQL*Net message from client             ,      -1442849,    -131168,     -1.44s,  -131.17ms

--  End of snap 1

PL/SQL procedure successfully completed.

You see the DB Time is still equal to DB CPU, thus all response was spent on CPU. Moreover, the time model stats tell me that all the elapsed time was spent executing SQL statements (which makes sense as I’m running INSERT SQL in a tight loop).

So, for some reason the INSERT SQL statements take almost 3 times longer when inserting to SYSTEM tablespace.

In cases where there is no waiting going on (things like Oracle Wait Interface / WaitProf would not help here), but only difference in CPU time, I normally look into V$SESSTAT counter differences first. But as seen from above, there is no major difference in counters such logical IOs (db block gets, consistent gets) to justify 3x increase in CPU usage (also there was no major difference in other counters which I have excluded for brevity).

So, the obvious next step is to sample the process stack during execution of both statements and see where the difference is. In my previous posts I’ve used pstack for getting a few stack samples out of a process and a pstack loop could be used here as well.

But for demo purposes (and convenience) I will use my DTrace stack sampling script today.

It’s dead easy to sample user process and kernel stacks using DTrace in Solaris 10 (and MacOS X and FreeBSD too). However the DTrace stack aggregator also takes account of the current program counter offset inside functions when distinguishing between stacks (this means that CPU program counter location main() + 0x10 will be different from main() + 0x12). But I don’t want such granularity, I want to see where the execution was at function level, not instruction level.

So I wrote my script to strip off the PC function offsets from DTrace output and order the stacks the way that most frequently sampled stack “branch” is displayed last, in bottom of the output. So, whatever stack and function you see outputted last, is probably the “troublemaker”. I still recommend scrolling up and reviewing few other stacks too.

So, I ran the insert to USERS tablespace table again and ran my dstackprof script on the server process with SPID 859.

This is from the “well behaving” process, so normally you should just compare this stack profile to the “bad behaving” process stack profile and see if you can spot any major differences in these. I will explain few of the functions on the go.

# ./dstackprof.sh 859



DStackProf v1.02 by Tanel Poder ( http://www.tanelpoder.com )

Sampling pid 859 for 5 seconds with stack depth of 100 frames...

147 samples with stack below

__________________

kcbchg1_main

kcbchg1                 <- buffer change function in buffer cache layer

ktuchg                    <- undo layer generate change

ktbchg2                  <- block (header) change

kdtchg                    <- data layer generate change

kdtwrp

kdtInsRow              <- data layer row insert function

insrowFastPath

insdrvFastPath

inscovexe

insExecStmtExecIniEngine

insexe                   <- OPI insert execution entry point

opiexe                   <- OPI execute (executing my recursive INSERT INTO statement)

opipls

opiodr

rpidrus                  <- all the RPI functions belong to recursive program interface (which allows making recursive calls in db)

skgmstack

rpidru                    <- these RPI functions (and few OPI ones above) are what PL/SQL context switches physically are

rpiswu2

rpidrv

psddr0                  <- most of the functions starting with "p" are PL/SQL execution ones

psdnal

pevm_EXECC

pfrinstr_EXECC

pfrrun_no_tool

pfrrun

plsql_run

peicnt                    <- PL/SQL execution engine entry point

kkxexe                  <- cursor to PL/SQL execution interface

opiexe                   <- OPI call execution dispatcher

kpoal8

opiodr

ttcpip

opitsk

opiino

opiodr

opidrv

sou2o

opimai_real

main

0xe54ffc

193 samples with stack below

__________________

libc.so.1`times         <-- during these 193 samples the execution was in opiexe (note that

opiexe                   -- it's a recursive OPI call execution (as there's the RPI layer below in the

opipls                    -- stack and there's another opiexe below. The lower opiexe is for executing

opiodr                   -- my PL/SQL anonymous block and upper is for the recursive INSERT statement

rpidrus

skgmstack

rpidru

rpiswu2

rpidrv

psddr0

psdnal

pevm_EXECC

pfrinstr_EXECC

pfrrun_no_tool

pfrrun

plsql_run

peicnt

kkxexe

opiexe

kpoal8

opiodr

ttcpip

opitsk

opiino

opiodr

opidrv

sou2o

opimai_real

main

0xe54ffc

3147 Total samples captured

Anyway, the two stacks which I have copied here don’t show anything obvious. And also, see that there were total 3147 samples taken when the process was on CPU. But the execution point was in the “top” stack during only 193 samples, which is around 6% of all samples.

As the above stacks come from a well-behaving process, let’s leave their deeper analysis and just compare them to the bad-behaving process. So, I ran the SYSTEM tablespace insert again (you still remember, we were troubleshooting why inserts into system tablespace were slower, right ;)

The top stack in profile was following:

# ./dstackprof.sh 859

DStackProf v1.02 by Tanel Poder ( http://www.tanelpoder.com )

Sampling pid 859 for 5 seconds with stack depth of 100 frames...

[...some output snipped...]



917 samples with stack below

__________________

libc.so.1`qsort

kd4_entries_overlap

kdb4chk

kd4chk

kcbchk

kco_blkchk    <-- what the heck is this? the name suggests that it's block checking of some kind

kcoapl

kcbapl

kcrfw_redo_gen             <-- also note this function, looks like the block change function (kcbchg) has called

kcbchg1_main                   <-- a function to generate redo on its behalf

kcbchg1

ktuchg

ktbchg2

kdtchg

kdtwrp

kdtInsRow

insrowFastPath

insdrvFastPath

inscovexe

insExecStmtExecIniEngine

insexe

opiexe

opipls

opiodr

rpidrus

skgmstack

rpidru

rpiswu2

rpidrv

psddr0

psdnal

pevm_EXECC

pfrinstr_EXECC

pfrrun_no_tool

pfrrun

plsql_run

peicnt

kkxexe

opiexe

kpoal8

opiodr

ttcpip

opitsk

opiino

opiodr

opidrv

sou2o

opimai_real

main

0xe54ffc

6202 Total samples captured

As most of the top stacks from the bad behaving process were spending their time somewhere in kco_blkchk() function, it rang a bell – database block checking!

However it was weird as the db_block_checking parameter was set false in my instance:

SQL> show parameter db_block_checking

NAME                                 TYPE        VALUE

------------------------------------ ----------- -----

db_block_checking                    string      FALSE

SQL>

After a little digging, I found this parameter:

NAME                          VALUE     DESCRIPTION

----------------------------- --------- -------------------------------------------------------------

_db_always_check_system_ts    TRUE      Always perform block check and checksum for System tablespace

This parameter makes Oracle (since version 8.1.6) do block structure checking in SYSTEM tablespace even if db_block_checking itself is set to FALSE. Considering the criticality of data dictionary objects, it’s a good idea.

So, now you might say: so what that there’s a performance issue, you shouldn’t keep your application tables in SYSTEM tablespace anyway!

Well, I have three words for you: AUD$, FGA_LOG$, DEF$_AQCALL :)

If you rely on auditing functionality heavily and especially if you use advanced replication, having the crucial tables (and associated LOB segments) in SYSTEM tablespace can mean quite a hit to your CPUs and system throughput. In which cases you probably would want to explore the option of moving these tables out of system tablespace.

Warning: before moving anything, consult Metalink or Oracle Support as only few strictly defined ways for moving these object are supported. See notes 731908.1 for Audit and 1037317.6 for Advanced Replication.

So, the key point of this article is that I could have started guessing or setting various events or undocumented parameters for probing Oracle here and there, for figuring out what’s going on. But I used a simple systematic approach instead. When Oracle Wait Interface and V$SESSTAT counters didn’t help, looked into process stack next. And this gave me the vital clue (kdo_blkchk) that there’s block checking going on. And if I hadn’t figured out the cause myself, I would have had some solid evidence for support.

Note that you don’t have to use DTrace for this kind of stack profiling (even though it is very convenient). On non-Solaris 10 OS’es I would use whatever safe stack dump mechanism is available (pstack for example) and post-process the results. I probably wouldn’t take thousands of samples though, but probably 100 or even less.

Also, if you want to get a complete overview of the process execution profile, you would need to do much more complex aggregations, as you’d need to aggregate “child” branches of stack samples with their “parents” (currently the samples of a backtrace “a->b->c” are not accounted in “a->b” for example). But for my case (just checking if there’s any major difference between execution profiles) this approach was good enough.

By the way (if you haven’t given up yet :), check the differences in LatchProfX output. The top is from well-behaving process and bottom one is the bad one:

SQL> @latchprofx sid,name,ksllwnam,ksllwlbl 128 % 1000000

-- LatchProfX 1.06 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                                KSLLWNAM                                 KSLLWLBL                   Held       Gets  Held %     Held ms Avg hold ms

---------- ----------------------------------- ---------------------------------------- -------------------- ---------- ---------- ------- ----------- -----------

       128 redo copy                           kcrfw_redo_gen: nowait                   latch                     49399      48937    4.94     674.790        .014

       128 cache buffers chains                kcbchg1: kslbegin: bufs not pinned                                  4574       2320     .46      62.481        .027

       128 cache buffers chains                kcbchg1: kslbegin: call CR func                                     2101       1445     .21      28.700        .020

       128 cache buffers chains                kcbgcur_2                                                           1323       1060     .13      18.072        .017

       128 cache buffers chains                kcbnew: new latch again                                              269        243     .03       3.675        .015

       128 simulator lru latch                 kcbs_simulate: simulate set              setid                       170        170     .02       2.322        .014

       128 redo allocation                     kcrfw_redo_gen: redo allocation 1        strand #                     68         13     .01        .929        .071

       128 cache buffers chains                kcbrls_2                                                              29         29     .00        .396        .014

       128 cache buffers chains                kcbgtcr: fast path                                                    25         25     .00        .342        .014

       128 cache buffers chains                kcbgcur: fast path (shr)                                               8          8     .00        .109        .014

       128 cache buffers lru chain             kcbzgws_1                                                              7          7     .00        .096        .014

SQL> @latchprofx sid,name,ksllwnam,ksllwlbl 128 % 1000000

-- LatchProfX 1.06 by Tanel Poder ( http://www.tanelpoder.com )

       SID NAME                                KSLLWNAM                                 KSLLWLBL                   Held       Gets  Held %     Held ms Avg hold ms

---------- ----------------------------------- ---------------------------------------- -------------------- ---------- ---------- ------- ----------- -----------

       128 redo copy                           kcrfw_redo_gen: nowait                   latch                    356465      79839   35.65    6088.422        .076

       128 cache buffers chains                kcbchg1: kslbegin: bufs not pinned                                  4704       2233     .47      80.344        .036

       128 cache buffers chains                kcbchg1: kslbegin: call CR func                                     3542       1997     .35      60.497        .030

       128 cache buffers chains                kcbgcur_2                                                           1929       1448     .19      32.947        .023

       128 redo allocation                     kcrfw_redo_gen: redo allocation 1        strand #                   1018         34     .10      17.387        .511

       128 shared pool simulator               kglsim_unpin_simhp: fast path                                        713        713     .07      12.178        .017

       128 simulator lru latch                 kcbs_simulate: simulate set              setid                       266        266     .03       4.543        .017

       128 cache buffers chains                kcbnew: new latch again                                              182        171     .02       3.109        .018

       128 cache buffers chains                kcbgtcr: fast path (cr pin)                                           29         29     .00        .495        .017

       128 cache buffers chains                kcbgtcr: fast path

Oracle hidden costs revealed, Part2 – Using DTrace to find why writes in SYSTEM tablespace are slower than in others的更多相关文章

Linux平台 Oracle 11gR2 RAC安装Part2：GI安装
三.GI(Grid Infrastructure)安装 3.1 解压GI的安装包 3.2 安装配置Xmanager软件 3.3 共享存储LUN的赋权 3.4 使用Xmanager图形化界面安装GI 3 ...
Linux平台 Oracle 12cR2 RAC安装Part2：GI配置
Linux平台 Oracle 12cR2 RAC安装Part2:GI配置三.GI(Grid Infrastructure)安装 3.1 解压GI的安装包 3.2 安装配置Xmanager软件 3.3 ...
极端恢复时用到的几个隐藏参数 Oracle hidden corruption recovery parameters
Oracle hidden corruption recovery parameters Note: Oracle hidden parameters are undocumented and uns ...
Linux平台 Oracle 18c RAC安装Part2：GI配置
三.GI(Grid Infrastructure)安装 3.1 解压GI的安装包 3.2 安装配置Xmanager软件 3.3 共享存储LUN的赋权 3.4 使用Xmanager图形化界面配置GI 3 ...
Linux平台 Oracle 19c RAC安装Part2：GI配置
三.GI(Grid Infrastructure)安装 3.1 解压GI的安装包 3.2 安装配置Xmanager软件 3.3 共享存储LUN的赋权 3.4 使用Xmanager图形化界面配置GI 3 ...
oracle linux 7.3 下用dtrace 跟踪 mysql-community-server-5.6.20
第一部分:oracle linux 7.3 dtrace 安装: http://public-yum.oracle.com/ 内核升极到 [root@server1 SPECS]# uname - ...
[转]Oracle Hidden Parameter:_allow_resetlogs_corruption
本文转自eygle博客,原文地址:http://www.eygle.com/archives/2005/10/oracle_hidden_allow_resetlogs_corruption.html ...
Oracle Linux 7.3下载与 dtrace安装
https://docs.oracle.com/cd/E52668_01/E86280/html/section_edm_dvp_hz.html http://www.oracle.com/techn ...
OCP读书笔记(22) - 题库(ExamB)
101.Identify two situations in which you can use Data Recovery Advisor for recovery. (Choose two.) A ...

随机推荐

Linux学习-什么是 daemon 与服务 (service)
『常驻在记体体中的程序,且可以提供一些系统或网络功能,那就是服务』.而服务一般的英文说法是『 service 』. 那么 daemon 与 service 有关啰?否则为什么都能够提供某些系统或网 ...
jenkins执行构建并查看结果
继完成构建项目配置http://www.cnblogs.com/yajing-zh/p/5111060.html后,则要执行构建. 回到jenkins主页之后,我们看到一个新建的项目显示出来: 点击进 ...
WPF触控程序开发(三)——类似IPhone相册的反弹效果
用过IPhone的都知道,IPhone相册里,当图片放大到一定程度后,手指一放,会自动缩回,移动图片超出边框后手指一放,图片也会自动缩回,整个过程非常和谐.自然.精确,那么WPF能否做到呢,答案是肯定 ...
CSS预处理器(less 和 sass)
CSS预处理器 1.基于CSS的另一种语言 2.通过工具编译成CSS 3.添加了很多CSS不具备的特性 4.能提升CSS文件的组织提供功能:1.嵌套反映层级和约束 2.变量和计算,减少重复戴拿 3 ...
python学习-- 在for循环中还有很多有用的东西，如下：
变量描述 forloop.counter 索引从 1 开始算 forloop.counter0 索引从 0 开始算 forloop.revcounter 索引从最大长度到 1 forloop.rev ...
[UiAutomator篇][3] 打开音乐应用的测试脚本
package qq.test; import android.content.Context; import android.content.Intent; import android.suppo ...
Oracle的表空间、数据文件、用户
每一个Oracle数据库都是由三种类型的文件组成:数据文件(Data File).日志文件(Log File)和控制文件(Control File).数据库的文件为数据库信息提供真正的物理 ...
NuGet之控制台管理程序包
NuGet作为VS的扩展程序,已经做好了UI,我们可以通过Manage NuGet Packages 的对话框.这里我们主要说说如何通过控制台进行包管理.使用命令行的方式,其实也是有其好处,对 ...
一步一步，完成sparkMLlib对日志文件的处理（1）
https://blog.csdn.net/u012834750/article/details/81014997 初学第一天,当然是完成helloWorld啦,有点艰难,2个小时,在idea, ...
linux磁盘问题排查
一.ls $>>ls /data* //查看有无input/output error报错二.demsg $>>demsg|grep sd 问题盘结果: 三.iostat使用 ...

Oracle hidden costs revealed, Part2 – Using DTrace to find why writes in SYSTEM tablespace are slower than in others

http://blog.tanelpoder.com/2008/09/02/oracle-hidden-costs-revealed-part2-using-dtrace-to-find-why-writes-in-system-tablespace-are-slower-than-in-others/

Oracle hidden costs revealed, Part2 – Using DTrace to find why writes in SYSTEM tablespace are slower than in others的更多相关文章

随机推荐

热门专题