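Just a few days ago, yet another customer asked me about undo tablespace usage.

It reminded me of a case from a few years ago in one province. The customer's hands-on operations person was a young woman who had only recently graduated and knew almost nothing about Oracle internals. Her project manager gave her basic operations tasks, one of which was to monitor the usage of every tablespace in the database and extend any tablespace above 95%. Their Oracle version was 10gR2.

The customer's business was telecom carrier call detail records, so the data volume was large (tens of terabytes) and plenty of storage had been reserved.

Once, when the customer asked me to handle some other problem remotely, I was astonished to find that their undo tablespace was more than 2 TB in size. I asked the operations person what had happened, and you have probably guessed the answer: her routine checks often showed the undo tablespace above 95% usage, so she kept extending it until it had grown past 2 TB. She even believed the undo tablespace belonged to one of the business applications, which was rather awkward.

So what exactly is undo, and what is it actually used for? The Oracle 10g official documentation describes it as follows: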

What Is Undo?

Every Oracle Database must have a method of maintaining information that is used to roll back, or undo, changes to the database. Such information consists of records of the actions of transactions, primarily before they are committed. These records are collectively referred to as undo.

Undo records are used to:

Roll back transactions when a ROLLBACK statement is issued

Recover the database

Provide read consistency

Analyze data as of an earlier point in time by using Oracle Flashback Query

Recover from logical corruptions using Oracle Flashback features

When a ROLLBACK statement is issued, undo records are used to undo changes that were made to the database by the uncommitted transaction. During database recovery, undo records are used to undo any uncommitted changes applied from the redo log to the datafiles. Undo records provide read consistency by maintaining the before image of the data for users who are accessing the data at the same time that another user is changing it.

Here are the default undo-related parameter settings in my 10.2.0.5 lab environment:

    SQL> show parameter undo

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    undo_management                      string      AUTO
    undo_retention                       integer     900
    undo_tablespace                      string      UNDOTBS1

As you can see, undo_management is set to AUTO here. The official documentation describes this value as follows:

Automatic undo management uses an undo tablespace. To enable automatic undo management, set the UNDO_MANAGEMENT initialization parameter to AUTO in your initialization parameter file. In this mode, undo data is stored in an undo tablespace and is managed by Oracle Database.

As for undo_retention, the default is 900. The unit is seconds, so that is 15 minutes. Many real environments set it higher, for example 10800, which is 3 hours.
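As a minimal sketch, the retention target could be checked and raised as shown below. The value 10800 is the example from the text. SCOPE = BOTH assumes the instance was started with an spfile, and whether 3 hours is appropriate depends on your longest-running queries and the undo space available.

```sql
-- Check the current setting (the value is in seconds).
SELECT name, value
  FROM v$parameter
 WHERE name = 'undo_retention';

-- Raise the minimum retention target to 3 hours.
-- SCOPE = BOTH assumes an spfile is in use; use SCOPE = MEMORY otherwise.
ALTER SYSTEM SET undo_retention = 10800 SCOPE = BOTH;
```

Keep in mind that without retention guarantee this is only a target that Oracle tries to honor, as the documentation quoted in this article explains.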

Here is what the official documentation says about undo retention and the related retention guarantee:

Undo Retention

After a transaction is committed, undo data is no longer needed for rollback or transaction recovery purposes. However, for consistent read purposes, long-running queries may require this old undo information for producing older images of data blocks. Furthermore, the success of several Oracle Flashback features can also depend upon the availability of older undo information. For these reasons, it is desirable to retain the old undo information for as long as possible.

When automatic undo management is enabled, there is always a current undo retention period, which is the minimum amount of time that Oracle Database attempts to retain old undo information before overwriting it. Old (committed) undo information that is older than the current undo retention period is said to be expired. Old undo information with an age that is less than the current undo retention period is said to be unexpired.

Oracle Database automatically tunes the undo retention period based on undo tablespace size and system activity. You can specify a minimum undo retention period (in seconds) by setting the UNDO_RETENTION initialization parameter. The database makes its best effort to honor the specified minimum undo retention period, provided that the undo tablespace has space available for new transactions. When available space for new transactions becomes short, the database begins to overwrite expired undo. If the undo tablespace has no space for new transactions after all expired undo is overwritten, the database may begin overwriting unexpired undo information. If any of this overwritten undo information is required for consistent read in a current long-running query, the query could fail with the snapshot too old error message.

The following points explain the exact impact of the UNDO_RETENTION parameter on undo retention:

The UNDO_RETENTION parameter is ignored for a fixed size undo tablespace. The database may overwrite unexpired undo information when tablespace space becomes low.

For an undo tablespace with the AUTOEXTEND option enabled, the database attempts to honor the minimum retention period specified by UNDO_RETENTION. When space is low, instead of overwriting unexpired undo information, the tablespace auto-extends. If the MAXSIZE clause is specified for an auto-extending undo tablespace, when the maximum size is reached, the database may begin to overwrite unexpired undo information.

Retention Guarantee

To guarantee the success of long-running queries or Oracle Flashback operations, you can enable retention guarantee. If retention guarantee is enabled, the specified minimum undo retention is guaranteed; the database never overwrites unexpired undo data even if it means that transactions fail due to lack of space in the undo tablespace. If retention guarantee is not enabled, the database can overwrite unexpired undo when space is low, thus lowering the undo retention for the system. This option is disabled by default.

WARNING:

Enabling retention guarantee can cause multiple DML operations to fail. Use with caution.

You enable retention guarantee by specifying the RETENTION GUARANTEE clause for the undo tablespace when you create it with either the CREATE DATABASE or CREATE UNDO TABLESPACE statement. Or, you can later specify this clause in an ALTER TABLESPACE statement. You disable retention guarantee with the RETENTION NOGUARANTEE clause.

You can use the DBA_TABLESPACES view to determine the retention guarantee setting for the undo tablespace. A column named RETENTION contains a value of GUARANTEE, NOGUARANTEE, or NOT APPLY (used for tablespaces other than the undo tablespace).
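The statements described above might look like the following sketch. UNDOTBS1 is simply the undo tablespace name from my lab output earlier in this article; substitute your own. Given the warning about failed DML, enabling the guarantee is something to test before using it in production.

```sql
-- Show the retention setting of each undo tablespace.
SELECT tablespace_name, contents, retention
  FROM dba_tablespaces
 WHERE contents = 'UNDO';

-- Enable the guarantee on an existing undo tablespace.
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;

-- Return to the default behavior.
ALTER TABLESPACE undotbs1 RETENTION NOGUARANTEE;
```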

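Having read this far, we can see why, in the case at the start of this article, the undo tablespace kept being extended even though undo space is reused cyclically. After a transaction commits, its undo extents are not freed; they stay in the tablespace as UNEXPIRED and later EXPIRED extents until Oracle needs to overwrite them. A usage check that only looks at free space therefore reports a nearly full tablespace, and newly added space tends to fill up with retained undo in turn.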

I once saw Maclean answer a related question in a chat group with the following statement for querying the real undo usage:

    prompt
    prompt ############## IN USE Undo Data ##############
    prompt
    -- '%GUARANTEE' matches both GUARANTEE and NOGUARANTEE,
    -- so the subqueries select every undo tablespace.
    select ((select nvl(sum(bytes), 0)
               from dba_undo_extents
              where tablespace_name in (select tablespace_name
                                          from dba_tablespaces
                                         where retention like '%GUARANTEE')
                and status in ('ACTIVE', 'UNEXPIRED')) * 100) /
           (select sum(bytes)
              from dba_data_files
             where tablespace_name in (select tablespace_name
                                         from dba_tablespaces
                                        where retention like '%GUARANTEE')) "PCT_INUSE"
      from dual;

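As you can see, this statement simply counts extents in ACTIVE and UNEXPIRED status as used. If retention guarantee is not set, a high value here is not necessarily a problem, because Oracle will also reuse UNEXPIRED extents when space runs short.

Also note that on RAC each instance has its own undo tablespace, and the query above merges all of them into a single overall percentage, whereas we really want a figure per instance. So we can name the undo tablespace we want to check directly: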

    select ((select nvl(sum(bytes), 0)
               from dba_undo_extents
              where tablespace_name = '&TABLESPACE_NAME'
                and status in ('ACTIVE', 'UNEXPIRED')) * 100) /
           (select sum(bytes)
              from dba_data_files
             where tablespace_name = '&TABLESPACE_NAME') "PCT_INUSE"
      from dual;

You can also monitor undo tablespace usage through dba_undo_extents, grouped by status:

    select tablespace_name, status, sum(bytes/1024/1024) "MB"
      from dba_undo_extents
     group by tablespace_name, status
     order by 1, 2;

Based on the above, we only need to watch how much space the ACTIVE extents occupy. If retention guarantee is set, we must also watch how much the UNEXPIRED extents occupy.
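As a complementary check that is not part of the original material, V$UNDOSTAT records whether undo has actually come under pressure. The sketch below lists the 10-minute intervals in which a query hit "snapshot too old" or a transaction could not get undo space. If it returns no rows, a high usage percentage alone is probably not a reason to extend the tablespace.

```sql
-- One row per 10-minute interval that saw an undo-related error, newest first.
SELECT begin_time,
       end_time,
       undoblks,             -- undo blocks consumed in the interval
       maxquerylen,          -- longest query in the interval, in seconds
       tuned_undoretention,  -- retention in seconds that Oracle tuned itself to (10gR2 and later)
       ssolderrcnt,          -- ORA-01555 "snapshot too old" errors
       nospaceerrcnt         -- failed requests for undo space
  FROM v$undostat
 WHERE ssolderrcnt > 0
    OR nospaceerrcnt > 0
 ORDER BY begin_time DESC;
```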

In addition, I found two practical undo tablespace monitoring queries on Maclean's blog:

    -- In Oracle 10g, the V$UNDOSTAT view can be used to monitor how the
    -- instance's transactions use the undo tablespace. Each row holds the
    -- statistics collected from the instance over one 10-minute interval.
    -- The rows are snapshots of undo usage, transaction volume, and query
    -- length over the history the view retains (576 rows, about four days, in 10g).
    -- Undo usage varies with transaction volume, so sizing normally considers
    -- both the average and the peak usage.

    -- Undo space required at the average undo generation rate over that history:
    select ur undo_retention,
           dbs db_block_size,
           ((ur * (ups * dbs)) + (dbs * 24)) / 1024 / 1024 as "M_bytes"
      from (select value as ur from v$parameter where name = 'undo_retention'),
           (select (sum(undoblks) / sum(((end_time - begin_time) * 86400))) ups
              from v$undostat),
           (select value as dbs from v$parameter where name = 'db_block_size');

    -- Undo space required at the peak undo generation rate:
    select ur undo_retention,
           dbs db_block_size,
           ((ur * (ups * dbs)) + (dbs * 24)) / 1024 / 1024 as "M_bytes"
      from (select value as ur from v$parameter where name = 'undo_retention'),
           (select (undoblks / ((end_time - begin_time) * 86400)) ups
              from v$undostat
             where undoblks in (select max(undoblks) from v$undostat)),
           (select value as dbs from v$parameter where name = 'db_block_size');
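To make the sizing formula in these queries concrete, here is a worked example with made-up inputs. The rate of 100 undo blocks per second and the 8192-byte block size are assumptions chosen for illustration, not measurements from any system mentioned in this article.

```sql
-- Assumed inputs: undo_retention = 900 s, 100 undo blocks/s, 8192-byte blocks.
-- Required space = retention * (blocks per second * block size) + 24 blocks of overhead.
SELECT (900 * (100 * 8192) + (8192 * 24)) / 1024 / 1024 AS m_bytes
  FROM dual;
-- 737,476,608 bytes, i.e. about 703.3 MB
```

With the same assumed rate and undo_retention raised to 10800, the formula gives about 8437.7 MB, roughly 8.2 GB, which puts a 2 TB undo tablespace in perspective.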

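Finally, this simple case shows that on many real projects there are indeed operations staff who start doing the work without the necessary background knowledge. The result is that a system which is not complicated ends up full of pitfalls.

So whatever you study or do, the fundamentals deserve deep study and thought. As the saying goes, without accumulating small steps one cannot travel a thousand miles.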
