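Diagnosis: recovering a database that would not open after a storage crash
The storage under the database failed and the instance crashed. The first symptom was a control file problem: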
ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/oracledata/oradata/orc11rac/orc11rac/system01.dbf'
ORA-01207: file is more recent than control file - old control file
/home/oracle>oerr ora 1207
01207, 00000, "file is more recent than control file - old control file"
// *Cause: The control file change sequence number in the data file is
// greater than the number in the control file. This implies that
// the wrong control file is being used. Note that repeatedly causing
// this error can make it stop happening without correcting the real
// problem. Every attempt to open the database will advance the
// control file change sequence number until it is great enough.
// *Action: Use the current control file or do backup control file recovery to
// make the control file current. Be sure to follow all restrictions
// on doing a backup control file recovery.
/home/oracle>oerr ora 1110
01110, 00000, "data file %s: '%s'"
// *Cause: Reporting file name for details of another error
// *Action: See associated error message
/home/oracle>oerr ora 1122
01122, 00000, "database file %s failed verification check"
// *Cause: The information in this file is inconsistent with information
// from the control file. See accompanying message for reason.
// *Action: Make certain that the db files and control files are the correct
// files for this database.
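A quick way to see this kind of mismatch from a mounted instance is to compare what the control file records for each data file with what the file headers themselves say. The sketch below assumes the instance can still be mounted with the old control file. It compares checkpoint SCNs, which is a related check but not the same thing as the control file change sequence number that ORA-01207 tests.

```sql
-- Run in MOUNT state. Both are standard dynamic performance views:
-- v$datafile reads from the control file, v$datafile_header reads each file's header.
SELECT d.file#,
       d.checkpoint_change# AS controlfile_scn,
       h.checkpoint_change# AS header_scn,
       h.fuzzy
FROM   v$datafile d
JOIN   v$datafile_header h ON h.file# = d.file#
ORDER  BY d.file#;
```

Rows where header_scn is greater than controlfile_scn point at files that are newer than the control file.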
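This problem can be fixed by recreating the control file and then running recover database.
Note that in a RAC environment cluster_database must be turned off first,
so that the work is done on a single instance (a single redo thread).
Otherwise you may run into the following errors:
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode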
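The sequence can be sketched roughly as follows. This is a minimal outline, not the author's actual script: the database name (guessed from the data file path), the sizes, the character set, and every LOGFILE and DATAFILE entry other than system01.dbf are hypothetical placeholders. In practice they come from an `alter database backup controlfile to trace` output or from the real file layout. A real RAC database also has redo groups for the other threads, which are omitted here.

```sql
-- In the pfile, first disable cluster mode (restore it after recovery):
--   *.cluster_database=false

STARTUP NOMOUNT

-- Hypothetical statement: every name below except system01.dbf is a
-- placeholder and must be replaced with the database's real files.
CREATE CONTROLFILE REUSE DATABASE "ORC11RAC" NORESETLOGS ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
    GROUP 1 '/hypothetical/path/redo01.log' SIZE 50M,
    GROUP 2 '/hypothetical/path/redo02.log' SIZE 50M
DATAFILE
    '/oracledata/oradata/orc11rac/orc11rac/system01.dbf',
    '/hypothetical/path/sysaux01.dbf',
    '/hypothetical/path/undotbs01.dbf',
    '/hypothetical/path/users01.dbf'
CHARACTER SET AL32UTF8;

-- With an intact set of online redo logs this is normally enough:
RECOVER DATABASE
ALTER DATABASE OPEN;
```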
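I thought that would be the end of it, but during recover the data files, online redo logs and archived logs all turned out to be corrupted, and the database still could not be opened after conventional recovery.
As a last resort I tried the _allow_resetlogs_corruption parameter.
Add the parameter to the pfile: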
*._allow_resetlogs_corruption=true
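A minimal sketch of how the parameter is typically used, assuming the instance is started from the edited pfile. The pfile path is hypothetical, and whether `USING BACKUP CONTROLFILE` is needed depends on how the control file was recreated.

```sql
-- Hypothetical pfile path; the pfile contains *._allow_resetlogs_corruption=true
STARTUP MOUNT PFILE='/hypothetical/path/initorc11rac.ora'

-- Incomplete recovery; answer CANCEL at the prompt.
-- (Add USING BACKUP CONTROLFILE if the control file was created with RESETLOGS.)
RECOVER DATABASE UNTIL CANCEL

-- With the parameter set, the open skips the usual consistency checks.
ALTER DATABASE OPEN RESETLOGS;
```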
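When the database is opened with resetlogs under this parameter, it may fail with ORA-00600 [2662] because the SCNs are inconsistent.
ORA-00600: internal error code, arguments: [2662], [], [], [], [], [], [], []
Subtracting the two SCN values reported in the arguments gave a gap of 19980.
The ORA-600 arguments change on every restart attempt.
ORA-00600: internal error code, arguments: [2662], [], [], [], [], [], [], []
This time the gap was 19972.
The gap shrank from 19980 to 19972, that is, by 8 per attempt. When the gap is small, simply restarting a few more times can clear the error.
Here, however, closing a gap of 19972 at 8 per restart would take about 2497 restarts, which is not acceptable in the short term.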
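The restart estimate is simple arithmetic, shown here as a query. It assumes the gap keeps shrinking by the same 8 per restart that was observed between the two attempts.

```sql
-- Gap remaining after the second attempt: 19972
-- Reduction per restart: 19980 - 19972 = 8
SELECT CEIL(19972 / (19980 - 19972)) AS restarts_needed FROM dual;
-- restarts_needed = 2497
```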
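At this point the SCN can be adjusted with Oracle internal events.
There are two common ways to advance the SCN:
1. The immediate trace name form (with the database open)
alter session set events 'IMMEDIATE trace name ADJUST_SCN level x';
2. Event 10015 (when the database cannot be opened, in mount state)
alter session set events '10015 trace name adjust_scn level x';
Note: level 1 advances the SCN by 1024*1024*1024 (about 1.07 billion). Level 1 is usually enough, but the level can be adjusted to fit the actual gap.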
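To size the level, the per-level increment can be worked out directly. The query below is plain arithmetic based on the 1024*1024*1024 factor stated in the note. The statements after it are a sketch of method 2 in context, using level 10 as an example value.

```sql
-- SCN increment for a given level, here level 10
SELECT 10 * POWER(2, 30) AS scn_increment FROM dual;
-- scn_increment = 10737418240

-- Method 2 in context: database mounted but not open
STARTUP MOUNT
ALTER SESSION SET EVENTS '10015 trace name adjust_scn level 10';
ALTER DATABASE OPEN;
```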
SQL> alter session set events 'IMMEDIATE trace name ADJUST_SCN level 10';
Session altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/oracledata/oradata/orc11rac/orc11rac/system01.dbf'
SQL> recover database
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
Process ID: 27474
Session ID: 1105 Serial number: 5
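The database still would not open, and the alert log reported:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
The ORA-600 error had changed, which shows that the SCN adjustment took effect, but it exposed a new error.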
DESCRIPTION:
  A mismatch has been detected between Redo records and Rollback (Undo)
  records. We are validating the Undo block sequence number in the undo block against
  the Redo block sequence number relating to the change being applied.
  This error is reported when this validation fails.
ARGUMENTS:
  Arg [a] Undo record seq number
  Arg [b] Redo record seq number
FUNCTIONALITY:
  KERNEL TRANSACTION UNDO
ORA-600 [] [a] [b] [ ] [ ] [ ]
Versions: 7.2. - 9.2.    Source: ktuc.c
===========================================================================
Meaning: seq# mismatch while adding an undo record to an undo block. This
is done by the application of redo.
---------------------------------------------------------------------------
Argument Description:
a. (ktubhseq): undo record seq# - this is the seq# of the block that
   this undo record WILL BE APPLIED TO.
   This is from the Undo Block. It is
   NOT the seq# of the undo block itself.
b. (ktudbseq): redo RECORD seq# - this is the seq# number in the block
   that this redo WILL BE APPLIED TO.
   This is from the Redo Record.
---------------------------------------------------------------------------
Diagnosis:
This error is raised in kturdb which handles the adding of undo records
by the application of redo.
When we try to apply redo to an undo block (forward changes are made by
the application of redo to a block) we check that the seq# in the undo
record matches the seq# in the redo record. These seq# should be the
same because when we apply a redo record we must apply it to the
correct version of the block. We can only apply a redo record to a
block that contains the same seq# as in the redo record.
If the seq# do not match then this error is raised. This implies some
kind of block corruption in either the redo or the undo block.
7.3.x - 8.1..x
ASSERT2(ubh->ktubhseq == db->ktudbseq, OERI(), KSESVSGN,
        ubh->ktubhseq, db->ktudbseq);
9.2.x
ksesic2(OERI(), ksenrg(ubh->ktubhseq), ksenrg(db->ktudbseq));
struct ktubh
{
    kxid ktubhxid;   /* txid of tx currently using or last used this block */
    ub2 ktubhseq;    /* undo block sequence number */
    ub1 ktubhcnt;    /* high water mark record index, number of undo entries */
    ub1 ktubhirb;    /* rollback record index, rec index to start the rollback */
    ub1 ktubhicl;    /* collecting record index, rec index to start retrieving col info */
    ub1 ktubhflg;    /* dummy */
    ub2 ktubhidx[];  /* byte offset of record in block, grows at runtime */
};
struct ktudb         /* Kernel Transaction Undo Data operation Block (redo) */
{
    ub2 ktudbsiz;    /* size of entry */
    ub2 ktudbspc;    /* verification: space left in undo block */
    ub2 ktudbflg;    /* flag to indicate the kind of redo operation */
    kxid ktudbxid;   /* current tx id */
    ub2 ktudbseq;    /* block sequence number */
    ub1 ktudbrec;    /* new record index for this change */
};
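There are two ways to handle this:
1. create a new UNDO tablespace;
2. switch undo management to manual.
This time the manual route was chosen, by changing the parameter file: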
*.undo_management=manual
SQL> startup mount
ORACLE instance started.

Total System Global Area 1.3429E+10 bytes
Fixed Size 2149040 bytes
Variable Size 6845105488 bytes
Database Buffers 6576668672 bytes
Redo Buffers 4730880 bytes
Database mounted.
SQL> alter database open;
Database altered.
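The database is now open, and the required data can be exported as a backup.
On some database versions a tempfile still has to be added to the TEMP tablespace first.
Keep watching the alert log for further errors. A new UNDO tablespace can also be created and undo management switched back to auto.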
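These follow-up steps can be sketched as below. The tablespace names, file paths and sizes are hypothetical, and the sketch assumes the instance is still started from a pfile, as in the steps above. On RAC, undo_tablespace is normally set per instance rather than with `*`.

```sql
-- Does TEMP still have a tempfile after the control file was recreated?
SELECT tablespace_name, file_name FROM dba_temp_files;

-- If not, add one (hypothetical path and size; TEMP is assumed to be
-- the name of the temporary tablespace).
ALTER TABLESPACE temp ADD TEMPFILE '/hypothetical/path/temp01.dbf' SIZE 1G REUSE;

-- Create a fresh undo tablespace (hypothetical name, path and size).
CREATE UNDO TABLESPACE undotbs_new
    DATAFILE '/hypothetical/path/undotbs_new01.dbf' SIZE 2G;

-- Then edit the pfile and restart the instance:
--   *.undo_management=auto
--   *.undo_tablespace='UNDOTBS_NEW'
```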