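Title: Resolving the ORA-600 kcratr_nab_less_than_odr failure
Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; otherwise I reserve the right to pursue legal liability.]
A friend's database server hit ORA-00600 [kcratr_nab_less_than_odr], and the database could not be opened.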
1. Opening the database raises ORA-00600 [kcratr_nab_less_than_odr]
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
2. Check the alert log
Wed Jan 11 13:56:16 2012
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 2 processes
Started redo scan
Completed redo scan
 read 54591 KB redo, 0 data blocks need recovery
Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc (incident=818557):
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
Incident details in: d:\dbdms\diag\rdbms\dbdms\dbdms\incident\incdir_818557\dbdms_ora_3936_i818557.trc
Aborting crash recovery due to error 600
Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
Errors in file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
Trace dumping is performing id=[cdmp_20120110214555]
3. Check the trace file
Trace file d:\dbdms\diag\rdbms\dbdms\dbdms\trace\dbdms_ora_3108.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Windows NT Version V6.1 Service Pack 1
CPU : 2 - type 8664, 2 Physical Cores
Process Affinity : 0x0x0000000000000000
Memory (Avail/Total): Ph:2250M/4060M, Ph+PgF:5868M/8119M
Instance name: dbdms
Redo thread mounted by this instance: 1
Oracle process number: 17
Windows thread id: 3108, image: ORACLE.EXE (SHAD)
…………………………
WARNING! Crash recovery of thread 1 seq 99189 is ending at redo block 43531
 but should not have ended before redo block 43569
Incident 826550 created, dump file: d:\dbdms\diag\rdbms\dbdms\dbdms\incident\incdir_826550\dbdms_ora_3108_i826550.trc
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []
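The alert log and the trace file show that the database needs to be recovered up to redo block 43569, but for some reason instance recovery, working on thread 1 and log sequence 99189, could only reach redo block 43531. As a result the database cannot be opened normally.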
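The arguments of the error can be read directly against the trace warning. The following is a minimal Python sketch that extracts them and computes how far recovery fell short. The names `NabLessThanOdr` and `parse_nab_less_than_odr` are hypothetical helpers introduced here for illustration, and the meaning of each argument is taken from the wording of the warning in the trace file, not from any official Oracle description.

```python
import re
from typing import NamedTuple, Optional


class NabLessThanOdr(NamedTuple):
    """Arguments of ORA-00600 [kcratr_nab_less_than_odr], named after the trace warning."""
    thread: int           # redo thread being recovered
    sequence: int         # log sequence number
    recovered_block: int  # redo block at which crash recovery actually ended
    required_block: int   # redo block before which it should not have ended

    @property
    def missing_blocks(self) -> int:
        """Number of redo blocks between where recovery ended and where it had to reach."""
        return self.required_block - self.recovered_block


# Matches the error name followed by its first four numeric arguments.
_ARGS = re.compile(
    r"\[kcratr_nab_less_than_odr\]"
    r",\s*\[(\d+)\],\s*\[(\d+)\],\s*\[(\d+)\],\s*\[(\d+)\]"
)


def parse_nab_less_than_odr(message: str) -> Optional[NabLessThanOdr]:
    """Return the parsed arguments, or None if the message is a different error."""
    match = _ARGS.search(message)
    if match is None:
        return None
    return NabLessThanOdr(*(int(group) for group in match.groups()))


sample = ("ORA-00600: [kcratr_nab_less_than_odr], [1], [99189], "
          "[43531], [43569], [], [], [], [], [], [], []")
info = parse_nab_less_than_odr(sample)
print(info, info.missing_blocks)
```

For the error in this case the gap is 43569 - 43531 = 38 redo blocks of sequence 99189.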
This Problem is caused by Storage Problem of the Database Files. The Subsystem (eg. SAN) crashed while the Database was open. The Database then crashed since the Database Files were not accessible anymore. This caused a lost Write into the Online RedoLogs and so Instance Recovery is not possible and raising the ORA-600.
4. Solution
SQL> SELECT STATUS FROM V$INSTANCE;

STATUS
------------
MOUNTED

-- Try a plain RECOVER DATABASE
SQL> RECOVER DATABASE ;
ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required
-- Oracle reports that no recovery is required

-- Open the database again: still the kcratr_nab_less_than_odr error
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [99189], [43531], [43569], [], [], [], [], [], [], []

-- Try incomplete recovery
SQL> RECOVER DATABASE UNTIL CANCEL;
ORA-10879: error signaled in parallel recovery slave
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: 'D:\DBDMS\DATA\SYSTEM01.DBF'

-- Re-create the control file
SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE AS 'D:/1.TXT';
Database altered.

SQL> SHUTDOWN IMMEDIATE;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

SQL> STARTUP NOMOUNT;
ORACLE instance started.

Total System Global Area  417546240 bytes
Fixed Size                  2176328 bytes
Variable Size             268438200 bytes
Database Buffers          138412032 bytes
Redo Buffers                8519680 bytes

SQL> CREATE CONTROLFILE REUSE DATABASE "DBDMS" NORESETLOGS NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 18688
LOGFILE
    GROUP 1 'D:\DBDMS\LOG\REDO01.LOG' SIZE 50M BLOCKSIZE 512,
    GROUP 2 'D:\DBDMS\LOG\REDO02.LOG' SIZE 50M BLOCKSIZE 512,
    GROUP 3 'D:\DBDMS\LOG\REDO03.LOG' SIZE 50M BLOCKSIZE 512
DATAFILE
    'D:\DBDMS\DATA\SYSTEM01.DBF',
    'D:\DBDMS\DATA\SYSAUX01.DBF',
    'D:\DBDMS\DATA\RBSG01.DBF',
    'D:\DBDMS\DATA\DATA01.DBF',
    'D:\DBDMS\DATA\INDX01.DBF',
    'D:\DBDMS\DATA\DATA02.DBF',
    'D:\DBDMS\DATA\DATA03.DBF',
    'D:\DBDMS\DATA\DATA04.DBF',
    'D:\DBDMS\DATA\INDX02.DBF',
    'D:\DBDMS\DATA\SYSTEM02.DBF'
CHARACTER SET ZHS16GBK
;
Control file created.

-- Try recovery again
SQL> RECOVER DATABASE ;
Media recovery complete.

SQL> ALTER DATABASE OPEN;
Database altered.
-- Open succeeded
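In this recovery the key step was re-creating the control file, after which an ordinary RECOVER DATABASE succeeded directly. If the redo logs are damaged, you may instead need to perform an incomplete recovery and then open the database with RESETLOGS.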
VMCD,
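1. The essence of ORA-00600 [kcratr_nab_less_than_odr] is that recovery cannot reach the on-disk RBA position. A likely cause is a power failure that left the redo actually written to the logs short of the on-disk RBA recorded in the control file. The usual fix is to recover using a backup control file; since no backup control file was available here, the only option was to re-create the control file.
2. Judging by the messages returned by the complete and incomplete recovery attempts above, it is also reasonable to re-create the control file and then retry recovery.
What was your purpose in re-creating the control file?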
Redo Byte Address (RBA)
the log file sequence number (4 bytes)
the log file block number (4 bytes)
the byte offset into the block at which the redo record starts (2 bytes)
With respect to a dirty block in the buffer cache, the low RBA is the address of the redo for the first change that was applied to the block since it was last clean, and the high RBA is the address of the redo for the most recent change to have been applied to the block.
Dirty buffers are maintained on the buffer cache checkpoint queues in low RBA order. The checkpoint RBA is the point up to which DBWn has written buffers from the checkpoint queues if incremental checkpointing is enabled — otherwise it is the RBA of last full thread checkpoint. The checkpoint RBA is copied into the checkpoint progress record of the controlfile by the checkpoint heartbeat once every 3 seconds. Instance recovery, when needed, begins from the checkpoint RBA recorded in the controlfile. The target RBA is the point up to which DBWn should seek to advance the checkpoint RBA to satisfy instance recovery objectives.
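The ordering described above can be pictured with a toy model: a queue of dirty buffers kept in low RBA order, where the checkpoint RBA is the low RBA of the oldest buffer still waiting to be written. This Python sketch is only a conceptual illustration; `CheckpointQueue` and its methods are hypothetical names and do not reflect how Oracle implements its checkpoint queues.

```python
import heapq
from typing import List, NamedTuple, Optional, Tuple


class RBA(NamedTuple):
    seq: int
    block: int
    offset: int


class CheckpointQueue:
    """Toy checkpoint queue: dirty buffers ordered by their low RBA."""

    def __init__(self) -> None:
        self._heap: List[Tuple[RBA, str]] = []

    def mark_dirty(self, buffer_id: str, low_rba: RBA) -> None:
        """Queue a buffer when it is first changed after being clean."""
        heapq.heappush(self._heap, (low_rba, buffer_id))

    def write_oldest(self) -> Optional[str]:
        """Simulate DBWn writing the buffer with the lowest low RBA."""
        if not self._heap:
            return None
        _, buffer_id = heapq.heappop(self._heap)
        return buffer_id

    def checkpoint_rba(self) -> Optional[RBA]:
        """Low RBA of the oldest unwritten buffer; None when nothing is dirty."""
        return self._heap[0][0] if self._heap else None


queue = CheckpointQueue()
queue.mark_dirty("A", RBA(99189, 100, 16))
queue.mark_dirty("B", RBA(99189, 50, 16))
queue.mark_dirty("C", RBA(99188, 900, 16))
first_checkpoint = queue.checkpoint_rba()
written = queue.write_oldest()
second_checkpoint = queue.checkpoint_rba()
print(first_checkpoint, written, second_checkpoint)
```

Each write of the oldest buffer moves the checkpoint RBA forward, which is why instance recovery can start from that point: every change before it is already in the data files.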
The on-disk RBA is the point up to which LGWR has flushed the redo thread to the online log files. DBWn may not write a block for which the high RBA is beyond the on-disk RBA. Otherwise transaction recovery (rollback) would not be possible, because the redo needed to undo a change is always in the same redo record as the redo for the change itself.
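The write rule in this paragraph reduces to a single comparison. A minimal Python sketch follows; `dbwr_may_write` is a hypothetical name and the RBA values are made up for illustration.

```python
from typing import NamedTuple


class RBA(NamedTuple):
    seq: int
    block: int
    offset: int


def dbwr_may_write(high_rba: RBA, on_disk_rba: RBA) -> bool:
    """A dirty block may be written only if all of its redo is already on disk,
    that is, its high RBA is not beyond the on-disk RBA."""
    return high_rba <= on_disk_rba


on_disk = RBA(99189, 43569, 0)
allowed = dbwr_may_write(RBA(99189, 43531, 16), on_disk)
blocked = dbwr_may_write(RBA(99190, 1, 0), on_disk)
print(allowed, blocked)
```

Tuple comparison checks the sequence first, then the block number, then the byte offset, which matches the order of positions in the redo stream.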
The term sync RBA is sometimes used to refer to the point up to which LGWR is required to sync the thread. However, this is not a full RBA — only a redo block number is used at this point.
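low RBA: the RBA of the first change made to a data block in the buffer cache. Dirty blocks are ordered by low RBA on the checkpoint queue.
high RBA: the RBA of the most recent change made to a data block in the buffer cache.
checkpoint RBA: the RBA of the first change to the first dirty block on the checkpoint queue (as of each time the queue is cleaned). All dirty data before this RBA has already been written to disk.
on-disk RBA: the address of the last position LGWR has written to in the log files.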