联系:手机/微信(+86 17813235971) QQ(107644445)
标题:ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
open数据库alert日志报ORA-00600[kcratr1_lostwrt]错误
Mon May 14 14:57:28 2012 ALTER DATABASE OPEN Mon May 14 14:57:29 2012 Beginning crash recovery of 1 threads Mon May 14 14:57:29 2012 Started redo scan Mon May 14 14:57:29 2012 Errors in file d:\oracle\admin\cqgasold\udump\cqgasold_ora_504.trc: ORA-00600: 内部错误代码,参数: [kcratr1_lostwrt], [], [], [], [], [], [], [] ORA-600 signalled during: alter database open...
查询相关SCN
同一个查询中SCN相同,省略
SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN", 2 To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file; 未选定行 SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN", 2 to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile; FILE# SCN STOP_SCN ---------- ---------------- ---------------- 1 7842987188 2 7842987188 3 7842987188 SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN", 2 to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN" 3 from v$datafile_header; FILE# SCN RESETLOGS SCN ---------- ----------------- ----------------- 1 7842991811 1 2 7842991811 1 3 7842991811 1
这里看到奇怪现象datafile scn小于datafile_header scn,数据库异常断电一般来说也不会出现这样的情况,个人猜测是错误的恢复或者使用历史控制文件导致,对于这样的现状,我先尝试着使用using backup controlfile方式恢复,结果失败.估计控制文件有异常,本着先拉起库原则,重建控制文件.
进行完全恢复
SQL> recover database; ORA-00283: 恢复会话因错误而取消 ORA-00600: 内部错误代码,参数: [3020], [8388617], [1], [23403], [25], [112],[], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 9) ORA-10564: tablespace UNDOTBS1 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF' ORA-10560: block type 'KTU SMU HEADER BLOCK'
尝试跳过坏块继续恢复
SQL> recover database allow 1 corruption; ORA-00283: 恢复会话因错误而取消 ORA-00600: 内部错误代码,参数: [3020], [8388610], [1], [23403], [2264], [16],[], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 2) ORA-10564: tablespace UNDOTBS1 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF' ORA-10560: block type 'KTFB Bitmapped File Space Header'
使用dbv检查坏块数量
C:\>dbv file='d:\oracle\oradata\cqgasold\undo_1.dbf' blocksize=8192 DBVERIFY: Release 9.2.0.5.0 - Production on 星期二 5月 15 19:43:42 2012 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. DBVERIFY - 验证正在开始 : FILE = d:\oracle\oradata\cqgasold\undo_1.dbf DBV-00200: 块, dba 8388617, 已经标记为崩溃 汇入的页369 - 可能是介质损坏 *** Corrupt block relative dba: 0x00800171 (file 2, block 369) Fractured block found during dbv: Data in bad block - type: 2 format: 2 rdba: 0x00800171 last change scn: 0x0001.d37c57db seq: 0x1 flg: 0x04 consistency value in tail: 0x4e930260 check value in block header: 0x8202, computed block checksum: 0x4e5f spare1: 0x0, spare2: 0x0, spare3: 0x0 *** 汇入的页417 - 可能是介质损坏 *** Corrupt block relative dba: 0x008001a1 (file 2, block 417) Fractured block found during dbv: Data in bad block - type: 2 format: 2 rdba: 0x008001a1 last change scn: 0x0001.d37c53d4 seq: 0x2 flg: 0x04 consistency value in tail: 0x4b6b0201 check value in block header: 0x6ae7, computed block checksum: 0x5abc spare1: 0x0, spare2: 0x0, spare3: 0x0 *** ………… --类此记录很多,我放弃了跳过坏块修复的方法
恢复过程中提示坏块数据库文件离线恢复
SQL> alter database datafile 'd:\oracle\oradata\cqgasold\undo_1.dbf' offline; 数据库已更改。 SQL> recover database; 完成介质恢复。 SQL> alter database open; alter database open * ERROR 位于第 1 行: ORA-00604: 递归 SQL 层 1 出现错误 ORA-00376: 此时无法读取文件 2 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF'
到了这一步,根据经验,数据库被open的可能性很多了,很有可能是open以后因为smon回滚导致数据库down
查看日志,屏蔽回滚段,完成恢复
Tue May 15 19:59:52 2012 alter database open Tue May 15 19:59:52 2012 Beginning crash recovery of 1 threads Tue May 15 19:59:52 2012 Started redo scan Tue May 15 19:59:52 2012 Completed redo scan 323 redo blocks read, 82 data blocks need recovery Tue May 15 19:59:52 2012 Started recovery at Thread 1: logseq 23404, block 3, scn 0.0 Recovery of Online Redo Log: Thread 1 Group 4 Seq 23404 Reading mem 0 Mem# 0 errs 0: F:\ORACLE\ORADATA\LOGCQGASOLD4.ORA Tue May 15 19:59:52 2012 Completed redo application Tue May 15 19:59:52 2012 Ended recovery at Thread 1: logseq 23404, block 326, scn 1.3548264979 82 data blocks read, 82 data blocks written, 323 redo blocks read Crash recovery completed successfully Tue May 15 19:59:53 2012 Thread 1 advanced to log sequence 23405 Thread 1 opened at log sequence 23405 Current log# 2 seq# 23405 mem# 0: D:\ORACLE\ORADATA\CQGASOLD\REDO02.LOG Successful open of redo thread 1 Tue May 15 19:59:53 2012 SMON: enabling cache recovery SMON: enabling tx recovery Tue May 15 19:59:54 2012 Database Characterset is ZHS16GBK Tue May 15 19:59:55 2012 replication_dependency_tracking turned off (no async multimaster replication found) ORA-604 signalled during: alter database open... Tue May 15 19:59:56 2012 SMON: about to recover undo segment 1 SMON: mark undo segment 1 as needs recovery SMON: about to recover undo segment 2 SMON: mark undo segment 2 as needs recovery SMON: about to recover undo segment 3 SMON: mark undo segment 3 as needs recovery SMON: about to recover undo segment 4 SMON: mark undo segment 4 as needs recovery SMON: about to recover undo segment 5 SMON: mark undo segment 5 as needs recovery SMON: about to recover undo segment 6 SMON: mark undo segment 6 as needs recovery SMON: about to recover undo segment 7 SMON: mark undo segment 7 as needs recovery SMON: about to recover undo segment 8 SMON: mark undo segment 8 as needs recovery SMON: about to recover undo segment 9 SMON: mark undo segment 9 as needs recovery SMON: about to recover undo segment 10 SMON: mark undo segment 10 as needs recovery Tue May 15 20:00:37 2012 Shutting down instance (abort)
看到这里,可以大概确定是因为undo文件离线,导致回滚段异常.
这个问题,基本上可以确定通过隐含参数屏蔽回滚段,然后open数据库,重建undo删除异常undo,数据库恢复完成。
600[3020]恢复过程alert日志记录
ORA-600 [3020] “Stuck Recovery” [ID 30866.1]