标签归档:ORA-600 krr_parse_3

近1万个数据文件的恢复case

朋友介绍一个恢复case,数据库发生过硬件故障,做过硬件恢复之后,数据库无法正常启动.我恢复的已经不是第一现场,客户和我反馈说找过三批人进行恢复,都没有正常打开数据库.数据库整体不大(1T左右),但是数据文件近1万个(9895个数据文件),我看了下alert日志,主要报错有:
ORA-600 kcratr_scan_lastbwr,该错误比较常见,一般是由于坏块或者redo和数据文件不匹配导致,在某些情况下recover下就可以解决,有些时候不行,看人品

Mon Feb 17 15:51:15 2025
Started redo scan
Hex dump of (file 3, block 240) in trace file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc
Reading datafile 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF' for corruption at rdba: 0x00c000f0 (file 3, block 240)
Reread (file 3, block 240) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 240 (rdba 0xc000f0)
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc  (incident=293029):
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_293029\orcl_ora_10508_i293029.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Mon Feb 17 15:51:22 2025
Sweep [inc2][293029]: completed
Mon Feb 17 15:51:25 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []

ORA-600 krr_parse_3错误,官方没有查询到资料,但是从报错的位置分析,应该和redo的应用有直接关系

Thu Feb 20 11:45:03 2025
ALTER DATABASE RECOVER  datafile 116  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321616):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 116  ...
ALTER DATABASE RECOVER  datafile 1168  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321617):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1168  ...

上述两个错误,由于数据库部分文件被offline,而且屏蔽一致性打开等操作,绕过了上述的两个ORA-600错误,现在停留在ORA-00604 ORA-00376 ORA-01110故障导致数据库无法打开的情况,该错误是由于数据库启动过程中有事务,需要使用被offline的undo文件.

Fri Feb 21 07:42:37 2025
minact-scn: got error during useg scan e:376 usn:1
minact-scn: useg scan erroring out with error e:376
Fri Feb 21 07:44:02 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m007_11464.trc:
ORA-51106: 由于出错, 检查无法完成。请查看下面的错误
ORA-48223: 已请求中断 - 提取已中止 - 返回代码 [12751] [HM_FINDING]
Fri Feb 21 07:45:12 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_14108.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00376: 此时无法读取文件 3
ORA-01110: 数据文件 3: 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF'

分析数据库文件状态,有25个数据文件被offline,而且这些文件的resetlogs信息均不对(截取了部分文件)

SQL> set lines 150
SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
OFFLINE        2025-02-11 15:27:00                      YES            1909526545               22
OFFLINE        2025-02-17 17:24:14                      YES            1909551234                2
OFFLINE        2025-02-17 17:27:35                      NO             1909551234                1
ONLINE         2025-02-22 17:29:25                      YES            2095190672             9869

wrong-resetlogs
offline
对于这种情况,最简单的解决方法就是使用开发的小工具Oracle Recovery Tools(Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)),对这些offline的文件头信息进行修改
ora-tool
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复
尝试直接open数据库

SQL> STARTUP MOUNT PFILE='D:/PFILE.TXT'
ORACLE 例程已经启动。

Total System Global Area      82309009408 bytes
Fixed Size                        2290160 bytes
Variable Size                 12884905488 bytes
Database Buffers              69256347648 bytes
Redo Buffers                    165466112 bytes
数据库装载完毕。
SQL> RECOVER DATAFILE 3;
完成介质恢复。
SQL> RECOVER datafile 6601,7043,7044,7045,7050,
   7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
  ,7188,7190,7191,7192,7244,9501 ;
完成介质恢复。
SQL> alter database datafile 3,6601,7043,7044,7045,7050,
  7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
 ,7188,7190,7191,7192,7244,9501 online;
SQL> ALTER DATABASE OPEN;

数据库已更改。

QQ20250222-184800

Sat Feb 22 18:38:26 2025
alter database mount exclusive
Sat Feb 22 18:38:26 2025
MMNL started with pid=25, OS id=7524 
Successful mount of redo thread 1, with mount id 3367723362
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Sat Feb 22 18:42:34 2025
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Successful open of redo thread 1
Sat Feb 22 18:42:34 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 22 18:42:34 2025
SMON: enabling cache recovery
[7960] Successfully onlined Undo Tablespace 12273.
Undo initialization finished serial:0 start:98760972 end:98761612 diff:640 (6 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Feb 22 18:42:41 2025
QMNC started with pid=29, OS id=8116 
Sat Feb 22 18:42:45 2025
Completed: alter database open
Sat Feb 22 18:42:47 2025
Starting background process CJQ0
Sat Feb 22 18:42:47 2025
CJQ0 started with pid=31, OS id=3264 
Sat Feb 22 18:42:47 2025
db_recovery_file_dest_size of 4977 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.

数据库已经open,后续收尾工作比较简单,不再累赘.
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复,还是比较方便的
软件下载:OraRecovery下载
使用说明:使用说明

发表在 非常规恢复 | 标签为 , , , , | 留下评论