Tag archive: ORA-600 kcratr_scan_lastbwr

A recovery case involving nearly 10,000 data files

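A friend referred a recovery case to me. The database had suffered a hardware failure and, after the hardware was repaired, would not start normally. What I worked on was no longer the original failure scene: the customer told me that three groups of people had already attempted recovery, and none had managed to open the database. The database is not large overall (about 1 TB), but it has nearly 10,000 data files (9,895). Looking through the alert log, the main errors were:
ORA-600 kcratr_scan_lastbwr. This error is fairly common and is usually caused by a corrupt block or by redo that does not match the data files. In some cases a plain recover resolves it and in others it does not; which one you get is largely a matter of luck.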

Mon Feb 17 15:51:15 2025
Started redo scan
Hex dump of (file 3, block 240) in trace file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc
Reading datafile 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF' for corruption at rdba: 0x00c000f0 (file 3, block 240)
Reread (file 3, block 240) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 240 (rdba 0xc000f0)
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc  (incident=293029):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_293029\orcl_ora_10508_i293029.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Mon Feb 17 15:51:22 2025
Sweep [inc2][293029]: completed
Mon Feb 17 15:51:25 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
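The alert log prints the corrupt block both as an rdba and as a (file, block) pair. As a minimal sketch, assuming a smallfile tablespace in which the top 10 bits of the 32-bit rdba hold the relative file number and the low 22 bits hold the block number, the mapping can be checked in Python. The function name `decode_rdba` is only illustrative and is not part of any Oracle tool.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit relative data block address into (file#, block#).

    Assumption: smallfile layout, i.e. 10 bits of relative file number
    followed by 22 bits of block number. Bigfile tablespaces use a
    different layout and are not handled here.
    """
    file_no = (rdba >> 22) & 0x3FF   # upper 10 bits
    block_no = rdba & 0x3FFFFF       # lower 22 bits
    return file_no, block_no


if __name__ == "__main__":
    # rdba value copied from the alert log above
    file_no, block_no = decode_rdba(0x00C000F0)
    print(f"file {file_no}, block {block_no}")
```

Under that assumption the decoded pair should agree with the "(file 3, block 240)" that Oracle prints next to the rdba.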

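ORA-600 krr_parse_3. I could not find any official documentation for this error, but judging from where it is raised, it is most likely directly related to applying redo.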

Thu Feb 20 11:45:03 2025
ALTER DATABASE RECOVER  datafile 116  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321616):
ORA-00600: internal error code, arguments: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 116  ...
ALTER DATABASE RECOVER  datafile 1168  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321617):
ORA-00600: internal error code, arguments: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1168  ...

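Both ORA-600 errors above had already been bypassed, because some data files had been taken offline and the database had been opened with consistency checks disabled. The database was now stuck on ORA-00604, ORA-00376 and ORA-01110 and could not open: a transaction that has to be processed during startup needs the undo file that had been taken offline.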

Fri Feb 21 07:42:37 2025
minact-scn: got error during useg scan e:376 usn:1
minact-scn: useg scan erroring out with error e:376
Fri Feb 21 07:44:02 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m007_11464.trc:
ORA-51106: check failed to complete due to an error.  See error below
ORA-48223: Interrupt Requested - Fetch Aborted - Return Code [12751] [HM_FINDING]
Fri Feb 21 07:45:12 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_14108.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF'

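Checking the data file status showed that 25 data files were offline, and that the resetlogs information in all of them was wrong (only some of the files are shown):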

SQL> set lines 150
SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
OFFLINE        2025-02-11 15:27:00                      YES            1909526545               22
OFFLINE        2025-02-17 17:24:14                      YES            1909551234                2
OFFLINE        2025-02-17 17:27:35                      NO             1909551234                1
ONLINE         2025-02-22 17:29:25                      YES            2095190672             9869

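The simplest fix in this situation is to use Oracle Recovery Tools, a small utility I developed (see the article "Oracle Recovery Tools: one-click fix for ORA-00376 ORA-01110 (file offline)"), to modify the header information of the offline files.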
Then try opening the database directly:

SQL> STARTUP MOUNT PFILE='D:/PFILE.TXT'
ORACLE instance started.

Total System Global Area      82309009408 bytes
Fixed Size                        2290160 bytes
Variable Size                 12884905488 bytes
Database Buffers              69256347648 bytes
Redo Buffers                    165466112 bytes
Database mounted.
SQL> RECOVER DATAFILE 3;
Media recovery complete.
SQL> RECOVER datafile 6601,7043,7044,7045,7050,
   7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
  ,7188,7190,7191,7192,7244,9501 ;
Media recovery complete.
SQL> alter database datafile 3,6601,7043,7044,7045,7050,
  7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
 ,7188,7190,7191,7192,7244,9501 online;
SQL> ALTER DATABASE OPEN;

Database altered.


Sat Feb 22 18:38:26 2025
alter database mount exclusive
Sat Feb 22 18:38:26 2025
MMNL started with pid=25, OS id=7524 
Successful mount of redo thread 1, with mount id 3367723362
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Sat Feb 22 18:42:34 2025
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Successful open of redo thread 1
Sat Feb 22 18:42:34 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 22 18:42:34 2025
SMON: enabling cache recovery
[7960] Successfully onlined Undo Tablespace 12273.
Undo initialization finished serial:0 start:98760972 end:98761612 diff:640 (6 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Feb 22 18:42:41 2025
QMNC started with pid=29, OS id=8116 
Sat Feb 22 18:42:45 2025
Completed: alter database open
Sat Feb 22 18:42:47 2025
Starting background process CJQ0
Sat Feb 22 18:42:47 2025
CJQ0 started with pid=31, OS id=3264 
Sat Feb 22 18:42:47 2025
db_recovery_file_dest_size of 4977 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.

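The database is now open. The remaining cleanup work is straightforward and is not covered here.
For this kind of failure, where data files are offline and the archived logs they need are missing, Oracle Recovery Tools provides a quick, foolproof recovery and is quite convenient.
Software download: OraRecovery download
Usage instructions: usage instructions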


Handling ORA-600 kcratr_scan_lastbwr when opening the database

After a power outage, a normal open of the database failed with ORA-600 kcratr_scan_lastbwr:

Wed Jan 17 18:23:26 2024
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1028618590
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jan 17 18:23:30 2024
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Hex dump of (file 3, block 144) in trace file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc
Reading datafile '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf' 
  for corruption at rdba: 0x00c00090 (file 3, block 144)
Reread (file 3, block 144) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 144 (rdba 0xc00090)
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc  (incident=672241):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
Incident details in: /database/oracle/app/oracle/diag/rdbms/xff/xff/incident/incdir_672241/xff_ora_66361_i672241.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_66361.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

An attempt to recover the database then failed with ORA-600 [3020]:

Wed Jan 17 18:28:38 2024
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 96 slaves
Wed Jan 17 18:28:41 2024
Recovery of Online Redo Log: Thread 1 Group 2 Seq 410864 Reading mem 0
  Mem# 0: /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log
Wed Jan 17 18:28:42 2024
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Slave exiting with ORA-600 exception
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_pr19_68212.trc:
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_68038.trc  (incident=672243):
ORA-00600: internal error code, arguments: [3020], [3], [240], [12583152], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 240, file offset is 1966080 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 3: '/database/oracle/app/oracle/oradata/xff/datafile/o1_mf_undotbs1_hct7001s_.dbf'
ORA-10560: block type 'KTU SMU HEADER BLOCK'
Wed Jan 17 18:28:43 2024
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Recovery Slave PR19 previously exited with exception 600
Wed Jan 17 18:28:43 2024
Sweep [inc][672865]: completed
Media Recovery failed with error 448
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_pr00_68115.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
ORA-600 signalled during: ALTER DATABASE RECOVER  database  ...
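The arguments of ORA-600 [3020] and the ORA-10567 line describe the same block in three ways. A small sketch, assuming the usual reading of the arguments (file number, block number, data block address) and a smallfile rdba layout of 10 file bits plus 22 block bits, cross-checks them and derives the block size implied by the file offset. The function name `check_3020` is hypothetical.

```python
def check_3020(file_no, block_no, dba, file_offset):
    """Cross-check ORA-600 [3020] arguments against ORA-10567.

    Assumptions: the arguments are (file#, block#, dba), the dba uses
    the smallfile layout, and file_offset equals block# * block size.
    Returns (arguments_agree, implied_block_size_or_None).
    """
    dba_file = (dba >> 22) & 0x3FF
    dba_block = dba & 0x3FFFFF
    agree = (dba_file, dba_block) == (file_no, block_no)

    block_size = None
    if block_no > 0 and file_offset % block_no == 0:
        block_size = file_offset // block_no
    return agree, block_size


if __name__ == "__main__":
    # Values copied from the log above:
    # ORA-00600 [3020], [3], [240], [12583152]
    # ORA-10567 ... file offset is 1966080 bytes
    print(check_3020(3, 240, 12583152, 1966080))
```

With the values from this log the three descriptions agree, and the offset corresponds to an 8192-byte block size.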

Hidden parameters were then added in an attempt to force the database open:

alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 6165467436
Clearing online redo logfile 1 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log
Clearing online log 1 of thread 1 sequence number 0
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log
Clearing online log 2 of thread 1 sequence number 0
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_3_hct740k7_.log
Clearing online log 3 of thread 1 sequence number 0
Clearing online redo logfile 3 complete
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log: Thread 1 Group 1 was previously cleared
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_2_hct740hq_.log: Thread 1 Group 2 was previously cleared
Online log /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_3_hct740k7_.log: Thread 1 Group 3 was previously cleared
Fri Jan 19 09:24:59 2024
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 6165467439, threshold SCN value is 0
If you have not previously reported this warning on this database,
  please notify Oracle Support so that additional diagnosis can be performed.
Fri Jan 19 09:24:59 2024
Assigning activation ID 1028784413 (0x3d52011d)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /database/oracle/app/oracle/oradata/xff/onlinelog/o1_mf_1_hct740fp_.log
Successful open of redo thread 1
Fri Jan 19 09:24:59 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jan 19 09:24:59 2024
SMON: enabling cache recovery
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc  (incident=1344255):
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc:
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Errors in file /database/oracle/app/oracle/diag/rdbms/xff/xff/trace/xff_ora_84860.trc:
ORA-00600: internal error code, arguments: [2662], [1], [1870500147], [1], [1870515285], [12583040]
Error 600 happened during db open, shutting down database
USER (ospid: 84860): terminating the instance due to error 600
Instance terminated by USER, pid = 84860
ORA-1092 signalled during: alter database open resetlogs...

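The customer got this far on their own and could go no further. After I took over, the rest was fairly simple: adjust the database SCN so that the database can be opened, then deal with the damaged undo and the affected objects. See these earlier articles on similar cases:
Fast recovery from ORA-600 2662 with the Patch SCN tool
Handling ORA-600 2662 caused by hardware failure
Recovering from ORA-00600 [2662] and ORA-00600 [4194]
Further reading: ORA-600 2662 articles on the 惜分飞 blog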
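How far the SCN has to move can be read from the ORA-600 [2662] arguments themselves. The sketch below assumes the commonly documented meaning of the five arguments (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, data block address) and the smallfile rdba layout. The function names are illustrative, and the script only does arithmetic; it does not change anything in a database.

```python
def full_scn(wrap: int, base: int) -> int:
    """Combine SCN wrap and 32-bit base into a single number."""
    return (wrap << 32) | base


def analyze_2662(cur_wrap, cur_base, dep_wrap, dep_base, dba):
    """Interpret ORA-600 [2662] arguments (assumed meaning, see lead-in)."""
    current = full_scn(cur_wrap, cur_base)
    dependent = full_scn(dep_wrap, dep_base)
    return {
        "current_scn": current,
        "dependent_scn": dependent,
        "gap": dependent - current,      # how far the database SCN is behind
        "file": (dba >> 22) & 0x3FF,     # where the dependent SCN was found
        "block": dba & 0x3FFFFF,
    }


if __name__ == "__main__":
    # ORA-00600 [2662], [1], [1870500147], [1], [1870515285], [12583040]
    for key, value in analyze_2662(1, 1870500147, 1, 1870515285, 12583040).items():
        print(f"{key}: {value}")
```

Under this reading the current SCN works out to 6165467443, close to the 6165467439 reported during the open, and the dependent SCN is 15138 higher and comes from a block in file 3, the undo data file. The database SCN therefore has to be moved past the dependent SCN before the open can succeed.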


ora-600 kcratr_scan_lastbwr

A customer's database failed to start after a power outage, reporting ORA-600 kcratr_scan_lastbwr:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Hex dump of (file 4, block 3952129) in trace file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4500.trc
Reading datafile 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS01.DBF' for corruption at rdba:0x013c4e01(file 4,block 3952129)
Reread (file 4, block 3952129) found same corrupt data (logically corrupt)
Write verification failed for File 4 Block 3952129 (rdba 0x13c4e01)
Fri Feb 18 10:16:34 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4500.trc  (incident=388961):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in:D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_388961\orcl_ora_4500_i388961.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4500.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_4500.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...

According to MOS, this problem occurs mainly in versions earlier than 11.2.0.2, but this case occurred on an 11.2.0.3 database.

Following the description in ORA-600 [kcratr_scan_lastbwr] (Doc ID 1267231.1), I ran a recover and then opened the database directly, with zero data loss.
