标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (102)
- 数据库 (1,683)
- DB2 (22)
- MySQL (73)
- Oracle (1,545)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (7)
- Oracle ASM (68)
- Oracle Bug (8)
- Oracle RAC (53)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (565)
- Oracle安装升级 (92)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (79)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- ORA-15411: Failure groups in disk group DATA have different number of disks.
- 断电引起的ORA-08102: 未找到索引关键字, 对象号 39故障处理
- ORA-00227: corrupt block detected in control file
- 手工删除19c rac
- 解决oracle数据文件路径有回车故障
- .wstop扩展名勒索数据库恢复
- Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)
- OGG-02771 Input trail file format RELEASE 19.1 is different from previous trail file form at RELEASE 11.2.
- OGG-02246 Source redo compatibility level 19.0.0 requires trail FORMAT 12.2 or higher
- GoldenGate 19安装和打patch
- dd破坏asm磁盘头恢复
- 删除asmlib磁盘导致磁盘组故障恢复
- Kylin Linux 安装19c
- ORA-600 krse_arc_complete.4
- Oracle 19c 202410补丁(RUs+OJVM)
- ntfs MFT损坏(ntfs文件系统故障)导致oracle异常恢复
- .mkp扩展名oracle数据文件加密恢复
- 清空redo,导致ORA-27048: skgfifi: file header information is invalid
- A_H_README_TO_RECOVER勒索恢复
- 通过alert日志分析客户自行对一个数据库恢复的来龙去脉和点评
标签归档:ORA-600 KCLCHKBLK_4
某医院存储掉线导致Oracle数据库故障恢复
xx医院存储突然掉线,导致数据库异常,现场工程师折腾了一天,问题依旧没有解决,无奈之下找到我们,希望我们能够帮忙恢复数据库.
启动报ORA-00600[2131]错误
Fri Nov 06 14:50:59 2015 ALTER DATABASE MOUNT This instance was first to mount Fri Nov 06 14:50:59 2015 ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.4.4)(PORT=1521))' SCOPE=MEMORY SID='xifenfei1'; NOTE: Loaded library: System SUCCESS: diskgroup DATA was mounted NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13221.trc (incident=191085): ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], [] Incident details in: /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_191085/xifenfei1_ora_13221_i191085.trc Fri Nov 06 14:51:10 2015 Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. ORA-600 signalled during: ALTER DATABASE MOUNT...
出现该错误的原因是由于:We are attempting to write a controlfile checkpoint progress record, but find we do not have the progress record generating this exception.由于控制文件异常导致,出现此类情况,我们一般使用单个控制文件一次尝试,如果都不可以考虑重建控制文件
由于坏块(逻辑/物理)导致数据库实例恢复无法进行
Beginning crash recovery of 2 threads Started redo scan kcrfr_rnenq: use log nab 393216 kcrfr_rnenq: use log nab 2 Completed redo scan read 4427 KB redo, 500 data blocks need recovery Started redo application at Thread 1: logseq 5731, block 391398 Thread 2: logseq 4252, block 520815 Recovery of Online Redo Log: Thread 1 Group 2 Seq 5731 Reading mem 0 Mem# 0: +DATA/xifenfei/onlinelog/group_2.266.835331047 Recovery of Online Redo Log: Thread 2 Group 8 Seq 4252 Reading mem 0 Mem# 0: +DATA/xifenfei/onlinelog/group_8.331.835330421 Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc (incident=197486): ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], [] Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197486/xifenfei1_ora_13770_i197486.trc Fri Nov 06 15:03:09 2015 Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc (incident=197487): ORA-01578: ORACLE data block corrupted (file # 2, block # 65207) ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753' ORA-10564: tablespace SYSAUX ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753' ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045 ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], [] Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197487/xifenfei1_ora_13770_i197487.trc Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc: ORA-01578: ORACLE data block corrupted (file # 2, block # 65207) ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753' ORA-10564: tablespace SYSAUX ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753' ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045 ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], [] Recovery of Online Redo Log: Thread 2 Group 3 Seq 4253 Reading mem 0 Mem# 0: +DATA/xifenfei/onlinelog/group_3.332.835330505 Hex dump of (file 14, block 62536) in trace file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc Reading datafile '+DATA/xifenfei/datafile/ht01.dbf' for corruption at rdba: 0x0380f448 (file 14, block 62536) Reread (file 14, block 62536) found same corrupt data (logically corrupt) RECOVERY OF THREAD 1 STUCK AT BLOCK 62536 OF FILE 14 Fri Nov 06 15:03:13 2015 Abort recovery for domain 0 Aborting crash recovery due to error 1172 Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc: ORA-01172: recovery of thread 1 stuck at block 62536 of file 14 ORA-01151: use media recovery to recover block, restore backup if needed Abort recovery for domain 0 Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc: ORA-01172: recovery of thread 1 stuck at block 62536 of file 14 ORA-01151: use media recovery to recover block, restore backup if needed ORA-1172 signalled during: ALTER DATABASE OPEN...
查看资料发现和Bug 14301592 – Several errors by corrupt blocks shifted by 2 bytes in buffer cache during recovery caused by INDEX redo apply,可以通过ALLOW 1 CORRUPTION临时解决
使用ALLOW 1 CORRUPTION进行恢复,出现ORA-07445[kdxlin]错误
Specify log: {<RET>=suggested | filename | AUTO | CANCEL} +DATA/xifenfei/onlinelog/group_3.332.835330505 ORA-00279: change 700860458 generated at 11/05/2015 21:20:15 needed for thread 1 ORA-00289: suggestion : +ARCHIVE/xifenfei/xifenfei_1_5731_835324843.arc ORA-00280: change 700860458 for thread 1 is in sequence #5731 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} +DATA/xifenfei/onlinelog/group_2.266.835331047 ORA-00283: recovery session canceled due to errors ORA-10562: Error occurred while applying redo to data block (file# 2, block# 70104) ORA-10564: tablespace SYSAUX ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753' ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 82289 ORA-00607: Internal error occurred while making a change to a data block ORA-00602: internal programming exception ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB572] [Address not mapped to object] [] ORA-01112: media recovery not started
ORA-07445[kdxlin()+4088]未找到类似说明,到了这一步,无法简单的恢复成功,只能通过设置隐含参数跳过实例恢复,尝试resetlog库
通过设置_allow_resetlogs_corruption参数继续恢复
SQL> startup pfile='/tmp/pfile.ora' mount; ORACLE instance started. Total System Global Area 7315603456 bytes Fixed Size 2267384 bytes Variable Size 2566915848 bytes Database Buffers 4731174912 bytes Redo Buffers 15245312 bytes Database mounted. SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], [] Process ID: 13563 Session ID: 157 Serial number: 3
alert日志报错
Fri Nov 06 19:26:39 2015 SMON: enabling cache recovery Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc (incident=319140): ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], [] Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_319140/xifenfei1_ora_13563_i319140.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Redo thread 2 internally disabled at seq 1 (CKPT) ARC1: Archiving disabled thread 2 sequence 1 Archived Log entry 9956 added for thread 2 sequence 1 ID 0x0 dest 1: ARC3: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], [] Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], [] Error 600 happened during db open, shutting down database USER (ospid: 13563): terminating the instance due to error 600 Fri Nov 06 19:26:42 2015 Instance terminated by USER, pid = 13563 ORA-1092 signalled during: alter database open resetlogs... opiodr aborting process unknown ospid (13563) as a result of ORA-1092 Fri Nov 06 19:26:42 2015 ORA-1092 : opitsk aborting process
这里是比较熟悉的ora-600[kclchkblk_4]错误,和ora-600[2662]错误类似,需要调整scn,由于数据库版本为11.2.0.4,无法使用常规方法调整scn,在修改控制文件,oradebug,bbed方法可供选择
使用oradebug方法处理
因为是asm环境,其他方法处理起来都相对麻烦
[oracle@wisetop1 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Fri Nov 6 19:30:59 2015 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to an idle instance. SQL> startup pfile='/tmp/pfile.ora' mount; ORACLE instance started. Total System Global Area 7315603456 bytes Fixed Size 2267384 bytes Variable Size 2566915848 bytes Database Buffers 4731174912 bytes Redo Buffers 15245312 bytes Database mounted. SQL> oradebug setmypid Statement processed. SQL> oradebug poke 0x06001AE70 4 0x2FAF0800 BEFORE: [06001AE70, 06001AE74) = 00000000 AFTER: [06001AE70, 06001AE74) = 2FAF0800 SQL> alter database open; Database altered.
至此数据库open成功,后续就是处理一些坏块的工作,并建议客户逻辑重建库.
redo异常 ORA-600 kclchkblk_4 故障恢复
朋友和我说,他们数据库由于存储控制器异常,导致数据库无法正常启动.
数据库recover database 提示需要已经覆盖的redo
-bash-3.2$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:23:35 2014 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production With the Partitioning, Real Application Clusters, Data Mining and Real Application Testing options SQL> recover database; ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf ORA-00280: change 1639063379 for thread 2 is in sequence #29919 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled. SQL> select thread#,group#,sequence# from v$log; THREAD# GROUP# SEQUENCE# ---------- ---------- ---------- 1 1 30261 1 2 30262 1 3 30258 1 4 30259 1 5 30260 2 6 29966 2 7 29967 2 8 29968 2 9 29969 2 10 29970 10 rows selected. SQL> select group#,member from v$logfile; GROUP# MEMBER ---------- -------------------------------------------------- 1 /dev/md/oradg/rdsk/d115 2 /dev/md/oradg/rdsk/d116 3 /dev/md/oradg/rdsk/d117 4 /dev/md/oradg/rdsk/d118 5 /dev/md/oradg/rdsk/d119 6 /dev/md/oradg/rdsk/d120 7 /dev/md/oradg/rdsk/d121 8 /dev/md/oradg/rdsk/d122 9 /dev/md/oradg/rdsk/d123 10 /dev/md/oradg/rdsk/d124 10 rows selected. SQL> recover database; ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf ORA-00280: change 1639063379 for thread 2 is in sequence #29919 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled.
数据库没归档,redo覆盖,无法继续恢复,使用_allow_resetlogs_corruption屏蔽前滚,继续恢复
SQL> startup mount pfile='/tmp/pfile.txt'; ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. SQL> recover database until cancel; ORA-00279: change 1639063379 generated at 11/01/2014 12:58:20 needed for thread 1 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch1_30262_790965041.dbf ORA-00280: change 1639063379 for thread 1 is in sequence #30262 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '/dev/md/oradg/rdsk/d105' ORA-01112: media recovery not started SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced
查看alert日志
Mon Nov 03 17:37:11 CST 2014 Completed crash recovery at Thread 2: logseq 1, block 3, scn 1639083385 0 data blocks read, 0 data blocks written, 1 redo blocks read Picked broadcast on commit scheme to generate SCNs Mon Nov 03 17:37:12 CST 2014 Thread 2 advanced to log sequence 2 (thread open) Thread 2 opened at log sequence 2 Current log# 7 seq# 2 mem# 0: /dev/md/oradg/rdsk/d121 Successful open of redo thread 2 Mon Nov 03 17:37:12 CST 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Mon Nov 03 17:37:13 CST 2014 SMON: enabling cache recovery Mon Nov 03 17:37:13 CST 2014 Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], [] Mon Nov 03 17:37:14 CST 2014 Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], [] Mon Nov 03 17:37:14 CST 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 15700 ORA-1092 signalled during: alter database open...
参考:In 10.1.0.2: ORA-600 [kclchkblk_4] and ORA-600 [2662] After Recovery of Database (Doc ID 275902.1),删除tempfile继续打开数据库
SQL> alter database tempfile '/dev/md/oradg/rdsk/d109' drop; Database altered. SQL> c/109/110 1* alter database tempfile '/dev/md/oradg/rdsk/d110' drop SQL> / Database altered. SQL> alter database open; alter database open * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced
数据库依然报ORA-00600kclchkblk_4]错误,因此解决该问题选择使用bbed修改文件头scn来完成,具体参考类似文章:使用bbed修复损坏datafile header
-bash-3.2$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:41:17 2014 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to an idle instance. SQL> startup pfile='/tmp/pfile.txt' mount; ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. SQL> alter database open; Database altered. SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d109' size 8388608000 autoextend off; Tablespace altered. SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d110' size 8388608000 autoextend off; Tablespace altered. SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. Database opened.
至此数据库恢复完成,建议逻辑方式重建数据库
rac redo log file被意外覆盖数据库恢复
当前日志被覆盖导致错误
朋友的一客户在一套rac上包含了两个数据库,其其中一个库增加redo group时候,覆盖了另外一个库的redo,悲剧的是刚好是current redo
Wed May 16 17:03:05 2012 ALTER DATABASE OPEN This instance was first to open Wed May 16 17:03:09 2012 Beginning crash recovery of 2 threads parallel recovery started with 15 processes Wed May 16 17:03:11 2012 Started redo scan Wed May 16 17:03:11 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc: ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2' ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1' Abort recovery for domain 0 Wed May 16 17:03:11 2012 Aborting crash recovery due to error 305 Wed May 16 17:03:11 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc: ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2' ORA-00305: log 14 of thread 1 inconsistent; belongs to another database ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1' ORA-305 signalled during: ALTER DATABASE OPEN... Wed May 16 17:03:13 2012 Shutting down instance (abort)
使用_allow_resetlogs_corruption= TRUE进行恢复
Wed May 16 18:16:48 2012 SMON: enabling cache recovery Wed May 16 18:16:48 2012 Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads Wed May 16 18:16:48 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], [2522], [18446744072024247666], [], [], [] Wed May 16 18:16:50 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], [2522], [18446744072024247666], [], [], [] Wed May 16 18:16:50 2012 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 2105454 ORA-1092 signalled during: alter database open resetlogs...
ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block is too high.
The same reason caused the ORA-600[2662]s in the alert logs.
因为是临时文件的scn太大的问题,那就比较好解决:
启动数据库到mount状态,查询出来相关temp file,然后drop掉.
ORA-00600[6856]
Wed May 16 20:25:16 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], [] ORACLE Instance odsdb1 (pid = 16) - Error 600 encountered while recovering transaction (10, 8) on object 7162533. Wed May 16 20:25:16 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc: ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
这里的4193和4194是比较熟悉的,根据这里的提示猜测6856也是和undo有关系
ORA-600[6856]SMON is trying to recover a dead transaction.
But the undo application runs into an internal error (trying to delete a row that is already deleted).
因为smon回滚的时候出现上面错误,解决方法是想办法终止回滚,使用event=”10513 trace name context forever, level 2″.
ORA-00600[4193]/ORA-00600[4194]
Wed May 16 20:25:17 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], [] Wed May 16 20:25:18 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00339: archived log does not contain any redo ORA-00334: archived log: '/dev/rods_redo2_1_1' ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], [] Wed May 16 20:25:56 2012 Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc: ORA-00600: internal error code, arguments: [4193], [22248], [22252], [], [], [], [], []
太常见错误,不再做说明,虽然使用event是的库open成功,因为部分回滚段有问题,该错误还是会出现(还是喜欢直接屏蔽回滚段)
ORA-00600[ktpridestroy2]
Wed May 16 20:36:26 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2101296.trc: ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], []
This error could be the result of a corruption and involves the parallel rollback that SMON enables each startup.
解决:fast_start_parallel_rollback=false
ORA-00600[kturacf1]/ORA-00600[kcbgcur_9]
Wed May 16 20:49:15 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_2007088.trc: ORA-00600: internal error code, arguments: [kturacf1], [2097152], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [kcbgcur_9], [780140563], [4], [4294959056], [2097152], [], [], []
ORA-00600[kturacf1]错误未查询到原因
ORA-00600[kcbgcur_9]错误原因可能是:Buffers are pinned in a specific class order to prevent internal deadlocks.
因为这两个错误是job产生非致命错误,在这次的处理过程中可以忽略
ORA-00600[4097]
Wed May 16 21:05:05 2012 Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_1716282.trc: ORA-12012: error on auto execute of job 6603 ORA-20001: ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], [] ORA-06512: at "EPBI.UP_SYSLOG_ONLINE_USER", line 141 ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
When an instance has a rollback segment offline and the instance crashes, or
the user does a shutdown abort, the rollback segment wrap number does not get
updated. If that segment is then dropped and recreated immediately after the
instance is restarted, the wrap number could be lower than existing wrap
numbers. This will cause the ORA-600[4097] to occur in subsequent
transactions using Rollback.
这个错误也是因为回滚段wrap number未被及时更新导致的异常.
总结这次恢复过程
1.因当前redo丢失,使用隐含参数打开库,然后出现ORA-600[KCLCHKBLK_4](这个错误比较少见,更何况这个库是10.2.0.4)
2.undo出现问题出现ORA-00600[6856]错误不是很多见
3.接下来都是一些列undo导致的问题,其实如果开始就直接使用隐含参数删除掉有问题回滚段,效果可能会比event好.
4.因为部分trace文件没有拿到,未做深入分析,文章列出整体恢复思路
5.本次恢复的数据库版本是10.2.0.4,根据mos描述,很多错误应不会发生,但是实际还是发生了,MOS的版本范围,不要太看重.
6.其实这篇文章的本质不是展示恢复过程,而是再一次提醒:操作数据库慎重,特别是一台机器上多套库.