联系:手机/微信(+86 17813235971) QQ(107644445)
标题:SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失
Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc: ORA-00313: open failed for members of log group 7 of thread 1 ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log' ORA-27037: unable to obtain file status SVR4 Error: 2: No such file or directory Additional information: 3 Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc: ORA-00313: open failed for members of log group 7 of thread 1 ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log' ORA-27037: unable to obtain file status SVR4 Error: 2: No such file or directory Additional information: 3 Thu Nov 20 11:28:39 2014 LGWR: terminating instance due to error 313 Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc: ORA-00313: open failed for members of log group of thread Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc: ORA-00313: open failed for members of log group of thread Instance terminated by LGWR, pid = 1404
尝试clear redo文件方式恢复
Thu Nov 20 13:04:16 2014 alter database clear logfile group 9 Thu Nov 20 13:04:16 2014 ORA-1624 signalled during: alter database clear logfile group 9... Thu Nov 20 13:04:45 2014 alter database clear logfile group 9 Thu Nov 20 13:04:46 2014 ORA-1624 signalled during: alter database clear logfile group 9... Thu Nov 20 13:04:59 2014 alter database clear unarchived logfile group 9 Thu Nov 20 13:04:59 2014 ORA-1624 signalled during: alter database clear unarchived logfile group 9... Thu Nov 20 13:05:00 2014 alter database clear unarchived logfile group 9 Thu Nov 20 13:05:00 2014 ORA-1624 signalled during: alter database clear unarchived logfile group 9...
不完全恢复resetlogs尝试打开数据库
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup controlfile ... Thu Nov 20 13:49:01 2014 ALTER DATABASE RECOVER CONTINUE DEFAULT Thu Nov 20 13:49:02 2014 Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ... Thu Nov 20 13:49:02 2014 ALTER DATABASE RECOVER CONTINUE DEFAULT Thu Nov 20 13:49:02 2014 Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ... Thu Nov 20 13:49:02 2014 ALTER DATABASE RECOVER CANCEL Thu Nov 20 13:49:03 2014 Media Recovery Canceled Completed: ALTER DATABASE RECOVER CANCEL Thu Nov 20 13:49:33 2014 alter database open resetlogs Thu Nov 20 13:49:34 2014 ORA-1113 signalled during: alter database open resetlogs...
使用隐含参数
_allow_resetlogs_corruption= TRUE
进行不完全恢复,尝试open数据库报ORA-600 4000错误
Thu Nov 20 14:35:02 2014 ALTER DATABASE MOUNT Thu Nov 20 14:35:07 2014 Setting recovery target incarnation to 2 Thu Nov 20 14:35:07 2014 Successful mount of redo thread 1, with mount id 4039504598 Thu Nov 20 14:35:07 2014 Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Thu Nov 20 14:40:33 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 14:40:33 2014 Media Recovery Start Thu Nov 20 14:40:33 2014 Media Recovery failed with error 1610 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 14:41:23 2014 ALTER DATABASE RECOVER database using backup controlfile until cancel Thu Nov 20 14:43:08 2014 alter database open resetlogs Thu Nov 20 14:43:08 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571 Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6) Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared Thu Nov 20 14:43:14 2014 Setting recovery target incarnation to 3 Thu Nov 20 14:43:14 2014 Assigning activation ID 4039504598 (0xf0c5f2d6) Thread 1 opened at log sequence 1 Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log Successful open of redo thread 1 Thu Nov 20 14:43:14 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 14:43:14 2014 SMON: enabling cache recovery Thu Nov 20 14:43:14 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc: ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 14:43:16 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 14:43:16 2014 Error 704 happened during db open, shutting down database USER: terminating instance due to error 704 Instance terminated by USER, pid = 1844 ORA-1092 signalled during: alter database open resetlogs...
尝试隐含屏蔽回滚段
_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………
错误依旧ORA-600 4000
Thu Nov 20 15:09:21 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 15:09:21 2014 SMON: enabling cache recovery Thu Nov 20 15:09:21 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc: ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 15:09:23 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 15:09:23 2014 Error 704 happened during db open, shutting down database USER: terminating instance due to error 704 Instance terminated by USER, pid = 624 ORA-1092 signalled during: alter database open
多次重启,resetlogs后,数据库出现ORA-600 2662错误
Successful open of redo thread 1 Thu Nov 20 17:13:24 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 17:13:24 2014 SMON: enabling cache recovery Thu Nov 20 17:13:24 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc: ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], [] Thu Nov 20 17:13:25 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc: ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], [] Thu Nov 20 17:13:25 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 7967 ORA-1092 signalled during: ALTER DATABASE OPEN... Thu Nov 20 17:18:23 2014 USER: terminating instance due to error 1092 Instance terminated by USER, pid = 7967
offline undo相关文件,尝试打开数据库
Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Thu Nov 20 17:52:31 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 17:52:31 2014 Media Recovery Start parallel recovery started with 15 processes ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 17:53:42 2014 ALTER DATABASE RECOVER CANCEL Thu Nov 20 17:53:44 2014 ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ... Thu Nov 20 17:56:34 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline Thu Nov 20 17:56:35 2014 Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline Thu Nov 20 17:57:01 2014 alter database datafile '/data2/oradata/undotbs02.dbf' offline Thu Nov 20 17:57:02 2014 Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline Thu Nov 20 17:57:26 2014 alter database datafile '/data2/oradata/undotbs03.dbf' offline Thu Nov 20 17:57:27 2014 Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline Thu Nov 20 17:57:43 2014 alter database open resetlogs Thu Nov 20 17:57:43 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. ORA-1245 signalled during: alter database open resetlogs... Thu Nov 20 17:58:58 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 17:58:58 2014 Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 17:59:15 2014 alter database open resetlogs Thu Nov 20 17:59:15 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. ORA-1245 signalled during: alter database open resetlogs... Thu Nov 20 17:59:35 2014 alter database datafile '/data2/oradata/undotbs02.dbf' offline drop Thu Nov 20 17:59:35 2014 Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop Thu Nov 20 17:59:50 2014 alter database datafile '/data2/oradata/undotbs03.dbf' offline drop Thu Nov 20 17:59:50 2014 Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop Thu Nov 20 18:00:07 2014 alter database open resetlogs Thu Nov 20 18:00:07 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628 Resetting resetlogs activation ID 4039492628 (0xf0c5c414) Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared Thu Nov 20 18:00:14 2014 Setting recovery target incarnation to 8 Thu Nov 20 18:00:14 2014 Assigning activation ID 4039504142 (0xf0c5f10e) Thread 1 opened at log sequence 1 Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log Successful open of redo thread 1 Thu Nov 20 18:00:15 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 18:00:15 2014 SMON: enabling cache recovery Thu Nov 20 18:00:15 2014 Successfully onlined Undo Tablespace 1. Dictionary check beginning File #2 is offline, but is part of an online tablespace. data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' File #100 is offline, but is part of an online tablespace. data file 100: '/data2/oradata/undotbs02.dbf' Thu Nov 20 18:00:28 2014 File #185 is offline, but is part of an online tablespace. data file 185: '/data2/oradata/undotbs03.dbf' Dictionary check complete Thu Nov 20 18:00:35 2014 SMON: enabling tx recovery Thu Nov 20 18:00:36 2014 Database Characterset is ZHS16CGB231280 Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-00376: file 185 cannot be read at this time ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf' Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc: ORA-00604: error occurred at recursive SQL level Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc: ORA-00604: error occurred at recursive SQL level Instance terminated by USER, pid = 28472 ORA-1092 signalled during: alter database open resetlogs...
不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致
Thu Nov 20 19:18:15 2014 SMON: enabling cache recovery Thu Nov 20 19:18:16 2014 Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during buffer read Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data Successfully onlined Undo Tablespace 1. Thu Nov 20 19:18:16 2014 SMON: enabling tx recovery Thu Nov 20 19:18:17 2014 Database Characterset is ZHS16CGB231280 Thu Nov 20 19:18:17 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-00376: file 185 cannot be read at this time ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf' Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Instance terminated by USER, pid = 13232 ORA-1092 signalled during: alter database open...
尝试不完全恢复,并resetlogs操作
ALTER DATABASE RECOVER database until cancel Thu Nov 20 19:33:41 2014 Media Recovery Start Datafile 2 is on orphaned branch File status = 4 Abs fuzzy SCN = 0 Hot backup fuzzy SCN = 0 Thu Nov 20 19:33:41 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc: ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0] ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' Thu Nov 20 19:33:42 2014 Media Recovery failed with error 600 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 19:34:06 2014 alter database open resetlogs Thu Nov 20 19:34:06 2014 ORA-1139 signalled during: alter database open resetlogs... Thu Nov 20 19:34:17 2014 alter database open Thu Nov 20 19:34:17 2014 ORA-1190 signalled during: alter database open... Thu Nov 20 19:35:57 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 19:35:57 2014 Media Recovery Start Datafile 2 is on orphaned branch File status = 4 Abs fuzzy SCN = 0 Hot backup fuzzy SCN = 0 Thu Nov 20 19:35:58 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc: ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0] ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' Thu Nov 20 19:35:59 2014 Media Recovery failed with error 600 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 19:37:19 2014 alter database open resetlogs Thu Nov 20 19:37:19 2014 ORA-1139 signalled during: alter database open resetlogs...
继续打开报 ORA-600 3600错误
Thu Nov 20 19:43:14 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 19:43:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc: ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], [] Thu Nov 20 19:43:15 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc: ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], [] Thu Nov 20 19:43:15 2014 DBW0: terminating instance due to error 471 Instance terminated by DBW0, pid = 20856 1 <strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong> 1 Fri Nov 21 12:35:12 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Fri Nov 21 12:35:12 2014 SMON: enabling cache recovery Fri Nov 21 12:35:13 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc: ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], [] Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc: ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], [] Fri Nov 21 12:35:14 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc: ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc: ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] Instance terminated by USER, pid = 15596 ORA-1092 signalled during: ALTER DATABASE OPEN..
继续尝试打开数据库出现ORA-600 ktsitbs_info1错误
SMON: enabling cache recovery Fri Nov 21 13:54:25 2014 Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during buffer read Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data Fri Nov 21 13:54:25 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc: ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], [] Fri Nov 21 13:54:27 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc: ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], [] Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 21111 ORA-1092 signalled during: alter database open...
以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程
dbv 检查system01.dbf文件,得到结果
HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014 Copyright (c) 1982, 2005, Oracle. All rights reserved. DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf Page 60 is marked corrupt Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during dbv: Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Corrupt block relative dba: 0x004001f2 (file 1, block 498) Bad check value found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x004001f2 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ca10601 check value in block header: 0xe458 computed block checksum: 0x9720 DBVERIFY - Verification complete Total Pages Examined : 786432 Total Pages Processed (Data) : 201131 Total Pages Failing (Data) : 2 Total Pages Processed (Index): 221394 Total Pages Failing (Index): 0 Total Pages Processed (Other): 60265 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 303641 Total Pages Marked Corrupt : 2 Total Pages Influx : 0 Highest block SCN : 1229823477 (7.1229823477)
这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$
对数据库启动过程做10046,得到trace文件
PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0' select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext, dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags, plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1 END OF STMT PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569 BINDS #1: kkscoacd Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0 kxsbbbfp=ffffffff7dbac9a8 bln=22 avl=02 flg=05 value=2 EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296 WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449 Hex dump of (file 1, block 60) Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during buffer read Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438 STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)' STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)' *** 2014-11-22 14:44:43.235 ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], [] Current SQL statement for this session: select max(maxconcurrency) from sys.wrh$_undostat where instance_number = :1 and dbid = :2 and snap_id in (select snap_id from dba_hist_snapshot where end_interval_time > (select max(end_interval_time)-7 from dba_hist_snapshot))
这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.
bbed尝试修复file 1 block 60
HNDX-DB% bbed password=blockedit mode=edit BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014 Copyright (c) 1982, 2005, Oracle. All rights reserved. ************* !!! For Oracle Internal Use only !!! *************** BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf' FILENAME /opt/oracle/oradata/xifenfei/system01.dbf BBED> set block 8192 BLOCK# 8192 BBED> set block 60 BLOCK# 60 BBED> set count 64 COUNT 64 BBED> map File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 60 Dba:0x00000000 ------------------------------------------------------------ BBED-00400: invalid blocktype (128) BBED> set block 61 BLOCK# 61 BBED> map File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 61 Dba:0x00000000 ------------------------------------------------------------ KTB Data Block (Table/Cluster) struct kcbh, 20 bytes @0 struct ktbbh, 72 bytes @20 struct kdbh, 14 bytes @92 struct kdbt[3], 12 bytes @106 sb2 kdbr[2] @118 ub1 freespace[7959] @122 ub1 rowdata[107] @8081 ub4 tailchk @8188 BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x0040003d ub4 bas_kcbh @8 0x0000235b ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x01 ub1 flg_kcbh @15 0x04 (KCBHFCKV) ub2 chkval_kcbh @16 0x7a85 ub2 spare3_kcbh @18 0x0000 BBED> set block 60 BLOCK# 60 BBED> d File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 60 Offsets: 0 to 63 Dba:0x00000000 ------------------------------------------------------------------------ 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000 <32 bytes per line> BBED> d block 61 File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 61 Offsets: 0 to 63 Dba:0x00000000 ------------------------------------------------------------------------ 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000 <32 bytes per line> BBED> set block 60 BLOCK# 60 BBED> m /x 06a2 File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 60 Offsets: 0 to 63 Dba:0x00000000 ------------------------------------------------------------------------ 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000 <32 bytes per line> BBED> map File: /opt/oracle/oradata/xifenfei/system01.dbf (0) Block: 60 Dba:0x00000000 ------------------------------------------------------------ KTB Data Block (Table/Cluster) struct kcbh, 20 bytes @0 struct ktbbh, 72 bytes @20 struct kdbh, 14 bytes @92 struct kdbt[3], 12 bytes @106 sb2 kdbr[2] @118 ub1 freespace[7598] @122 ub1 rowdata[468] @7720 ub4 tailchk @8188 BBED> sum apply Check value for File 0, Block 60: current = 0xe908, required = 0xe908 BBED> verify DBVERIFY - Verification starting FILE = /opt/oracle/oradata/xifenfei/system01.dbf BLOCK = 60 DBVERIFY - Verification complete Total Blocks Examined : 1 Total Blocks Processed (Data) : 1 Total Blocks Failing (Data) : 0 Total Blocks Processed (Index): 0 Total Blocks Failing (Index): 0 Total Blocks Empty : 0 Total Blocks Marked Corrupt : 0 Total Blocks Influx : 0 BBED>
尝试启动数据库
Sat Nov 22 15:51:33 2014 alter database open Sat Nov 22 15:51:34 2014 Thread 1 opened at log sequence 7 Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log Successful open of redo thread 1 Sat Nov 22 15:51:34 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Sat Nov 22 15:51:34 2014 SMON: enabling cache recovery SMON: enabling tx recovery Sat Nov 22 15:51:34 2014 Database Characterset is ZHS16CGB231280 Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc Corrupt block relative dba: 0x004001f2 (file 1, block 498) Bad check value found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x004001f2 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ca10601 check value in block header: 0xe458 computed block checksum: 0x9720 Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data Sat Nov 22 15:51:35 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc: ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], [] replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC QMNC started with pid=18, OS id=3000 Sat Nov 22 15:51:36 2014 Completed: alter database open Sat Nov 22 15:51:36 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc: ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], [] Sat Nov 22 15:51:37 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc: ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], [] Sat Nov 22 15:51:37 2014 Non-fatal internal error happenned while SMON was doing non-existent object cleanup. SMON encountered 1 out of maximum 100 non-fatal internal errors. Sat Nov 22 15:51:38 2014 ORA-600 encountered when generating server alert SMG-3000 Sat Nov 22 15:51:38 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc: ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []
只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段
SQL> select name,ts#,status$ from undo$; NAME TS# STATUS$ ------------------------------ ---------- ---------- SYSTEM 0 2 _SYSSMU1$ 1 2 _SYSSMU2$ 1 2 _SYSSMU3$ 1 2 ………… _SYSSMU168$ 1 2 _SYSSMU169$ 1 2
这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)
SQL> drop tablespace undotbs1 including contents; drop tablespace undotbs1 including contents * ERROR at line 1: ORA-01561: failed to remove all objects in the tablespace specified
通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed
dbv检测备份文件
DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak Page 60 is marked corrupt Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during dbv: Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Block Checking: DBA = 4194802, Block Type = KTB-managed data block data header at 0x1002ef05c kdbchk: row locked by non-existent transaction table=0 slot=4 lockid=1 ktbbhitc=2 Page 498 failed with check code 6101 DBVERIFY - Verification complete Total Pages Examined : 786432 Total Pages Processed (Data) : 201131 Total Pages Failing (Data) : 1 Total Pages Processed (Index): 221394 Total Pages Failing (Index): 0 Total Pages Processed (Other): 60265 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 303641 Total Pages Marked Corrupt : 1 Total Pages Influx : 0 Highest block SCN : 1229823477 (7.1229823477)
好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)
再次启动数据库
dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2 dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014 Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Sat Nov 22 17:53:38 2014 alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak' Sat Nov 22 17:53:39 2014 Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak' Sat Nov 22 17:55:43 2014 alter database open Sat Nov 22 17:55:48 2014 LGWR: STARTING ARCH PROCESSES ARC0 started with pid=18, OS id=15858 Sat Nov 22 17:56:10 2014 ARC0: Archival started ARC1: Archival started LGWR: STARTING ARCH PROCESSES COMPLETE ARC1 started with pid=17, OS id=15879 Sat Nov 22 17:56:19 2014 Thread 1 opened at log sequence 7 Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log Successful open of redo thread 1 Sat Nov 22 17:56:19 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Sat Nov 22 17:56:19 2014 SMON: enabling cache recovery SMON: enabling tx recovery Sat Nov 22 17:56:20 2014 ARC1: STARTING ARCH PROCESSES Sat Nov 22 17:56:20 2014 ARC0: Becoming the 'no FAL' ARCH ARC0: Becoming the 'no SRL' ARCH Sat Nov 22 17:56:22 2014 Database Characterset is ZHS16CGB231280 replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC Sat Nov 22 17:56:33 2014 ARC2: Archival started ARC1: STARTING ARCH PROCESSES COMPLETE ARC1: Becoming the heartbeat ARCH ARC2 started with pid=23, OS id=15928 QMNC started with pid=25, OS id=15996 Sat Nov 22 17:57:11 2014 Completed: alter database open Sat Nov 22 17:57:18 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc: ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], [] Sat Nov 22 17:57:26 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc: ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], [] Sat Nov 22 17:58:17 2014 Starting background process EMN0 Sat Nov 22 18:00:03 2014 Shutting down instance: further logons disabled EMN0 started with pid=71, OS id=16421 Sat Nov 22 18:00:12 2014 SMON: Restarting fast_start parallel rollback Sat Nov 22 18:00:23 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc: ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], [] Sat Nov 22 18:00:24 2014 Stopping background process CJQ0 Sat Nov 22 18:00:24 2014 Stopping background process QMNC Sat Nov 22 18:00:27 2014 Doing block recovery for file 2 block 41 Block recovery from logseq 7, block 180883 to scn 214748389244 Sat Nov 22 18:00:27 2014 Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0 Mem# 0 errs 0: /data2/oradata/redo0802.log Block recovery stopped at EOT rba 7.180988.16 Block recovery completed at rba 7.180988.16, scn 50.24441 Sat Nov 22 18:00:32 2014 Stopping background process MMNL Sat Nov 22 18:00:38 2014 Stopping background process MMON Sat Nov 22 18:00:41 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc: ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], [] Sat Nov 22 18:00:42 2014 ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4). Sat Nov 22 18:00:42 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc: ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库
表示没看懂呵呵 好多知识不懂啊!有材料可以推荐学习下么 ,就这次故障处理的知识点;