Tag archive: ORA-600 4198
ORA-07445 kgegpa encountered during database recovery
A customer asked for help with a recovery: the database failed to start with ORA-600 2662.
Fri Apr 24 19:52:58 2020 alter database open resetlogs RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 15491509441794 Resetting resetlogs activation ID 1460987657 (0x5714e709) Fri Apr 24 19:52:59 2020 Setting recovery target incarnation to 3 Fri Apr 24 19:52:59 2020 Assigning activation ID 1566342598 (0x5d5c7dc6) Thread 1 opened at log sequence 1 Current log# 1 seq# 1 mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Fri Apr 24 19:52:59 2020 SMON: enabling cache recovery Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc (incident=8561): ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], [] Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_8561\orcl_ora_3860_i8561.trc Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc: ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], [] Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc: ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], [] Error 600 happened during db open, shutting down database USER (ospid: 3860): terminating the instance due to error 600 Instance terminated by USER, pid = 3860 ORA-1092 signalled during: alter database open resetlogs...
This error is fairly common. Adjusting the database SCN got past it, and the next startup attempt failed with the following error:
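As a rough aid for reading the ORA-600 [2662] line above, the sketch below decodes its arguments. It assumes the commonly described layout (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, DBA of the block carrying the dependent SCN) and the usual smallfile DBA split of a 10-bit relative file number and a 22-bit block number. The function and class names are illustrative only and are not part of any Oracle tool.

```python
from typing import NamedTuple, Sequence, Tuple


class Ora2662(NamedTuple):
    current_scn: int    # SCN the database is currently at
    dependent_scn: int  # higher SCN found in a block
    gap: int            # how far the dependent SCN is ahead
    file_no: int        # relative file number decoded from the DBA
    block_no: int       # block number decoded from the DBA


def scn_from_wrap_base(wrap: int, base: int) -> int:
    """Combine a wrap/base pair into one SCN (base is the low 32 bits)."""
    return (wrap << 32) | base


def decode_dba(dba: int) -> Tuple[int, int]:
    """Split a smallfile DBA into (relative file#, block#). Assumed layout."""
    return dba >> 22, dba & 0x3FFFFF


def decode_ora600_2662(args: Sequence[int]) -> Ora2662:
    """Decode the five numeric arguments that follow [2662]."""
    cur_wrap, cur_base, dep_wrap, dep_base, dba = args
    current = scn_from_wrap_base(cur_wrap, cur_base)
    dependent = scn_from_wrap_base(dep_wrap, dep_base)
    file_no, block_no = decode_dba(dba)
    return Ora2662(current, dependent, dependent - current, file_no, block_no)


if __name__ == "__main__":
    # Arguments copied from the alert log above.
    print(decode_ora600_2662([3606, 3857372426, 3606, 3857377059, 12583040]))
```

Under those assumptions, the values from the log give a current SCN of 15491509441802 and a dependent SCN of 15491509446435, so the block is 4633 ahead, and the DBA 12583040 maps to file 3, block 128. The gap gives a sense of how far the SCN would have to be advanced, which fits the SCN adjustment described above.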
SQL> startup mount pfile='d:/pfile.txt';
ORACLE instance started.
Total System Global Area 1.3696E+10 bytes
Fixed Size 2188768 bytes
Variable Size 6878661152 bytes
Database Buffers 6777995264 bytes
Redo Buffers 37044224 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 5884
Session ID: 66 Serial number: 3
Fri Apr 24 20:57:49 2020 SMON: enabling cache recovery Successfully onlined Undo Tablespace 2. Dictionary check beginning Dictionary check complete Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed SMON: enabling tx recovery Database Characterset is ZHS16GBK No Resource Manager plan active Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x898ADE43] [PC:0x9287D88, kgegpa()+38] Dump file d:\app\administrator\diag\rdbms\orcl\orcl\trace\alert_orcl.log Fri Apr 24 20:57:49 2020 ORACLE V11.2.0.1.0 - 64bit Production vsnsta=0 vsnsql=16 vsnxtr=3 Windows NT Version V6.1 CPU : 16 - type 8664, 16 Physical Cores Process Affinity : 0x0x0000000000000000 Memory (Avail/Total): Ph:21429M/32767M, Ph+PgF:54255M/65533M Fri Apr 24 20:57:49 2020 Errors in file ORA-07445: caught exception [ACCESS_VIOLATION] at [kgegpa()+38] [0x0000000009287D88] Fri Apr 24 20:57:52 2020 PMON (ospid: 2496): terminating the instance due to error 397 Instance terminated by PMON, pid = 2496
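The main error here is ORA-07445 kgegpa. Based on past recovery experience, this is very likely related to undo, so the undo was dealt with and the database was started again: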
SQL> startup mount pfile='d:/pfile.txt' ;
ORACLE instance started.
Total System Global Area 1.3696E+10 bytes
Fixed Size 2188768 bytes
Variable Size 6878661152 bytes
Database Buffers 6777995264 bytes
Redo Buffers 37044224 bytes
Database mounted.
SQL> recover database;
Media recovery complete.
SQL> alter database open;
Database altered.
SMON: enabling tx recovery Database Characterset is ZHS16GBK SMON: Restarting fast_start parallel rollback Fri Apr 24 21:01:28 2020 Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p000_4360.trc (incident=13377): ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], [] Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13377\orcl_p000_4360_i13377.trc Stopping background process MMNL Doing block recovery for file 3 block 296 Resuming block recovery (PMON) for file 3 block 296 Block recovery from logseq 3, block 25 to scn 15491947056761 Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0 Mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG Block recovery completed at rba 3.25.16, scn 3607.20090 Doing block recovery for file 6 block 165592 Resuming block recovery (PMON) for file 6 block 165592 Block recovery from logseq 3, block 33 to scn 15491947056769 Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0 Mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG Block recovery completed at rba 3.58.16, scn 3607.20098 Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc (incident=13321): ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], [] Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13321\orcl_smon_4912_i13321.trc SMON: Parallel transaction recovery slave got internal error SMON: Downgrading transaction recovery to serial Stopping background process MMON Fri Apr 24 21:01:29 2020 Trace dumping is performing id=[cdmp_20200424210129] Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc (incident=13322): ORA-00600: internal error code, arguments: [4137], [12.30.1712324], [0], [0], [], [], [], [], [], [], [], [] Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13322\orcl_smon_4912_i13322.trc ORACLE Instance 
orcl (pid = 14) - Error 600 encountered while recovering transaction (12, 30). Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc: ORA-00600: internal error code, arguments: [4137], [12.30.1712324], [0], [0], [], [], [], [], [], [], [], [] Completed: alter database open upgrade Fri Apr 24 21:01:30 2020 MMON started with pid=16, OS id=4980 Fri Apr 24 21:01:31 2020 Sweep [inc][13322]: completed Corrupt block relative dba: 0x00c395ee (file 3, block 234990) Fractured block found during buffer read Data in bad block: type: 2 format: 2 rdba: 0x00c395ee last change scn: 0x0e16.e5ead38b seq: 0x2b flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0xdb720232 check value in block header: 0xebe2 computed block checksum: 0xb60b Reading datafile'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\UNDOTBS01.DBF'for corruption at rdba: 0x00c395ee (file 3,block 234990) Reread (file 3, block 234990) found same corrupt data Corrupt Block Found TSN = 2, TSNAME = UNDOTBS1 RFN = 3, BLK = 234990, RDBA = 12817902 OBJN = 0, OBJD = -1, OBJECT = , SUBOBJECT = SEGMENT OWNER = , SEGMENT TYPE = Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_m001_4852.trc (incident=13641): ORA-01578: ORACLE data block corrupted (file # 3, block # 234990) ORA-01110: data file 3: 'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\UNDOTBS01.DBF' Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13641\orcl_m001_4852_i13641.trc
SQL> create undo tablespace undotbs2 datafile
  2  'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\undo_xff02.dbf' size 128M autoextend on;
Tablespace created.
SQL> drop tablespace undotbs1 including contents and datafiles;
Tablespace dropped.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> create spfile from pfile='d:/pfile.txt';
File created.
SQL> startup mount
ORACLE instance started.
Total System Global Area 1.3696E+10 bytes
Fixed Size 2188768 bytes
Variable Size 6878661152 bytes
Database Buffers 6777995264 bytes
Redo Buffers 37044224 bytes
Database mounted.
SQL> alter database open;
Database altered.
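The ORA-600 4198 and ORA-600 4137 errors that kept appearing after the database opened, together with the corrupt undo block, all confirm that the problem was caused by damaged undo. After a new undo tablespace was created, the database opened normally, and the customer was asked to export the data and import it into a new database.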
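A record of a 200 TB database recovery
A customer requested recovery of a six-node 11.2.0.3 RAC database running in NOARCHIVELOG mode and holding nearly 200 TB of data.
A storage power failure brought down all six nodes. After the hardware was restored, the database could not start normally and reported the following errors: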
SQL> recover database; ORA-00279: change 318472018583 generated at 05/04/2019 17:58:05 needed for thread 4 ORA-00289: suggestion : /u01/app/oracle/product/11.2.0/db_1/dbs/arch4_322810_870181839.dbf ORA-00280: change 318472018583 for thread 4 is in sequence #322810 Wed Aug 28 11:19:55 2019 ALTER DATABASE RECOVER DATABAE Media Recovery Start Serial Media Recovery started Recovery of Online Redo Log: Thread 1 Group 14 Seq 552 Reading mem 0 Mem# 0: +REDO/xff/log2.ora Recovery of Online Redo Log: Thread 2 Group 15 Seq 126 Reading mem 0 Mem# 0: +REDO/xff/log3.ora Recovery of Online Redo Log: Thread 3 Group 18 Seq 122 Reading mem 0 Mem# 0: +REDO/xff/log6.ora ORA-279 signalled during: ALTER DATABASE RECOVER database ... Wed Aug 28 11:21:31 2019 ALTER DATABASE RECOVER CANCEL Media Recovery Canceled Completed: ALTER DATABASE RECOVER CANCEL
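Recovery needs thread 4 sequence #322810, so the redo information was queried.
The redo had already been overwritten, so the database could not be opened through normal recovery. After bypassing the consistency checks and forcing the database open, the following was reported: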
Wed Aug 28 12:40:15 2019 SMON: enabling tx recovery Database Characterset is ZHS16GBK Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_smon_51338.trc (incident=244209): ORA-00600: internal error code, arguments: [4137], [44.47.613406], [0], [0], [], [], [], [], [], [], [], [] Incident details in: /u01/app/oracle/diag/rdbms/xff/xff1/incident/incdir_244209/xff1_smon_51338_i244209.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. No Resource Manager plan active replication_dependency_tracking turned off (no async multimaster replication found) Wed Aug 28 12:40:16 2019 ORACLE Instance xff1 (pid = 26) - Error 600 encountered while recovering transaction (44, 47). Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_smon_51338.trc: ORA-00600: internal error code, arguments: [4137], [44.47.613406], [0], [0], [], [], [], [], [], [], [], [] Wed Aug 28 12:40:20 2019 Exception[type: SIGSEGV,Address not mapped to object][ADDR:0x5122000000C8][PC:0xE1B4D3,ktugru()+87][flags:0x0,count:1] Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_p086_54066.trc (incident=245017): ORA-07445:exception encountered:core dump [ktugru()+87][SIGSEGV][ADDR:0x5122000000C8][Address not mapped to object] Incident details in: /u01/app/oracle/diag/rdbms/xff/xff1/incident/incdir_245017/xff1_p086_54066_i245017.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Wed Aug 28 12:40:20 2019 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_p000_53873.trc (incident=244305): ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], [] Incident details in: /u01/app/oracle/diag/rdbms/xff/xff1/incident/incdir_244305/xff1_p000_53873_i244305.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details.
This points to an undo problem. After the rollback segments were masked, the database opened normally with no errors reported:
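To decide which rollback segments to look at, the ORA-600 [4137] lines in the alert log can be scanned for their transaction ids. The sketch below assumes the first argument after [4137] has the form usn.slot.sqn, which matches the "recovering transaction (44, 47)" message in the log above. The function name is hypothetical, and mapping an undo segment number to a segment name would still need a dictionary lookup (for example in `undo$`), which is not attempted here.

```python
import re
from typing import List, Tuple

# Matches e.g. "ORA-00600: internal error code, arguments: [4137], [44.47.613406], ..."
# The lazy ".*?" also tolerates localized message text before the arguments.
XID_RE = re.compile(r"ORA-00600: .*?\[4137\], \[(\d+)\.(\d+)\.(\d+)\]")


def xids_from_alert(text: str) -> List[Tuple[int, int, int]]:
    """Return the unique (usn, slot, sqn) triples seen in ORA-600 [4137] lines."""
    found = {tuple(int(g) for g in m.groups()) for m in XID_RE.finditer(text)}
    return sorted(found)


if __name__ == "__main__":
    sample = (
        "ORA-00600: internal error code, arguments: [4137], [44.47.613406], "
        "[0], [0], [], [], [], [], [], [], [], []"
    )
    for usn, slot, sqn in xids_from_alert(sample):
        print(f"undo segment {usn}, slot {slot}, sequence {sqn}")
```

Collecting the unique undo segment numbers this way gives a short list of candidates rather than masking every segment blindly.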
Wed Aug 28 12:57:15 2019 SMON: enabling cache recovery Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads [57676] Successfully onlined Undo Tablespace 22. Undo initialization finished serial:0 start:2386111306 end:2386112316 diff:1010 (10 seconds) Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed SMON: enabling tx recovery Database Characterset is ZHS16GBK Wed Aug 28 12:57:17 2019 minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:57624 status:0x7 minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000 No Resource Manager plan active Starting background process GTX0 Wed Aug 28 12:57:18 2019 GTX0 started with pid=45, OS id=57777 Starting background process RCBG Wed Aug 28 12:57:18 2019 RCBG started with pid=46, OS id=57779 replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC Wed Aug 28 12:57:19 2019 QMNC started with pid=47, OS id=57788 Completed: ALTER DATABASE OPEN
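The follow-up work involved creating a new undo tablespace, dropping the old one, and handling a few similar issues, after which the database was essentially back to normal.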
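SYSTEM tablespace corrupt block recovery: recovering a corrupt block in the C_TS# object (file 1 block 60)
A friend phoned to say that a customer's database had failed and that another company had spent a day on it without success, and asked me to help. On taking over, I found the database badly damaged. From the alert log I worked out roughly what had caused the failure and what the previous company had done. After I took over, the database was recovered by repairing the block with bbed. A large number of ORA-600 errors appeared during this recovery, mainly ORA-00600 4000, ORA-00600 2662, ORA-00600 2663, ORA-00600 krhpfh_03-1209, ORA-00600 3600, ORA-00600 ktsitbs_info1, ORA-00600 4137, ORA-00600 4511, ORA-00600 4198 and ORA-00600 6807.
Cause of the failure: a lost redo file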
Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc: ORA-00313: open failed for members of log group 7 of thread 1 ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log' ORA-27037: unable to obtain file status SVR4 Error: 2: No such file or directory Additional information: 3 Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc: ORA-00313: open failed for members of log group 7 of thread 1 ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log' ORA-27037: unable to obtain file status SVR4 Error: 2: No such file or directory Additional information: 3 Thu Nov 20 11:28:39 2014 LGWR: terminating instance due to error 313 Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc: ORA-00313: open failed for members of log group of thread Thu Nov 20 11:28:39 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc: ORA-00313: open failed for members of log group of thread Instance terminated by LGWR, pid = 1404
An attempt to recover by clearing the redo log
Thu Nov 20 13:04:16 2014 alter database clear logfile group 9 Thu Nov 20 13:04:16 2014 ORA-1624 signalled during: alter database clear logfile group 9... Thu Nov 20 13:04:45 2014 alter database clear logfile group 9 Thu Nov 20 13:04:46 2014 ORA-1624 signalled during: alter database clear logfile group 9... Thu Nov 20 13:04:59 2014 alter database clear unarchived logfile group 9 Thu Nov 20 13:04:59 2014 ORA-1624 signalled during: alter database clear unarchived logfile group 9... Thu Nov 20 13:05:00 2014 alter database clear unarchived logfile group 9 Thu Nov 20 13:05:00 2014 ORA-1624 signalled during: alter database clear unarchived logfile group 9...
An attempt to open the database with incomplete recovery and resetlogs
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup controlfile ... Thu Nov 20 13:49:01 2014 ALTER DATABASE RECOVER CONTINUE DEFAULT Thu Nov 20 13:49:02 2014 Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ... Thu Nov 20 13:49:02 2014 ALTER DATABASE RECOVER CONTINUE DEFAULT Thu Nov 20 13:49:02 2014 Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ... Thu Nov 20 13:49:02 2014 ALTER DATABASE RECOVER CANCEL Thu Nov 20 13:49:03 2014 Media Recovery Canceled Completed: ALTER DATABASE RECOVER CANCEL Thu Nov 20 13:49:33 2014 alter database open resetlogs Thu Nov 20 13:49:34 2014 ORA-1113 signalled during: alter database open resetlogs...
With the hidden parameter
_allow_resetlogs_corruption= TRUE
set, incomplete recovery was performed; the attempt to open the database then failed with ORA-600 4000
Thu Nov 20 14:35:02 2014 ALTER DATABASE MOUNT Thu Nov 20 14:35:07 2014 Setting recovery target incarnation to 2 Thu Nov 20 14:35:07 2014 Successful mount of redo thread 1, with mount id 4039504598 Thu Nov 20 14:35:07 2014 Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Thu Nov 20 14:40:33 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 14:40:33 2014 Media Recovery Start Thu Nov 20 14:40:33 2014 Media Recovery failed with error 1610 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 14:41:23 2014 ALTER DATABASE RECOVER database using backup controlfile until cancel Thu Nov 20 14:43:08 2014 alter database open resetlogs Thu Nov 20 14:43:08 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571 Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6) Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared Thu Nov 20 14:43:14 2014 Setting recovery target incarnation to 3 Thu Nov 20 14:43:14 2014 Assigning activation ID 4039504598 (0xf0c5f2d6) Thread 1 opened at log sequence 1 Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log Successful open of redo thread 1 Thu Nov 20 14:43:14 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 14:43:14 2014 SMON: enabling cache recovery Thu Nov 20 14:43:14 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc: ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 14:43:16 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 14:43:16 2014 Error 704 
happened during db open, shutting down database USER: terminating instance due to error 704 Instance terminated by USER, pid = 1844 ORA-1092 signalled during: alter database open resetlogs...
An attempt to mask the rollback segments with a hidden parameter
_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………
The error is still ORA-600 4000
Thu Nov 20 15:09:21 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 15:09:21 2014 SMON: enabling cache recovery Thu Nov 20 15:09:21 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc: ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 15:09:23 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], [] Thu Nov 20 15:09:23 2014 Error 704 happened during db open, shutting down database USER: terminating instance due to error 704 Instance terminated by USER, pid = 624 ORA-1092 signalled during: alter database open
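The `_corrupted_rollback_segments` line shown earlier lists system-managed undo segments by name. As a small illustration, the sketch below builds such a line from a list of undo segment numbers. It assumes the 10g-style naming `_SYSSMU<n>$` seen in that line; the segment numbers in the example are hypothetical, since the original list is elided. The second argument of ORA-600 [4000] (17 here) is usually described as an undo segment number, which is why 17 appears in the example, but that reading is an assumption as well.

```python
from typing import Iterable


def undo_segment_name_10g(usn: int) -> str:
    """Name of a system-managed undo segment in the 10g style (assumed)."""
    return f"_SYSSMU{usn}$"


def corrupted_rollback_segments_line(usns: Iterable[int]) -> str:
    """Build a pfile line in the same shape as the one shown in the post."""
    names = ", ".join(undo_segment_name_10g(n) for n in usns)
    return f"_corrupted_rollback_segments= {names}"


if __name__ == "__main__":
    # Hypothetical segment numbers, for illustration only.
    print(corrupted_rollback_segments_line([1, 2, 17]))
```

As the log above shows, masking the segments did not change the outcome in this case, so the helper only illustrates how the parameter value is assembled.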
After several restarts and resetlogs attempts, the database reported ORA-600 2662
Successful open of redo thread 1 Thu Nov 20 17:13:24 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 17:13:24 2014 SMON: enabling cache recovery Thu Nov 20 17:13:24 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc: ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], [] Thu Nov 20 17:13:25 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc: ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], [] Thu Nov 20 17:13:25 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 7967 ORA-1092 signalled during: ALTER DATABASE OPEN... Thu Nov 20 17:18:23 2014 USER: terminating instance due to error 1092 Instance terminated by USER, pid = 7967
The undo datafiles were taken offline and another attempt was made to open the database
Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Thu Nov 20 17:52:31 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 17:52:31 2014 Media Recovery Start parallel recovery started with 15 processes ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 17:53:42 2014 ALTER DATABASE RECOVER CANCEL Thu Nov 20 17:53:44 2014 ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ... Thu Nov 20 17:56:34 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline Thu Nov 20 17:56:35 2014 Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline Thu Nov 20 17:57:01 2014 alter database datafile '/data2/oradata/undotbs02.dbf' offline Thu Nov 20 17:57:02 2014 Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline Thu Nov 20 17:57:26 2014 alter database datafile '/data2/oradata/undotbs03.dbf' offline Thu Nov 20 17:57:27 2014 Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline Thu Nov 20 17:57:43 2014 alter database open resetlogs Thu Nov 20 17:57:43 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. ORA-1245 signalled during: alter database open resetlogs... Thu Nov 20 17:58:58 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 17:58:58 2014 Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 17:59:15 2014 alter database open resetlogs Thu Nov 20 17:59:15 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. ORA-1245 signalled during: alter database open resetlogs... 
Thu Nov 20 17:59:35 2014 alter database datafile '/data2/oradata/undotbs02.dbf' offline drop Thu Nov 20 17:59:35 2014 Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop Thu Nov 20 17:59:50 2014 alter database datafile '/data2/oradata/undotbs03.dbf' offline drop Thu Nov 20 17:59:50 2014 Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop Thu Nov 20 18:00:07 2014 alter database open resetlogs Thu Nov 20 18:00:07 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628 Resetting resetlogs activation ID 4039492628 (0xf0c5c414) Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared Thu Nov 20 18:00:14 2014 Setting recovery target incarnation to 8 Thu Nov 20 18:00:14 2014 Assigning activation ID 4039504142 (0xf0c5f10e) Thread 1 opened at log sequence 1 Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log Successful open of redo thread 1 Thu Nov 20 18:00:15 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Thu Nov 20 18:00:15 2014 SMON: enabling cache recovery Thu Nov 20 18:00:15 2014 Successfully onlined Undo Tablespace 1. Dictionary check beginning File #2 is offline, but is part of an online tablespace. data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' File #100 is offline, but is part of an online tablespace. data file 100: '/data2/oradata/undotbs02.dbf' Thu Nov 20 18:00:28 2014 File #185 is offline, but is part of an online tablespace. 
data file 185: '/data2/oradata/undotbs03.dbf' Dictionary check complete Thu Nov 20 18:00:35 2014 SMON: enabling tx recovery Thu Nov 20 18:00:36 2014 Database Characterset is ZHS16CGB231280 Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-00376: file 185 cannot be read at this time ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf' Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc: ORA-00604: error occurred at recursive SQL level Thu Nov 20 18:00:37 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc: ORA-00604: error occurred at recursive SQL level Instance terminated by USER, pid = 28472 ORA-1092 signalled during: alter database open resetlogs...
It is not known what operation produced the corrupt block at file 1 block 60; most likely it was an incorrect bbed modification
Thu Nov 20 19:18:15 2014 SMON: enabling cache recovery Thu Nov 20 19:18:16 2014 Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during buffer read Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data Successfully onlined Undo Tablespace 1. Thu Nov 20 19:18:16 2014 SMON: enabling tx recovery Thu Nov 20 19:18:17 2014 Database Characterset is ZHS16CGB231280 Thu Nov 20 19:18:17 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-00376: file 185 cannot be read at this time ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf' Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Instance terminated by USER, pid = 13232 ORA-1092 signalled during: alter database open...
An attempt at incomplete recovery followed by resetlogs
ALTER DATABASE RECOVER database until cancel Thu Nov 20 19:33:41 2014 Media Recovery Start Datafile 2 is on orphaned branch File status = 4 Abs fuzzy SCN = 0 Hot backup fuzzy SCN = 0 Thu Nov 20 19:33:41 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc: ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0] ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' Thu Nov 20 19:33:42 2014 Media Recovery failed with error 600 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 19:34:06 2014 alter database open resetlogs Thu Nov 20 19:34:06 2014 ORA-1139 signalled during: alter database open resetlogs... Thu Nov 20 19:34:17 2014 alter database open Thu Nov 20 19:34:17 2014 ORA-1190 signalled during: alter database open... Thu Nov 20 19:35:57 2014 ALTER DATABASE RECOVER database until cancel Thu Nov 20 19:35:57 2014 Media Recovery Start Datafile 2 is on orphaned branch File status = 4 Abs fuzzy SCN = 0 Hot backup fuzzy SCN = 0 Thu Nov 20 19:35:58 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc: ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0] ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf' Thu Nov 20 19:35:59 2014 Media Recovery failed with error 600 ORA-283 signalled during: ALTER DATABASE RECOVER database until cancel ... Thu Nov 20 19:37:19 2014 alter database open resetlogs Thu Nov 20 19:37:19 2014 ORA-1139 signalled during: alter database open resetlogs...
Further attempts to open the database reported ORA-600 3600
Thu Nov 20 19:43:14 2014 alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop Thu Nov 20 19:43:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc: ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], [] Thu Nov 20 19:43:15 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc: ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], [] Thu Nov 20 19:43:15 2014 DBW0: terminating instance due to error 471 Instance terminated by DBW0, pid = 20856
After several more restarts and resetlogs attempts, ORA-600 2663 also appeared
Fri Nov 21 12:35:12 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Fri Nov 21 12:35:12 2014 SMON: enabling cache recovery Fri Nov 21 12:35:13 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc: ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], [] Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc: ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], [] Fri Nov 21 12:35:14 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc: ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] Fri Nov 21 12:35:14 2014 Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc: ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [] Instance terminated by USER, pid = 15596 ORA-1092 signalled during: ALTER DATABASE OPEN..
Continuing to try to open the database produced ORA-600 ktsitbs_info1
SMON: enabling cache recovery Fri Nov 21 13:54:25 2014 Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during buffer read Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data Fri Nov 21 13:54:25 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc: ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], [] Fri Nov 21 13:54:27 2014 Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc: ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], [] Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 21111 ORA-1092 signalled during: alter database open...
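The above covers the cause of the customer's failure and the rough course of the earlier handling; what follows is what I did after taking over
Checking system01.dbf with dbv gave the following result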
HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014 Copyright (c) 1982, 2005, Oracle. All rights reserved. DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf Page 60 is marked corrupt Corrupt block relative dba: 0x0040003c (file 1, block 60) Bad header found during dbv: Data in bad block: type: 128 format: 0 rdba: 0x0040003c last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x4bc90602 check value in block header: 0x6faa computed block checksum: 0x0 Corrupt block relative dba: 0x004001f2 (file 1, block 498) Bad check value found during buffer read Data in bad block: type: 6 format: 2 rdba: 0x004001f2 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06 spare1: 0x0 spare2: 0x0 spare3: 0x0 consistency value in tail: 0x9ca10601 check value in block header: 0xe458 computed block checksum: 0x9720 DBVERIFY - Verification complete Total Pages Examined : 786432 Total Pages Processed (Data) : 201131 Total Pages Failing (Data) : 2 Total Pages Processed (Index): 221394 Total Pages Failing (Index): 0 Total Pages Processed (Other): 60265 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 303641 Total Pages Marked Corrupt : 2 Total Pages Influx : 0 Highest block SCN : 1229823477 (7.1229823477)
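This shows that the database has two corrupt blocks. From experience with bootstrap$, block 60 most likely belongs to C_TS#, and my first impression is that its block type is abnormal; block 498 is probably seq$.
Tracing the database startup with event 10046 produces the following trace file: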
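The dbv output prints each corrupt block both as a relative DBA (`0x0040003c`, `0x004001f2`) and as a (file, block) pair. A minimal sketch of that mapping follows, assuming the usual smallfile layout in which the upper 10 bits of the rdba hold the relative file number and the lower 22 bits hold the block number. The helper name `decode_rdba` is hypothetical and only for illustration.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit relative DBA into (relative file number, block number).

    Assumes the common smallfile layout: 10 bits of file number followed
    by 22 bits of block number.
    """
    file_no = rdba >> 22           # upper 10 bits
    block_no = rdba & 0x3FFFFF     # lower 22 bits
    return file_no, block_no


# The two rdba values reported by dbv in this case.
for rdba in (0x0040003C, 0x004001F2):
    file_no, block_no = decode_rdba(rdba)
    print(f"rdba 0x{rdba:08x} -> file {file_no}, block {block_no}")
```

Under that assumption the two values decode to the same (file 1, block 60) and (file 1, block 498) pairs that dbv prints next to them.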
PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext, dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags, plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8 bln=22 avl=02 flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat where instance_number = :1 and dbid = :2 and snap_id in (select snap_id from dba_hist_snapshot where end_interval_time > (select max(end_interval_time)-7 from dba_hist_snapshot))
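The trace shows the database startup failing with ORA-00600 [ktsitbs_info1], [2]. The second argument, 2, is clearly the tablespace number: it matches the bind value ts#=2 in the recursive query against ts$. Because the ts$ block is corrupt, the tablespace information cannot be read, the data dictionary appears inconsistent, and this error is raised. The key to recovering this database is therefore to repair file 1 block 60.
Attempting to repair file 1 block 60 with bbed: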
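The dumps of block 60 already hint at what its header should contain. The sketch below assumes the commonly described layout of the 4-byte block tail: the low 16 bits of the SCN base, then the block type, then the sequence byte. The function names are hypothetical. Both corrupt-block dumps in this case are consistent with that layout.

```python
def expected_tail(scn_base: int, block_type: int, seq: int) -> int:
    """Build the 4-byte tail check value from header fields.

    Assumed layout: (low 16 bits of SCN base) | block type | sequence.
    """
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)


def type_from_tail(tail: int) -> int:
    """Recover the block type byte that the tail was written with."""
    return (tail >> 8) & 0xFF


# Block 60: header says type 128 (0x80), but the tail still records 0x06.
tail_60 = 0x4BC90602
print(hex(type_from_tail(tail_60)))                              # type stored in the tail
print(expected_tail(0xEBE04BC9, 0x80, 0x02) == tail_60)          # header type as found
print(expected_tail(0xEBE04BC9, 0x06, 0x02) == tail_60)          # header type 0x06

# Block 498: header and tail agree, so only its checksum is off.
print(expected_tail(0x49499CA1, 0x06, 0x01) == 0x9CA10601)
```

Read this way, the tail of block 60 was written for a type 0x06 block, which supports the impression that only the type and format bytes at the start of the header were damaged.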
HNDX-DB% bbed password=blockedit mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014

Copyright (c) 1982, 2005, Oracle. All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf

BBED> set block 8192
        BLOCK#          8192

BBED> set block 60
        BLOCK#          60

BBED> set count 64
        COUNT           64

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)

BBED> set block 61
        BLOCK#          61

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188

BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000

BBED> set block 60
        BLOCK#          60

BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000

 <32 bytes per line>

BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000

 <32 bytes per line>

BBED> set block 60
        BLOCK#          60

BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000

 <32 bytes per line>

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188

BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908

BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60

DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

BBED>
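The `sum apply` step can be cross-checked by hand. The sketch below assumes the block check value is a 16-bit XOR over the whole block, so that a block with a correct check value XORs to zero. It also assumes big-endian words, which is how the header appears in the hex dump of this system. The function names are hypothetical. Under that assumption, changing the first header word from `0x8000` to `0x06a2` shifts the stored check value by the XOR of the old and new words.

```python
import struct


def xor16(block: bytes) -> int:
    """XOR all big-endian 16-bit words of a block (length must be even)."""
    acc = 0
    for (word,) in struct.iter_unpack(">H", block):
        acc ^= word
    return acc


def patch_checksum(old_chkval: int, old_word: int, new_word: int) -> int:
    """Check value after replacing one 16-bit word, without rescanning the block."""
    return old_chkval ^ old_word ^ new_word


# Values from the bbed session: chkval 0x6faa before the edit,
# first header word changed from 0x8000 (type 128) to 0x06a2.
new_chkval = patch_checksum(0x6FAA, 0x8000, 0x06A2)
print(hex(new_chkval))
```

With the values from the session this yields `0xe908`, the same value that `sum apply` reports as both current and required after the edit.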
Attempting to open the database:
Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []
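The main errors now are ORA-600 4000 and ORA-600 6807. The ORA-600 6807 errors are fairly clearly caused by the corrupt seq$ block, which breaks sequences such as AUDSES$. ORA-600 4000 points to a rollback segment problem, so I went on to analyze the rollback segments.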
SQL> select name,ts#,status$ from undo$;

NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2
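This is very abnormal. Once the database is open, the SYSTEM rollback segment should never be in STATUS$=2 (OFFLINE). Having every other rollback segment OFFLINE as well is also abnormal. In addition, an attempt to drop the undo tablespace fails with ORA-01561, and SYSTEM is missing from dba_rollback_segs (I forgot to save that query output).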
SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified
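All of this made me strongly suspect that someone had used bbed to modify undo$ and related base tables, leaving the undo information in SYSTEM in a mess. After I reported this to the customer, they remembered that the company that had attempted the recovery the day before had backed up system01.dbf before its bbed operations. That felt like a lifesaver. I really dread people who do not understand bbed tinkering with it.
Checking the backup file with dbv: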
DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0 slot=4
        lockid=1 ktbbhitc=2
Page 498 failed with check code 6101

DBVERIFY - Verification complete

Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)
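Good news: the backup has only one physically corrupt block and one logically corrupt block. We already know how to repair the physically corrupt block 60, and the logical corruption can be worked around by setting a hidden parameter. I modified the affected block with bbed (same steps as above).
Opening the database again: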
dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
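The two dd commands copy the first two 8192-byte blocks of the current system01.dbf over the start of the backup copy, and `conv=notrunc` leaves the rest of the backup untouched. The post does not state the reason; presumably this gives the backup the current file header, but that is an assumption. A minimal Python sketch of the same in-place overwrite follows. The function name `copy_leading_blocks` is hypothetical, the copy skips the intermediate file in /tmp, and the demo only touches scratch files.

```python
import os
import tempfile


def copy_leading_blocks(src: str, dst: str, block_size: int = 8192, count: int = 2) -> int:
    """Overwrite the first `count` blocks of dst with those of src, in place."""
    with open(src, "rb") as fin:
        data = fin.read(block_size * count)
    # "r+b" opens for update without truncating, like dd conv=notrunc.
    with open(dst, "r+b") as fout:
        fout.write(data)
    return len(data)


# Demo on scratch files only: three blocks of 'A' (source), three of 'B' (target).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "current.dbf")
    dst = os.path.join(tmp, "backup.dbf")
    with open(src, "wb") as f:
        f.write(b"A" * 8192 * 3)
    with open(dst, "wb") as f:
        f.write(b"B" * 8192 * 3)
    copied = copy_leading_blocks(src, dst)
    with open(dst, "rb") as f:
        result = f.read()

print(copied, len(result))
```

As with the dd commands, anything like this should only be run against a copy of the datafile, with the database shut down.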
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
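These are all routine errors, and a query against undo$ now looks normal. I created a new undo tablespace and dropped the old one, after which the alert log showed no further errors. The recovery is complete at this point. I advised the customer to rebuild the database via export and import.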