标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (102)
- 数据库 (1,674)
- DB2 (22)
- MySQL (73)
- Oracle (1,536)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (22)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (14)
- ORACLE 21C (3)
- Oracle 23ai (7)
- Oracle ASM (67)
- Oracle Bug (8)
- Oracle RAC (52)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (562)
- Oracle安装升级 (92)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (78)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- GoldenGate 19安装和打patch
- dd破坏asm磁盘头恢复
- 删除asmlib磁盘导致磁盘组故障恢复
- Kylin Linux 安装19c
- ORA-600 krse_arc_complete.4
- Oracle 19c 202410补丁(RUs+OJVM)
- ntfs MFT损坏(ntfs文件系统故障)导致oracle异常恢复
- .mkp扩展名oracle数据文件加密恢复
- 清空redo,导致ORA-27048: skgfifi: file header information is invalid
- A_H_README_TO_RECOVER勒索恢复
- 通过alert日志分析客户自行对一个数据库恢复的来龙去脉和点评
- ORA-12514: TNS: 监听进程不能解析在连接描述符中给出的SERVICE_NAME
- ORA-01092 ORA-00604 ORA-01558故障处理
- ORA-65088: database open should be retried
- Oracle 19c异常恢复—ORA-01209/ORA-65088
- ORA-600 16703故障再现
- 数据库启动报ORA-27102 OSD-00026 O/S-Error: (OS 1455)
- .[metro777@cock.li].Elbie勒索病毒加密数据库恢复
- 应用连接错误,初始化mysql数据库恢复
- RAC默认服务配置优先节点
标签归档:ORA-600[4194]
又一起存储故障导致ORA-00333 ORA-00312恢复
数据库启动报ORA-00333 ORA-00312错误,无法正常open数据库
Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc: ORA-00333: redo log read error block 63489 count 2048 ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG' ORA-27091: unable to queue I/O ORA-27070: async read/write failed OSD-04006: ReadFile() 失败, 无法读取文件 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc: ORA-00333: redo log read error block 63489 count 2048 Thu Aug 07 10:42:03 2014 ARC0: All Archive destinations made inactive due to error 333 Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc: ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止 ORA-00340: 处理联机日志 (用于线程 ) 时出现 I/O 错误 Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc: ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止 ORA-00340: 处理联机日志 (用于线程 ) 时出现 I/O 错误 Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc: ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止 ORA-00340: 处理联机日志 (用于线程 ) 时出现 I/O 错误 Thu Aug 07 10:42:03 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_lgwr_884.trc: ORA-00340: IO error processing online log 3 of thread 1 ORA-00345: redo log write error block 65238 count 13 ORA-00312: online log 3 thread 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27070: async read/write failed OSD-04016: 异步 I/O 请求排队时出错。 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 10:42:03 2014 LGWR: terminating instance due to error 340 Thu Aug 07 10:42:05 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00449: background process 'LGWR' unexpectedly terminated with error 340 ORA-00340: IO error processing online log of thread Thu Aug 07 10:42:05 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00449: background process 'LGWR' unexpectedly terminated with error 340 ORA-00340: IO error processing online log of thread Thu Aug 07 10:42:05 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00449: background process 'LGWR' unexpectedly terminated with error 340 ORA-00340: IO error processing online log of thread Thu Aug 07 17:40:05 2014 ALTER DATABASE OPEN Thu Aug 07 17:40:05 2014 Beginning crash recovery of 1 threads parallel recovery started with 15 processes Thu Aug 07 17:40:06 2014 Started redo scan Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27070: 异步读取/写入失败 OSD-04016: 异步 I/O 请求排队时出错。 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27091: 无法将 I/O 排队 ORA-27070: 异步读取/写入失败 OSD-04006: ReadFile() 失败, 无法读取文件 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 17:40:06 2014 Aborting crash recovery due to error 333 Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-333 signalled during: ALTER DATABASE OPEN...
进一步检查发现在7月6日系统就已经报io异常
Sun Jul 06 10:05:23 2014 ARC0: All Archive destinations made inactive due to error 333 Sun Jul 06 10:06:07 2014 KCF: write/open error block=0xd03 online=1 file=3 F:\ORADATA\SZCG\SYSAUX01.DBF error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。 O/S-Error: (OS 1) 函数不正确。' Automatic datafile offline due to write error on file 3: F:\ORADATA\SZCG\SYSAUX01.DBF Sun Jul 06 10:06:23 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc1_2676.trc: ORA-00333: redo log read error block 63489 count 2048 ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG' ORA-27091: unable to queue I/O ORA-27070: async read/write failed OSD-04006: ReadFile() 失败, 无法读取文件 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 10:36:54 2014 ARC1: All Archive destinations made inactive due to error 333 Thu Aug 07 10:37:25 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_m000_5832.trc: ORA-01135: file 3 accessed for DML/query is offline ORA-01110: data file 3: 'F:\ORADATA\SZCG\SYSAUX01.DBF'
检查硬件发现raid一块盘完全损坏,另外一块盘也处于告警状态,保护现场拷贝文件过程中发现redo02,redo03,sysaux无法拷贝,使用rman检查发现
因为redo完全损坏,使用工具跳过坏块,拷贝相关有坏块文件到其他目录,重命名相关文件尝试启动数据库,依然报ORA-00333 ORA-00312
Started redo scan Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27070: 异步读取/写入失败 OSD-04016: 异步 I/O 请求排队时出错。 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27091: 无法将 I/O 排队 ORA-27070: 异步读取/写入失败 OSD-04006: ReadFile() 失败, 无法读取文件 O/S-Error: (OS 1) 函数不正确。 Thu Aug 07 17:40:06 2014 Aborting crash recovery due to error 333 Thu Aug 07 17:40:06 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-333 signalled during: ALTER DATABASE OPEN...
设置隐含参数_allow_resetlogs_corruption,尝试强制拉库
Started redo scan Fri Aug 08 12:13:25 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27070: 异步读取/写入失败 OSD-04016: 异步 I/O 请求排队时出错。 O/S-Error: (OS 1) 函数不正确。 Fri Aug 08 12:13:25 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG' ORA-27091: 无法将 I/O 排队 ORA-27070: 异步读取/写入失败 OSD-04006: ReadFile() 失败, 无法读取文件 O/S-Error: (OS 1) 函数不正确。 Fri Aug 08 12:13:25 2014 Aborting crash recovery due to error 333 Fri Aug 08 12:13:25 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc: ORA-00333: 重做日志读取块 63016 计数 8192 出错 ORA-333 signalled during: ALTER DATABASE OPEN... Fri Aug 08 12:13:45 2014 ALTER DATABASE RECOVER database until cancel Fri Aug 08 12:13:45 2014 Media Recovery Start parallel recovery started with 15 processes ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ... Fri Aug 08 12:13:55 2014 ALTER DATABASE RECOVER CANCEL Fri Aug 08 12:13:59 2014 ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ... Fri Aug 08 12:13:59 2014 ALTER DATABASE RECOVER CANCEL ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ... Fri Aug 08 12:14:12 2014 alter database open resetlogs Fri Aug 08 12:14:13 2014 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. ORA-1245 signalled during: alter database open resetlogs... Fri Aug 08 12:54:11 2014 alter tablespace sysaux offline Fri Aug 08 12:54:11 2014 ORA-1109 signalled during: alter tablespace sysaux offline... Fri Aug 08 13:05:30 2014 alter database open Fri Aug 08 13:05:30 2014 ORA-1589 signalled during: alter database open...
在offline过程中,数据库检查到sysaux数据文件为offline状态,当表空间只有一个数据文件,而且该数据文件为offline,数据库将会尝试offline sysaux表空间,但是发现该表空间文件非正常scn,无法offline 表空间,导致resetlogs操作失败。这里是操作失误应该先online相关数据文件,然后再进行resetlogs操作
Sat Aug 09 11:56:03 2014 alter database datafile 3 online Sat Aug 09 11:56:04 2014 Completed: alter database datafile 3 online Sat Aug 09 11:56:08 2014 alter database open resetlogs RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. Sat Aug 09 11:56:18 2014 ARCH: Encountered disk I/O error 19502 Sat Aug 09 11:56:18 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc: ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) ORA-27072: 文件 I/O 错误 OSD-04008: WriteFile() 失败, 无法写入文件 O/S-Error: (OS 1) 函数不正确。 ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) Sat Aug 09 11:56:18 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc: ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) ORA-27072: 文件 I/O 错误 OSD-04008: WriteFile() 失败, 无法写入文件 O/S-Error: (OS 1) 函数不正确。 ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512) ARCH: I/O error 19502 archiving log 3 to 'F:\ARCHIVE\ARC01745_0814618167.001' Sat Aug 09 11:56:18 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc: ORA-00265: 要求实例恢复, 无法设置 ARCHIVELOG 模式 Archive all online redo logfiles failed:265 RESETLOGS after incomplete recovery UNTIL CHANGE 77983856 Resetting resetlogs activation ID 3562192628 (0xd452bef4) Online log F:\ORADATA\SZCG\REDO01.LOG: Thread 1 Group 1 was previously cleared Online log F:\ORADATA\SZCG\REDO02.LOG: Thread 1 Group 2 was previously cleared Online log D:\REDO04.LOG: Thread 1 Group 4 was previously cleared Sat Aug 09 11:56:22 2014 Setting recovery target incarnation to 3 Sat Aug 09 11:56:23 2014 Assigning activation ID 3602586269 (0xd6bb1a9d) LGWR: STARTING ARCH PROCESSES ARC0 started with pid=33, OS id=5900 Sat Aug 09 11:56:23 2014 ARC0: Archival started ARC1: Archival started LGWR: STARTING ARCH PROCESSES COMPLETE ARC1 started with pid=34, OS id=5776 Sat Aug 09 11:56:24 2014 Thread 1 opened at log sequence 1 Current log# 1 seq# 1 mem# 0: F:\ORADATA\SZCG\REDO01.LOG Successful open of redo thread 1 Sat Aug 09 11:56:24 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Sat Aug 09 11:56:24 2014 ARC1: Becoming the 'no FAL' ARCH ARC1: Becoming the 'no SRL' ARCH Sat Aug 09 11:56:24 2014 ARC0: Becoming the heartbeat ARCH Sat Aug 09 11:56:24 2014 SMON: enabling cache recovery Sat Aug 09 11:56:25 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc: ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], [] Sat Aug 09 11:56:26 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc: ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], [] Sat Aug 09 11:56:26 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 4516 ORA-1092 signalled during: alter database open resetlogs...
ORA-600 2662这个错误很熟悉,直接推SCN,数据库open,但是报ORA-600 4194
Sat Aug 09 12:01:28 2014 SMON: enabling cache recovery Dictionary check complete Sat Aug 09 12:01:32 2014 SMON: enabling tx recovery Sat Aug 09 12:01:32 2014 Database Characterset is ZHS16GBK Opening with internal Resource Manager plan replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC QMNC started with pid=34, OS id=6116 Sat Aug 09 12:01:34 2014 LOGSTDBY: Validating controlfile with logical metadata Sat Aug 09 12:01:34 2014 LOGSTDBY: Validation complete Sat Aug 09 12:01:34 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc: ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], [] Sat Aug 09 12:01:36 2014 Doing block recovery for file 2 block 319 Resuming block recovery (PMON) for file 2 block 319 Block recovery from logseq 2, block 56 to scn 1073742003 Sat Aug 09 12:01:36 2014 Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0 Mem# 0: F:\ORADATA\SZCG\REDO02.LOG Block recovery stopped at EOT rba 2.79.16 Block recovery completed at rba 2.79.16, scn 0.1073742002 Doing block recovery for file 2 block 153 Resuming block recovery (PMON) for file 2 block 153 Block recovery from logseq 2, block 56 to scn 1073741986 Sat Aug 09 12:01:36 2014 Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0 Mem# 0: F:\ORADATA\SZCG\REDO02.LOG Block recovery completed at rba 2.66.16, scn 0.1073741988 Sat Aug 09 12:01:36 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc: ORA-01595: error freeing extent (4) of rollback segment (10)) ORA-00607: Internal error occurred while making a change to a data block ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], [] Sat Aug 09 12:01:36 2014 Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5272.trc: ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], [] Sat Aug 09 12:01:36 2014 Completed: alter database open
尝试重建undo表空间并切换undo_tabspace到新undo表空间解决,因为数据库在恢复过程中使用了隐含参数强制拉库,不能保证数据一致性,强烈建议逻辑方式重建数据库
在本次故障中,所幸的是只有redo和sysaux文件损坏,如果是业务数据文件或者system数据文件损坏,恢复的后果可能更加麻烦,丢失数据可能更加多。再次说明:数据库备份非常重要,数据的安全性不能完全寄希望于硬件之上
发表在 Oracle备份恢复
标签为 ORA-00312, ORA-00333, ORA-00340, ORA-00603, ORA-27072, ORA-600 2662, ORA-600 4194, ORA-600[4194], OSD-04008
评论关闭
ORA-607/ORA-600[4194]不一定是重大灾难
以前解决过ORA-607/ORA-600[4194]和模拟过ORA-607/ORA-600[4194]错误,所以固定思维任务ORA-607/ORA-600[4194]可能就是重大灾难,通过这个案例来说明ORA-607/ORA-600[4194]可能也就是一个常规的不能再常规的错误:有一网友数据库因意外关闭电源导致启动过程出现ORA-00607/ORA-00600[4194]/ORA-00600[4097]的错误,使得数据库启动失败.
SMON: enabling tx recovery Fri Aug 31 23:14:08 2012 Database Characterset is ZHS16GBK replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC QMNC started with pid=19, OS id=15619 Fri Aug 31 23:14:10 2012 Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc: ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], [] Fri Aug 31 23:14:12 2012 Non-fatal internal error happenned while SMON was doing logging scn->time mapping. SMON encountered 1 out of maximum 100 non-fatal internal errors. Fri Aug 31 23:14:12 2012 Completed: alter database open Fri Aug 31 23:14:14 2012 Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc: ORA-01595: error freeing extent (2) of rollback segment (4)) ORA-00607: Internal error occurred while making a change to a data block ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], [] Fri Aug 31 23:29:41 2012 Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc: ORA-00607: Internal error occurred while making a change to a data block ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], [] Fri Aug 31 23:29:43 2012 Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc: ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], [] Fri Aug 31 23:29:44 2012 Errors in file /u01/oradata/orcl/bdump/orcl_pmon_15577.trc: ORA-00474: SMON process terminated with error Fri Aug 31 23:29:44 2012 PMON: terminating instance due to error 474 Instance terminated by PMON, pid = 15577
通过alert日志可以定位到SMON_SCN_TIME表或者其回滚操作可能异常,结合alert和trace分析,发现这次错误的操作主要sql语句为:
ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], [] Current SQL statement for this session: UPDATE SYS.COL_USAGE$ SET EQUALITY_PREDS = EQUALITY_PREDS + DECODE(BITAND(:FLAG, 1), 0, 0, 1), EQUIJOIN_PREDS = EQUIJOIN_PREDS + DECODE(BITAND(:FLAG, 2), 0, 0, 1), NONEQUIJOIN_PREDS = NONEQUIJOIN_PREDS + DECODE(BITAND(:FLAG, 4), 0, 0, 1), RANGE_PREDS = RANGE_PREDS + DECODE(BITAND(:FLAG, 8), 0, 0, 1), LIKE_PREDS = LIKE_PREDS + DECODE(BITAND(:FLAG, 16), 0, 0, 1), NULL_PREDS = NULL_PREDS + DECODE(BITAND(:FLAG, 32), 0, 0, 1), TIMESTAMP = :TIME WHERE OBJ# = :OBJN AND INTCOL# = :COLN ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], [] Current SQL statement for this session: UPDATE SYS.MON_MODS$ SET INSERTS = INSERTS + :INS, UPDATES = UPDATES + :UPD, DELETES = DELETES + :DEL, FLAGS = (DECODE(BITAND(FLAGS, :FLAG), :FLAG, FLAGS, FLAGS + :FLAG)), DROP_SEGMENTS = DROP_SEGMENTS + :DROPSEG, TIMESTAMP = :TIME WHERE OBJ# = :OBJN ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], [] Current SQL statement for this session: INSERT INTO SMON_SCN_TIME (THREAD, TIME_MP, TIME_DP, SCN, SCN_WRP, SCN_BAS, NUM_MAPPINGS, TIM_SCN_MAP) VALUES (0, :1, :2, :3, :4, :5, :6, :7)
这里主要涉及到对oracle的三张表的操作
COL_USAGE$:主要是在收集统计信息的时候作为是否需要收集列直方图信息参考
MON_MODS$:Oracle主要利用该表来记录那些表的数据发生改变,方便收集统计信息
SMON_SCN_TIME:记录SCN和TIME的对应关系
通过这里的分析可以确定这三张表中的数据对于数据库来说不是致命的基表信息,在数据库运行过程中可以清理掉这些信息,最多就是因为数据库性能的下降或者SCN和TIME互转功能不完善.
解决思路
完整的undo异常处理顺序
1.从alert中可以看出来数据库是在open之后由于SMON回滚到上述几条sql异常导致数据库down,所以可以尝试使用system回滚段启动数据库,看看是否可以屏蔽相关问题
2.如果方法1不可行,那使用event屏蔽smon对回滚段的相关操作,使得数据库正常启动
3.如果由于存在特殊事务,event无法屏蔽,尝试使用隐含参数处理该问题
4.如果隐含参数尚无法解决给问题考虑使用bbed
5.如果bbed不能解决,那只能选择dul或者其类似工具处理
这个案例中我们明确的看到是因为上面的三条sql回滚异常出现问题导致,对于这样的问题,经过测试使用方法1和2都能够顺利解决问题(open库之后需要重建undo,删除有问题undo表空间,修改参数[可能包括event],切换undo表空间).因为遇到几次ORA-607/ORA-600[4194]是因为system rollback损坏导致,所以这次开始也认为是一次比较复杂的恢复,最后证明这次是一种非常常规的恢复.对于ORACLE的数据库恢复有经验可能会比较快的定位问题,但是如果按照固定的思路去想可能会让自己走进死胡同.
数据库报ORA-00607/ORA-00600[4194]错误
昨天晚上处理一起比较奇特的ORA-00600[4194]错误的数据库恢复案例,客户数据库刚刚上线,因为一时疏忽没有做备份.谁知天有不测风云,就这样的系统也会出问题(数据库文件总共 5g redo log sequence#=9).这个事故告诉我们:作为dba在任何时候都不要有侥幸心理,备份重于一切
数据库报ORA-00607/ORA-00600[4194]错误
Thu Jul 26 13:21:11 2012 SMON: enabling cache recovery Thu Jul 26 13:21:11 2012 Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc: ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], [] Thu Jul 26 13:21:11 2012 Doing block recovery for file 1 block 18 Block recovery from logseq 3994, block 3 to scn 89979535 Thu Jul 26 13:21:11 2012 Recovery of Online Redo Log: Thread 1 Group 1 Seq 3994 Reading mem 0 Mem# 0: /orasvr/mispdata/redo01.log Block recovery stopped at EOT rba 3994.5.16 Block recovery completed at rba 3994.5.16, scn 0.89979533 Doing block recovery for file 1 block 9 Block recovery from logseq 3994, block 3 to scn 89979532 Thu Jul 26 13:21:11 2012 Recovery of Online Redo Log: Thread 1 Group 1 Seq 3994 Reading mem 0 Mem# 0: /orasvr/mispdata/redo01.log Block recovery completed at rba 3994.5.16, scn 0.89979533 Thu Jul 26 13:21:11 2012 Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc: ORA-00604: error occurred at recursive SQL level 1 ORA-00607: Internal error occurred while making a change to a data block ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], [] Error 604 happened during db open, shutting down database USER: terminating instance due to error 604 Instance terminated by USER, pid = 2865 ORA-1092 signalled during: ALTER DATABASE OPEN...
通过alert日志中,我们可以发现是因为ORA-00600[4194]导致数据库不能被正常open,但是这次不同的是在报ORA-00600之前有ORA-00607的错误出现,根据这个提示,应该是一个基本的数据块有问题导致.而ORA-00600[4194]是因为undo和redo不一致导致.对于本错误放在一起分析,大概的评估是因为内部对象的异常出现ora-607,导致undo和redo不一致出现ORA-00600[4194].
trace文件分析
--dump redo DUMP OF REDO FROM FILE '/orasvr/mispdata/redo02.log' Opcodes *.* DBAs (file#, block#): (1, 18) RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff Times: creation thru eternity FILE HEADER: Compatibility Vsn = 169870080=0xa200300 Db ID=658120234=0x273a1e2a, Db Name='MISPDATA' Activation ID=658142762=0x273a762a Control Seq=16668=0x411c, File size=102400=0x19000 File Number=2, Blksiz=512, File Type=2 LOG descrip:"Thread 0001, Seq# 0000003992, SCN 0x0000055c5e3c-0x0000055cac62" thread: 1 nab: 0x5 seq: 0x00000f98 hws: 0x6 eot: 0 dis: 0 resetlogs count: 0x2d42646a scn: 0x0000.00000001 (1) resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000 prev resetlogs count: 0x0 scn: 0x0000.00000000 prev resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000 Low scn: 0x0000.055c5e3c (89939516) 07/26/2012 11:17:42 Next scn: 0x0000.055cac62 (89959522) 07/26/2012 13:16:19 Enabled scn: 0x0000.00000001 (1) 08/16/2011 11:50:10 Thread closed scn: 0x0000.055cac61 (89959521) 07/26/2012 11:17:42 Disk cksum: 0x3088 Calc cksum: 0x3088 Terminal recovery stop scn: 0x0000.00000000 Terminal recovery 01/01/1988 00:00:00 Most recent redo scn: 0x0000.00000000 Largest LWN: 0 blocks End-of-redo stream : No Unprotected mode Miscellaneous flags: 0x0 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000 --ORA-00600错误提示 *** 2012-07-26 13:21:11.566 ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], [] Current SQL statement for this session: update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7, xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1 --ora-607 Error 607 in redo application callback TYP:0 CLS:16 AFN:1 DBA:0x00400012 OBJ:4294967295 SCN:0x0000.0551610e SEQ: 1 OP:5.1 ktudb redo: siz: 256 spc: 7892 flg: 0x0012 seq: 0x003d rec: 0x02 xid: 0x0000.026.00000035 ktubl redo: slt: 38 rci: 0 opc: 11.1 objn: 15 objd: 15 tsn: 0 Undo type: Regular undo Begin trans Last buffer split: No Temp Object: No Tablespace Undo: No 0x00000000 prev ctl uba: 0x00400012.003d.01 prev ctl max cmt scn: 0x0000.0550709b prev tx cmt scn: 0x0000.0550709c txn start scn: 0xffff.ffffffff logon user: 0 prev brb: 4194318 prev bcl: 0 KDO undo record: KTB Redo op: 0x04 ver: 0x01 op: L itl: xid: 0x0000.01e.00000035 uba: 0x00400012.003d.01 flg: C--- lkc: 0 scn: 0x0000.05511296 KDO Op code: URP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x0040006a hdba: 0x00400069 itli: 1 ispac: 0 maxfr: 4863 tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 0 ncol: 17 nnew: 12 size: 0 col 1: [ 9] 5f 53 59 53 53 4d 55 31 24 col 2: [ 2] c1 02 col 3: [ 2] c1 03 col 4: [ 2] c1 0a col 5: [ 5] c4 5a 12 5a 14 col 6: [ 1] 80 col 7: [ 4] c3 08 5f 3d col 8: [ 4] c3 02 38 52 col 9: [ 1] 80 col 10: [ 2] c1 04 col 11: [ 2] c1 02 col 16: [ 2] c1 02 Block after image is corrupt: buffer tsn: 0 rdba: 0x00400012 (1/18) scn: 0x0000.0551610e seq: 0x01 flg: 0x04 tail: 0x610e0201 frmt: 0x02 chkval: 0x65f8 type: 0x02=KTU UNDO BLOCK
这里信息比较多:
1.dump redo部分得到file 1 block 18块可能异常
2.ora-600部分可以得出数据库在执行undo$对象update的回滚操作时候报错
3.通过ora-607信息得到update undo$记录对应的数据块是file 1 block 106(dba 0×00400069),在相同数据库版本数据库中查询.也就是说undo$这个回滚段回滚的时候出现错误.
SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME 2 FROM DBA_EXTENTS A 3 WHERE FILE_ID = &FILE_ID 4 AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1; Enter value for file_id: 1 old 3: WHERE FILE_ID = &FILE_ID new 3: WHERE FILE_ID = 1 Enter value for block_id: 106 old 4: AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1 new 4: AND 106 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1 OWNER ------------------------------ SEGMENT_NAME -------------------------------------------------------------------------------- SEGMENT_TYPE TABLESPACE_NAME PARTITION_NAME ------------------ ------------------------------ ------------------------------ SYS UNDO$ TABLE SYSTEM
4.发现dba 0×00400012发现坏块是file 1 block 18,查询坏块对象为
SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME 2 FROM DBA_EXTENTS A 3 WHERE FILE_ID = &FILE_ID 4 AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1; Enter value for file_id: 1 old 3: WHERE FILE_ID = &FILE_ID new 3: WHERE FILE_ID = 1 Enter value for block_id: 18 old 4: AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1 new 4: AND 18 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1 OWNER ------------------------------ SEGMENT_NAME -------------------------------------------------------------------------------- SEGMENT_TYPE TABLESPACE_NAME PARTITION_NAME ------------------ ------------------------------ ------------------------------ SYS SYSTEM ROLLBACK SYSTEM
通过这里的分析,大概可以确定这次故障的原因:
因为ROLLBACK(file 1 block 18)坏块,redo 恢复undo 出现异常出现ORA-607,使得undo和redo不一致从而出现ORA-00600[4194],导致undo$(file 1 block 106)中的一条update事务不能被正常提交或者回滚,从而使得该数据库不能被正常打开.
针对这个库因为ROLLBACK异常,使用隐含参数无法屏蔽该回滚段,因为这个数据量非常小,我们选择了挖数据文件.如果数据量比较大,可以通过bbed尝试提交undo$(file 1 block 106)数据块中事务,看人品是否能够正常启动.