标签归档:ORA-600 4194

ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

数据库启动ORA-00704,0RA-00604,ORA-01555导致数据库无法启动

Tue May 31 17:32:42 2016
SMON: enabling cache recovery
SUCCESS: diskgroup RECOVERY was mounted
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0004.3af84bee):
select ctime, mtime, stime from obj$ where obj# = :1
Archived Log entry 5 added for thread 1 sequence 10 ID 0x86a261e7 dest 1:
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 12779): terminating the instance due to error 704

通过bbed修改事务之后启动数据库

Tue May 31 17:35:49 2016
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p021_13862.trc  (incident=166002):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p010_13818.trc  (incident=165914):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p004_13794.trc  (incident=165866):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_13822.trc  (incident=165922):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p016_13842.trc  (incident=165962):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

ORA-600 [kdiulk:kcbz_objdchk] trace文件

*** SESSION ID:(3.5) 2016-05-31 17:35:50.068
OBJD MISMATCH typ=6, seg.obj=-2, diskobj=222225, dsflg=0, dsobj=285890, tid=285890, cls=1
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

Parallel Transaction recovery server caught exception 600
begin Parallel Recovery Context Dump
nsi: 48, nsactive: 48
, nirsi: 1, nidti: 1, ndt: 1, rescan: 0, ptrs: 48
[ktprsi] wdone: 50
[ktpritp 378651b8] ktprsi: 
37903b60 37903b78 37903b90 37903ba8 37903bc0 37903bd8 37903bf0 37903c08 37903c20 37903c38 37903c50 
37903c68 37903c80 37903c98 37903cb0 37903cc8 37903ce0 37903cf8 37903d10 37903d28 37903d40 37903d58 
37903d70 37903d88 37903da0 37903db8 37903dd0 37903de8 37903e00 37903e18 37903e30 37903e48 37903e60 
37903e78 37903e90 37903ea8 37903ec0 37903ed8 37903ef0 37903f08 37903f20 37903f38 37903f50 37903f68 
37903f80 37903f98 37903fb0 37903fc8 
[ktprht] nhb: 47, nfl: 20247, flg: 2

*** 2016-05-31 17:36:08.584
[ktprhb] nfl: 1, nelem: 97, flg: 0, sqn: 1
flist: 37698940 
nhe: [ktprhe 32] sqn: -1297235803
[kturur] uoff: -1797708320, sqn: 4
uba: 0x098004cd.07e4.0b
*-----------------------------
* Rec #0xb  slt: 0x07  objn: 123986(0x0001e452)  objd: 285891  tblspc: 10(0x0000000a)
*       Layer:  10 (Index)   opc: 22   rci 0x0a   
Undo type:  Regular undo   Last buffer split:  No 
Temp Object:  No 
Tablespace Undo:  No 
rdba: 0x00000000

这里基本上可以确定是由于undo index中的dataobj#和block中的dataobj# 不匹配.在数据库undo回滚之时出现该错误.可以通过跳过undo回滚,然后重建对象

Tue May 31 17:36:06 2016
Simulated error on redo application.
Block recovery from logseq 12, block 959 to scn 20401094719
Recovery of Online Redo Log: Thread 1 Group 3 Seq 12 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.263.802446627
Block recovery completed at rba 12.1012.16, scn 4.3221225536
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Simulated error for redo application done.
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p009_13814.trc  (incident=165906):
ORA-00600: 内部错误代码, 参数: [kdBlkCheckError], [26], [950417], [18025], [], [], [], [], [], [], [], []

这些错误是由于数据库block逻辑异常导致,错过参数含义
在10g中ORA-600 kddummy_blkchk 在11g中ORA-600 kdBlkCheckError

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Bock number
Arg 1 Internal error code returned from kcbchk() which indicates the problem encountered. 
See Note 46389.1 for details of block check codes.

根据QREF kddummy_blkchk / kdBlkCheckError Check Codes Listing (Full) (Doc ID 1264040.1)分析
这里的18025是代码的KCBTEMAP_EC_START + KTS4_EC_SBFREE部分异常,主要表现在Incorrect firstfree or nfree 可以通过设置一些参数进行屏蔽

在恢复过程中还有其他错误

ORA-600 encountered when generating server alert SMG-4128
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4406], [0x1026B65348], [0x000000000], [2], [6215], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 15) - Error 600 encountered while recovering transaction (10, 7) on object 123986.
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kewrose_1], [600], 
  [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

通过整体分析错误主要是由于undo异常导致,通过设置_corrupted_rollback_segments设置db_block_checking等相关参数,清理SMON_SCN_TIME等操作数据库没有其他异常报错,让其通过逻辑方式重建库

发表在 Oracle备份恢复 | 标签为 , , , , , , , , , | 评论关闭

ORA-00354 ORA-00353 ORA-00312异常处理

数据库启动报错
WIN平台oracle 9.2.0.6版本数据库redo log block header损坏,ORA-00354 ORA-00353 ORA-00312错误导致数据库无法启动

SQL >alter database open;
*
ERROR at line 1:

ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1892904 change 281470950178815
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\ZOYO\REDO03.LOG'
Sun Jan 24 15:44:05 2016
Database mounted in Exclusive Mode.
Completed: alter database mount exclusive
Sun Jan 24 15:44:05 2016
alter database open
Sun Jan 24 15:44:05 2016
Beginning crash recovery of 1 threads
Sun Jan 24 15:44:05 2016
Started redo scan
ORA-354 signalled during: alter database open...
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Sun Jan 24 15:44:32 2016
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

通过分析,确定损坏的redo03为当前redo,无法使用正常方法打开,加上_allow_resetlogs_corruption参数,尝试打开库,依旧失败

数据库报ORA-600 2662错误

Sun Jan 24 16:26:30 2016
SMON: enabling cache recovery
Sun Jan 24 16:26:30 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []

Sun Jan 24 16:26:31 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []

Sun Jan 24 16:26:31 2016
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 640
ORA-1092 signalled during: alter database open resetlogs...

ORA 600 2662的错误处理
根据经验,这个错误只需要推scn即可,可以通过bbed,隐含参数,event,oradebug,修改控制文件等方法进行,推scn之后,数据库报熟悉的ORA-00604 ORA-00607 ORA-600 4194错误,以前我们遇到的block大部分是128,这次报异常block为9.实际中跟版本有关系,在ORACLE 9.2.0.6中该错误为file 1 block 9.大部分版本为128

Sun Jan 24 16:29:39 2016
SMON: enabling cache recovery
Sun Jan 24 16:29:39 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []

Sun Jan 24 16:29:39 2016
Doing block recovery for fno: 1 blk: 401
Sun Jan 24 16:29:39 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Doing block recovery for fno: 1 blk: 9
Sun Jan 24 16:29:40 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Sun Jan 24 16:29:40 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 3432

ORA-00604 ORA-00607 ORA-600 4194分析trace文件

*** 2016-01-24 16:29:40.031
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
 
Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.01e112e1 seq: 0x01 flg: 0x04 tail: 0x12e10e01
frmt: 0x02 chkval: 0xba76 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47    
                  last map  0x00000000  #maps: 0      offset: 4128  
      Highwater::  0x00400191  ext#: 4      blk#: 0      ext size: 8     
  #blocks in seg. hdr's freelists: 0     
  #blocks below: 0     
  mapblk  0x00000000  offset: 4     
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7     
   0x00400011  length: 8     
   0x00400181  length: 8     
   0x00400189  length: 8     
   0x00400191  length: 8     
   0x00400199  length: 8     
  
  TRN CTL:: seq: 0x008e chd: 0x0060 ctl: 0x0024 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400191.008e.04 scn: 0x0000.01ded29c
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400191.008e.04 ext: 0x4  spc: 0x1c3e  
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334  
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788   
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
  TRN TBL::

从这里可以确定undo segment header中的分配block记录有问题,清除ktuxc.fbp.fbp[N].kuba.kdba相关记录,数据库正常打开

    . struct ktuxc  kernel transaction undo xaction table control with 15 members
    . {
    .   struct kscn   scn with 3 members
    .   {
04148     ub4           bas      = 0X9CD2DE01 = 31380124
04152     ub2           wrp      = 0X0000 = 0
04154     cc32          pad      = 0X0000 = 0
    .   }
    .   struct kuba   uba with 4 members
    .   {
04156     kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04160     ub2           seq      = 0X8E00 = 142
04162     ub1           rec      = 0X04 = 4
04163     cc16          pad      = 0X00 = 0
    .   }
04164   sb2           flg      = 0X0100 = 1
04166   ub2           seq      = 0X8E00 = 142
04168   sb2           nfb      = 0X0100 = 1
04170   cc32          pad1     = 0X0000 = 0
04172   ub4           inc      = 0X00000000 = 0
04176   sb2           chd      = 0X6000 = 96
04178   sb2           ctl      = 0X2400 = 36
04180   ub2x          mgc      = 0X0280 = 0x8002
04182   ub2           ver      = 0X0100 = 1
04184   ub2           xts      = 0X6800 = 104
04186   cc32          pad2     = 0X0000 = 0
04188   ub4           opt      = 0XFEFFFF7F = 2147483646
    .   ktufb fbp[5] (array with 5 elements)
    .     struct fbp   [0] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04192         kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04196         ub2           seq      = 0X8E00 = 142
04198         ub1           rec      = 0X04 = 4
04199         cc16          pad      = 0X00 = 0
    .       }
04200       sb2           ext      = 0X0400 = 4
04202       sb2           spc      = 0X3E1C = 7230
    .     }
    .     struct fbp   [1] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04204         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04208         ub2           seq      = 0X2F00 = 47
04210         ub1           rec      = 0X21 = 33
04211         cc16          pad      = 0X00 = 0
    .       }
04212       sb2           ext      = 0X0500 = 5
04214       sb2           spc      = 0X3413 = 4916
    .     }
    .     struct fbp   [2] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04216         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04220         ub2           seq      = 0X2E00 = 46
04222         ub1           rec      = 0X37 = 55
04223         cc16          pad      = 0X00 = 0
    .       }
04224       sb2           ext      = 0X0400 = 4
04226       sb2           spc      = 0X8807 = 1928
    .     }
    .     struct fbp   [3] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04228         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04232         ub2           seq      = 0X0000 = 0
04234         ub1           rec      = 0X00 = 0
04235         cc16          pad      = 0X00 = 0
    .       }
04236       sb2           ext      = 0X0000 = 0
04238       sb2           spc      = 0X0000 = 0
    .     }
    .     struct fbp   [4] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04240         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04244         ub2           seq      = 0X0000 = 0
04246         ub1           rec      = 0X00 = 0
04247         cc16          pad      = 0X00 = 0
    .       }
04248       sb2           ext      = 0X0000 = 0
04250       sb2           spc      = 0X0000 = 0
    .     }
    . }
Sun Jan 24 16:44:52 2016
SMON: enabling tx recovery
Sun Jan 24 16:44:52 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN
发表在 非常规恢复 | 标签为 , , , , , , | 评论关闭

重建控制文件丢失数据文件导致悲剧

在Oracle职业生涯中,恢复过生产环境数据库也有几百个.对于Oracle恢复我还是相当的自信,今天因为自己的一时过于自信,对于环境错了错误的判断,简单问题复杂化,差点变成悲剧
开发出来了Oracle Recovery Tools恢复MISSING00000文件故障工具,能够一键解决类似问题,实现快速恢复
数据库最初故障

Thu Sep 25 09:27:26 2014
MMON started with pid=15, OS id=1968 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = F:\oracle
Thu Sep 25 09:27:26 2014
ALTER DATABASE   MOUNT
Thu Sep 25 09:27:26 2014
MMNL started with pid=16, OS id=5976 
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_4624.trc:
ORA-00202: ????: ''F:\ORACLE\ORADATA\ORCL\CONTROL01.CTL
ORA-27070: ????/????
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Thu Sep 25 09:28:31 2014
ORA-204 signalled during: ALTER DATABASE   MOUNT...

因为硬件或者系统层面问题,导致控制文件无法正常访问

重建控制文件

Fri Sep 26 12:28:44 2014
Successful mount of redo thread 1, with mount id 1387065723
Completed: CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS ARCHIVELOG
    MAXLOGFILES 5
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 2
    MAXLOGHISTORY 226
LOGFILE
  GROUP 1 'F:\oracle\oradata\orcl\REDO01.LOG'  SIZE 50M,  --redo log ????
  GROUP 2 'F:\oracle\oradata\orcl\REDO02.LOG'  SIZE 50M,  --redo log ????
  GROUP 3 'F:\oracle\oradata\orcl\REDO03.LOG'  SIZE 50M  --redo log ????
-- STANDBY LOGFILE
DATAFILE
  'F:\oracle\oradata\orcl\SYSAUX01.DBF',  --sysaux???????
  'F:\oracle\oradata\orcl\SYSTEM01.DBF',
  'F:\oracle\oradata\orcl\USERS01.DBF', --user????????
  'F:\oracle\oradata\orcl\UNDOTBS01.DBF' --undo???????
CHARACTER SET ZHS16GBK
Fri Sep 26 12:29:55 2014
alter database open resetlogs
ORA-1194 signalled during: alter database open resetlogs...

埋下了雷,创建控制文件中未全部列举出来所有数据文件

进行不完全恢复,尝试resetlogs库发现redo异常

Fri Sep 26 14:13:24 2014
ALTER DATABASE   MOUNT
Fri Sep 26 14:13:24 2014
MMNL started with pid=16, OS id=9024 
Successful mount of redo thread 1, with mount id 1387037444
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:14:08 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Fri Sep 26 14:15:16 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 2049 计数 6143 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Fri Sep 26 14:16:24 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ARCH: All Archive destinations made inactive due to error 333

使用隐含参数尝试拉库,报ORA-600[2662]

Fri Sep 26 14:16:45 2014
SMON: enabling cache recovery
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc  (incident=57761):
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Incident details in: f:\oracle\diag\rdbms\orcl\orcl\incident\incdir_57761\orcl_ora_3720_i57761.trc
Fri Sep 26 14:16:45 2014
ARC3 started with pid=23, OS id=9692 
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3720): terminating the instance due to error 704
Instance terminated by USER, pid = 3720
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3720) as a result of ORA-1092

数据库在未使用所有数据文件的情况下,进行了resetlogs操作,悲剧的本质已经注定,我的失误是没有评估好现状,还继续在错误的道路上越走越远.

我开始接手该库现况

Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:18:55 2014
alter database open
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...
Fri Sep 26 14:19:31 2014
alter database open 
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open ...
Fri Sep 26 14:22:26 2014
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Fri Sep 26 14:22:26 2014
Media Recovery failed with error 16433
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
Fri Sep 26 14:24:25 2014
ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'  
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'  ...
Fri Sep 26 14:28:47 2014
alter database open read write
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open read write...
Fri Sep 26 14:31:48 2014
ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'  
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'  ...

提示ORA-01110: 数据文件 1需要恢复,尝试recover操作

尝试recover操作

连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.


SQL> alter database backup controlfile to trace as 'd:\ctl.txt';
alter database backup controlfile to trace as 'd:\ctl.txt'
*
第 1 行出现错误:
ORA-16433: 必须以读/写模式打开数据库。


SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.

重建控制文件

SQL> shutdown immediate;
ORA-01109: 数据库未打开


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> STARTUP NOMOUNT
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE orcl NORESETLOGS FORCE LOGGING ARCHIVELOG

  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'  SIZE 50M,
  9    GROUP 2 'F:\ORACLE\ORADATA\ORCL\REDO02.LOG'  SIZE 50M,
 10    GROUP 3 'F:\ORACLE\ORADATA\ORCL\REDO03.LOG'  SIZE 50M
 11  DATAFILE
 12    'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF',
 13    'F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF',
 14    'F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF',
 15    'F:\ORACLE\ORADATA\ORCL\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;

控制文件已创建。

这一步严重发错,在恢复前未认真看alert日志,太依赖v$datafile查询出来结果,导致重建控制文件丢失数据文件,埋下大雷。根据前面alert日志报错ORA-600 2662,决定一并处理该问题,然后进行恢复

SQL> shutdown immediate;
ORA-01109: ??????


已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup pfile='d:\pfile.txt'  mount;
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [], [

数据库报ORA-600 4194,直接修改undo_management=manual,然后尝试启动数据库

SQL> conn / as sysdba
已连接到空闲例程。
SQL> startup pfile='d:\pfile.txt'
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
数据库已经打开。
SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------

F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF
F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF
F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF
F:\ORACLE\ORADATA\ORCL\USERS01.DBF
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006

已选择6行。

SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005' 
2   to 'F:\oracle\oradata\SOURCE_DATA1.DBF';

数据库已更改。

SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006' 
2   to 'F:\oracle\oradata\SOURCE_idx1.DBF';

数据库已更改。

SQL> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup mount pfile='d:\pfile.txt'
ORACLE 例程已经启动。

Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> alter datafile 5 online;
alter datafile 5 online
      *
第 1 行出现错误:
ORA-00940: 无效的 ALTER 命令


SQL> alter database datafile 5 online;

数据库已更改。

SQL> alter database datafile 6 online;

数据库已更改。

SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 5 belongs to an orphan incarnation
ORA-01110: data file 5: 'F:\ORACLE\ORADATA\SOURCE_DATA1.DBF'


SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01139: RESETLOGS 选项仅在不完全数据库恢复后有效


SQL> alter database datafile 6 offline;

数据库已更改。

SQL> alter database datafile 5 offline;

数据库已更改。

SQL> recover database until cancel;
完成介质恢复。
SQL> alter database datafile 6 online;

数据库已更改。

SQL> alter database datafile 5 online;

数据库已更改。

SQL> alter database open resetlogs;

数据库已更改。

还好结合一些隐含参数侥幸恢复成功,差点到了要使用bbed的程度,如果遇到极端情况无法处理可以参考:Oracle Recovery Tools恢复MISSING00000文件故障

这次的恢复告诉我:Oracle数据库恢复千万比大意,需要认真分析alert日志和咨询客户做了那些操作,不然可能导致万劫不复之禁地

发表在 Oracle备份恢复 | 标签为 , , , , | 评论关闭