Tag archive: ORA-600 4193

Resolving ORA-600 4194 / ORA-600 4193 / ORA-600 4137 failures

Posted on August 4, 2016 by 惜分飞

General steps for handling the common undo-related errors ORA-600 4193, ORA-600 4194, and ORA-600 4137.
Applicable versions

Oracle Database - Enterprise Edition - Version 9.2.0.1 to 11.2.0.4 [Release 9.2 to 11.2]
Information in this document applies to any platform.

Symptoms

The following error is occurring in the alert.log right before the database crashes.

ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []

This error indicates that a mismatch has been detected between redo records and rollback (undo) records.

ARGUMENTS:

Arg [a] - Maximum Undo record number in Undo block
Arg [b] - Undo record number from Redo block

Since we are adding a new undo record to our undo block, we would expect that the new record number
 is equal to the maximum record number in the undo block plus one. Before Oracle can add 
a new undo record to the undo block it validates that this is correct. If this validation fails,
 then an ORA-600 [4194] will be triggered.
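
The validation described above can be illustrated with a minimal Python sketch. It is a model of the rule only, not Oracle's implementation; the function name and the sample numbers are hypothetical.

```python
def next_undo_record_is_valid(max_record_in_block: int, record_from_redo: int) -> bool:
    """Model of the ORA-600 [4194] check.

    max_record_in_block corresponds to Arg [a] (maximum undo record number
    in the undo block); record_from_redo corresponds to Arg [b] (undo record
    number from the redo block). The new record is expected to be exactly
    one past the current maximum.
    """
    return record_from_redo == max_record_in_block + 1


# Hypothetical values, for illustration only.
consistent = next_undo_record_is_valid(5, 6)    # redo continues where the block ends
mismatched = next_undo_record_is_valid(5, 8)    # the case that would raise ORA-600 [4194]
```

When the two arguments printed in the alert.log differ by anything other than one, the undo block and the redo stream no longer describe the same history.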

Cause

This can also be caused by the following defect:

Bug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINK

Details: 
Undo corruption may be caused after a shrink and the same undo block may be used 
for two different transactions causing several internal errors like:
ORA-600 [4193] / ORA-600 [4194] for new transactions
ORA-600 [4137] for a transaction rollback

Resolution steps

Best practice is to create a new undo tablespace.
This method includes segment check.

Create pfile from spfile to edit
>create pfile from spfile;

1. Shutdown the instance

2. set the following parameters in the pfile
    undo_management = manual
    event = '10513 trace name context forever, level 2'

3. >startup restrict pfile=<initsid.ora>

4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';

This is critical - we are looking for all undo segments to be offline - System will always be online.

If any are 'PARTLY AVAILABLE' or 'NEEDS RECOVERY' - Please open an issue with Oracle Support or update the current SR.

If all offline then continue to the next step

5. Create new undo tablespace - example
>create undo tablespace <new undo tablespace> datafile <datafile> size 2000M;

6. Drop old undo tablespace
>drop tablespace <old undo tablespace> including contents and datafiles;

7. >shutdown immediate;

8. >startup nomount;  --> Using your original spfile

9. Modify the spfile with the new undo tablespace name

  Alter system set undo_tablespace = '<new tablespace created in step 5>' scope=spfile;

10. >shutdown immediate;

11. >startup;  --> Using spfile
 


The reason we create a new undo tablespace first is to use new undo segment numbers
 that are higher than the current segments being used.
This way when a transaction goes to do block clean-out 
the reference to that undo segment does not exist and continues with the block clean-out.
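
The decision made in step 4 can be sketched in Python as follows. This is an illustration of the rule stated in the procedure, assuming the query result has already been fetched into tuples; the function name, the tablespace name, and the segment name in the sample data are hypothetical.

```python
# Statuses for which the procedure says to involve Oracle Support.
ESCALATE_STATUSES = {"PARTLY AVAILABLE", "NEEDS RECOVERY"}


def review_step4_rows(rows):
    """Classify rows returned by the step 4 query.

    rows: iterable of (tablespace_name, status, segment_name) tuples, i.e.
    every rollback segment whose status is not OFFLINE.
    Returns (ok_to_continue, escalate, still_not_offline).
    """
    escalate, still_not_offline = [], []
    for tablespace_name, status, segment_name in rows:
        if segment_name.upper() == "SYSTEM":
            continue  # the SYSTEM rollback segment is always online
        if status.upper() in ESCALATE_STATUSES:
            escalate.append(segment_name)
        else:
            still_not_offline.append(segment_name)
    ok_to_continue = not escalate and not still_not_offline
    return ok_to_continue, escalate, still_not_offline


# Hypothetical query output, for illustration only.
sample_rows = [
    ("SYSTEM", "ONLINE", "SYSTEM"),
    ("UNDOTBS1", "NEEDS RECOVERY", "_SYSSMU3$"),
]
ok, escalate, other = review_step4_rows(sample_rows)
```

Only when the first returned value is true (nothing but SYSTEM came back from the query) does the procedure move on to creating the new undo tablespace.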

Reference: Step by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1)

Posted in Oracle Backup and Recovery | Tagged ORA-600, ORA-600 4137, ORA-600 4193, ORA-600 4194, ORA-600 recovery

ORA-600 4193: error explanation and resolution

Posted on July 27, 2016 by 惜分飞

ORA-600 4193 explanation

ERROR:              

  Format: ORA-600 [4193] [a] [b]

VERSIONS:           
  versions 6.0 to 12.1

DESCRIPTION:        

  A mismatch has been detected between Redo records and Rollback (Undo) 
  records.

  We are validating the Undo block sequence number in the undo block against 
  the Redo block sequence number relating to the change being applied.

  This error is reported when this validation fails.

ARGUMENTS:
  Arg [a] Undo record seq number
  Arg [b] Redo record seq number

FUNCTIONALITY:
  KERNEL TRANSACTION UNDO





ORA-600 [4193] [a] [b] [ ] [ ]  [ ]        
Versions: 7.2.2  - 9.2.0                              Source: ktuc.c
===========================================================================
Meaning: seq# mismatch while adding an undo record to an undo block. This 
         is done by the application of redo. 
---------------------------------------------------------------------------
Argument Description:

    a. (ktubhseq): undo record seq# - this is the seq# of the block that 
                                      this undo record WILL BE APPLIED TO. 
                                      This is from the Undo Block. It is 
                                      NOT the seq# of the undo block itself.
                                      
    b. (ktudbseq): redo RECORD seq# - this is the seq# number in the block 
                                      that this redo WILL BE APPLIED TO. 
                                      This is from the Redo Record. 

---------------------------------------------------------------------------
Diagnosis:

    This error is raised in kturdb which handles the adding of undo records 
    by the application of redo. 
    
    When we try to apply redo to an undo block (forward changes are made by 
    the application of redo to a block) we check that the seq# in the undo 
    record matches the seq# in the redo record. These seq# should be the 
    same because when we apply a redo record we must apply it to the 
    correct version of the block. We can only apply a redo record to a 
    block that contains the same seq# as in the redo record. 

    If the seq# do not match then this error is raised. This implies some 
    kind of block corruption in either the redo or the undo block. 

7.3.x - 8.1.7.x
ASSERT2(ubh->ktubhseq == db->ktudbseq, OERI(4193), KSESVSGN,
            ubh->ktubhseq, db->ktudbseq);
9.2.x
ksesic2(OERI(4193), ksenrg(ubh->ktubhseq), ksenrg(db->ktudbseq));

struct ktubh
{
  kxid  ktubhxid;      /* txid of tx currently using or last used this block */
  ub2   ktubhseq;                              /* undo block sequence number */
  ub1   ktubhcnt;    /* high water mark record index, number of undo entries */
  ub1   ktubhirb;  /* rollback record index, rec index to start the rollback */
  ub1   ktubhicl;  /* collecting record index, rec index to start retrieving col info */
  ub1   ktubhflg;                                                 /* dummy */
  ub2   ktubhidx[1];     /* byte offset of record in block, grows at runtime */
};

struct ktudb   /* Kernel Transaction Undo Data operation Block (redo) */
{
  ub2    ktudbsiz;                                          /* size of entry */
  ub2    ktudbspc;                 /* verification: space left in undo block */
  ub2    ktudbflg;            /* flag to indicate the kind of redo operation */
  kxid   ktudbxid;                                          /* current tx id */
  ub2    ktudbseq;                                  /* block sequence number */
  ub1    ktudbrec;                       /* new record index for this change */
};
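
The assertion quoted above compares one field from each structure. A minimal Python model of that comparison, assuming only the two fields involved, might look like this; the class names, the exception, and the sample values are hypothetical and stand in for the kernel code rather than reproduce it.

```python
from dataclasses import dataclass


@dataclass
class UndoBlockHeader:
    ktubhseq: int  # undo block sequence number (Arg [a])


@dataclass
class RedoUndoOperation:
    ktudbseq: int  # block sequence number carried by the redo record (Arg [b])


class SeqMismatch(Exception):
    """Stands in for the ORA-600 [4193] internal error."""


def check_seq(ubh: UndoBlockHeader, db: RedoUndoOperation) -> None:
    # Redo may only be applied to the version of the block it was generated for.
    if ubh.ktubhseq != db.ktudbseq:
        raise SeqMismatch(f"[4193], [{ubh.ktubhseq}], [{db.ktudbseq}]")


# Hypothetical sequence numbers, for illustration only.
check_seq(UndoBlockHeader(ktubhseq=100), RedoUndoOperation(ktudbseq=100))  # passes silently
try:
    check_seq(UndoBlockHeader(ktubhseq=100), RedoUndoOperation(ktudbseq=108))
    message = ""
except SeqMismatch as exc:
    message = str(exc)
```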

ORA-600 4193 is handled the same way as described in How to resolve ORA-600 [4194] errors

Posted in Oracle Backup and Recovery | Tagged ORA-600 4193

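ORACLE 8.1.7 database: recovering from an ORA-600 4000 failure

Posted on August 22, 2014 by 惜分飞

I have run into ORA-600 4000 many times during database recovery, but this was the first time I hit it on Oracle 8i (8.1.7), so I am recording it here for reference.
Cause of the failure: a storage problem corrupted the current redo log, and the _allow_resetlogs_corruption parameter was then used to try to open the database.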

Media Recovery Log 
kcrrga: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup cont...
Wed Aug 20 23:01:43 2014
ALTER DATABASE RECOVER    CANCEL  
Media Recovery Cancelled
Completed: ALTER DATABASE RECOVER    CANCEL  
Wed Aug 20 23:01:50 2014
alter database open resetlogs

RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 262618871
Wed Aug 20 23:01:50 2014
Thread 1 opened at log sequence 1
  Current log# 3 seq# 1 mem# 0: F:\REDO01.LOG
Successful open of redo thread 1.
Wed Aug 20 23:01:50 2014
SMON: enabling cache recovery
Wed Aug 20 23:01:50 2014
Errors in file C:\oracle\admin\YCFD\udump\ORA00320.TRC:
ORA-00600: ??????????: [4000], [3], [], [], [], [], [], []

SMON: disabling cache recovery
Wed Aug 20 23:01:51 2014
ORA-704 signalled during: alter database open resetlogs

The database hit ORA-600 4000 and could not be opened. Analyzing the corresponding trace file:

Dump file C:\oracle\admin\YCFD\udump\ORA00320.TRC
Wed Aug 20 23:01:50 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: ycfd

Redo thread mounted by this instance: 1

Oracle process number: 8

Windows thread id: 320, image: ORACLE.EXE


*** SESSION ID:(7.1) 2014-08-20 23:01:50.838
*** 2014-08-20 23:01:50.838
ksedmp: internal or fatal error
ORA-00600: ??????????: [4000], [3], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
----- Call Stack Trace -----

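This shows that during startup the database needs to run select ctime, mtime, stime from obj$ where obj# = :1, but for some reason that statement raised ORA-600 4000, so the database could not start normally. Continuing with the trace: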

Block header dump:  0x0040003e
 Object id on Block? Y
 seg/obj: 0x12  csc: 0x00.fb5c5c5  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0003.012.0002ae94    uba: 0x00801f5b.5389.11  --U-    1  fsc 0x0000.0fb5c5c6


SQL> select checkpoint_change# from v$database;

263570122

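The picture here is fairly clear. The xid shows that a transaction in rollback segment 3 is the problem:
1. The block is file 1 block 62, it belongs to object_id 18 (obj$), and it carries one transaction.
2. The SCN of that transaction is 263,570,886, which is greater than the database SCN (263570122); this is what causes the failure.
3. When the database reads file 1 block 62 it finds a transaction whose SCN is greater than the database SCN, and so raises ORA-600 [4000].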
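
The numbers in the three points above can be re-derived from the dump with a short Python sketch. It assumes the usual layout of a 32-bit data block address (top 10 bits relative file number, low 22 bits block number) and treats the ITL value as a `wrap.base` hex pair; the helper names are hypothetical.

```python
def decode_dba(dba: int):
    """Split a 32-bit data block address into (relative file#, block#)."""
    return dba >> 22, dba & 0x3FFFFF


def scn_from_dump(text: str) -> int:
    """Convert a 'wrap.base' hex pair, as printed in the ITL, to one integer."""
    wrap, base = text.split(".")
    return (int(wrap, 16) << 32) | int(base, 16)


def undo_segment_from_xid(xid: str) -> int:
    """The first component of an xid is the undo (rollback) segment number."""
    return int(xid.split(".")[0], 16)


file_no, block_no = decode_dba(0x0040003E)                 # from the block header dump line
object_id = 0x12                                           # seg/obj value from the dump
undo_segment = undo_segment_from_xid("0x0003.012.0002ae94")
itl_scn = scn_from_dump("0x0000.0fb5c5c6")
database_scn = 263570122                                   # checkpoint_change# from v$database

# The condition behind the ORA-600 [4000] in this case:
# the ITL entry is ahead of the database SCN.
itl_ahead_of_database = itl_scn > database_scn
```

Both fixes listed next make this comparison come out false: either the ITL entry in block 62 is changed, or the database SCN is moved past 263570886.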
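There are a couple of ways to resolve this:
1. Modify block 62 and manually commit the transaction.
2. Modify the database SCN so that it is greater than the ITL SCN.
After the transaction problem in block 62 was resolved, the following error appeared: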

Wed Aug 20 23:03:55 2014
SMON: enabling cache recovery
Wed Aug 20 23:03:55 2014
Dictionary check beginning
Dictionary check complete
Wed Aug 20 23:03:55 2014
SMON: enabling tx recovery
Wed Aug 20 23:03:56 2014
Errors in file C:\oracle\admin\YCFD\bdump\ycfdSMON.TRC:
ORA-00600: internal error code, arguments: [4193], [21173], [21181], [], [], [], [], []

Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: F:\REDO03.LOG
SMON: terminating instance due to error 600
Instance terminated by SMON, pid = 2468

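The database then hit ORA-600 4193. This is a common error caused by a mismatch between redo records and undo records. It can be bypassed directly by using _corrupted_rollback_segments/_offline_rollback_segments to mask the affected rollback segments.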
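
To read the two arguments out of the ORA-600 [4193] line in the alert log above, a small Python sketch such as the following can help. The function name is hypothetical, and the interpretation of the arguments follows the ORA-600 [4193] argument description (undo seq# first, redo seq# second).

```python
import re


def parse_ora600(line: str):
    """Return (internal_code, [arguments]) for an ORA-00600 line, else None.

    Empty brackets are dropped, so only populated arguments are returned.
    """
    if "ORA-00600" not in line:
        return None
    fields = [f for f in re.findall(r"\[([^\]]*)\]", line) if f]
    if not fields:
        return None
    return fields[0], fields[1:]


alert_line = ("ORA-00600: internal error code, arguments: "
              "[4193], [21173], [21181], [], [], [], [], []")
code, args = parse_ora600(alert_line)
undo_seq, redo_seq = (int(a) for a in args)
seq_matches = undo_seq == redo_seq
```

Here the undo side reports seq# 21173 while the redo record expects 21181, so the redo cannot be applied to that version of the undo block.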

Wed Aug 20 23:08:10 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
SMON: about to recover undo segment 1
SMON: mark undo segment 1 as needs recovery
SMON: about to recover undo segment 2
SMON: mark undo segment 2 as needs recovery
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
SMON: about to recover undo segment 4
SMON: mark undo segment 4 as needs recovery
SMON: about to recover undo segment 5
SMON: mark undo segment 5 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 7
SMON: mark undo segment 7 as needs recovery
SMON: about to recover undo segment 1
SMON: mark undo segment 1 as needs recovery
SMON: about to recover undo segment 2
SMON: mark undo segment 2 as needs recovery
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
SMON: about to recover undo segment 4
SMON: mark undo segment 4 as needs recovery
SMON: about to recover undo segment 5
SMON: mark undo segment 5 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 7
SMON: mark undo segment 7 as needs recovery
Wed Aug 20 23:08:15 2014
Completed: alter database open

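Other related articles:
Resolving ORA-600 [4194]/[4193]
 Resolving an ORA-600 4000 case with bbed
 Resolving an ORA-00600 [4000] case with bbed
 A record of recovering a database from an ORA-600 4000 failure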