联系:手机/微信(+86 17813235971) QQ(107644445)
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
几个月以前的一个数据库故障,今天拿出来在win上重新分析,数据库启动报ORA-600 6711错
C:\Users\XFF>SQLPLUS / AS SYSDBA SQL*Plus: Release 12.1.0.2.0 Production on 星期日 7月 14 16:17:32 2024 Copyright (c) 1982, 2014, Oracle. All rights reserved. 已连接到空闲例程。 SQL> startup mount pfile='d:/pfile.txt' ORACLE 例程已经启动。 Total System Global Area 6442450944 bytes Fixed Size 6205768 bytes Variable Size 1493175992 bytes Database Buffers 4932501504 bytes Redo Buffers 10567680 bytes 数据库装载完毕。 SQL> alter database open; alter database open * 第 1 行出现错误: ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], [] 进程 ID: 44144 会话 ID: 67 序列号: 39084
根据经验该报错为:ORA-600 [6711] “Cluster Key Chain corruption”,也就是说很可能是cluster相关对象异常导致该问题.
对启动过程进行跟踪
PARSING IN CURSOR #17695456 len=189 dep=4 tim=233428646426 hv=186852205 ad='7ffda1eea168' sqlid='2tkw12w5k68vd' select user#,password,datats#,tempts#,type#,defrole,resource$, ptime, decode(defschclass,NULL,'DEFAULT_CONSUMER_GROUP',defschclass), spare1,spare4,ext_username,spare2 from user$ where name=:1 END OF STMT PARSE #17695456:c=0,e=168,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=0,tim=233428646426 BINDS #17695456: Bind#0 oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00 oacflg=18 fl2=0001 frm=01 csi=871 siz=32 off=0 kxsbbbfp=010b2df0 bln=32 avl=03 flg=05 value="SYS" EXEC #17695456:c=0,e=418,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=1457651150,tim=233428646901 WAIT #17695456: nam='db file sequential read' ela= 126 file#=1 block#=417 blocks=1 obj#=46 tim=233428647046 FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069 STAT #17695456 id=1 cnt=1 pid=0 pos=1 obj=22 op='TABLE ACCESS BY INDEX ROWID USER$ (cr=2 pr=1 pw=0 time=151 us cost=1 size=139 card=1)' STAT #17695456 id=2 cnt=1 pid=1 pos=1 obj=46 op='INDEX UNIQUE SCAN I_USER1 (cr=1 pr=1 pw=0 time=149 us)' CLOSE #17695456:c=0,e=2,dep=4,type=0,tim=233428647111 Incident 2601 created, dump file: C:\APP\XFF\diag\rdbms\ecp\ecp\incident\incdir_2601\ecp_ora_40516_i2601.trc ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], [] FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536 ===================== PARSE ERROR #387363008:len=50 dep=1 uid=0 oct=3 lid=0 tim=233431176680 err=600 select cost from resource_cost$ where resource#=:1 ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], [] ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
这个操作触发了递归查询
PARSING IN CURSOR #387319440 len=151 dep=5 lid=0 tim=233428641503 hv=2507062328 ad='7ffd9ffa23a8' sqlid='7u49y06aqxg1s' select /*+ rule */ bucket, endpoint, col#, epvalue, epvalue_raw, ep_repeat_count from histgrm$ where obj#=:1 and intcol#=:2 and row#=:3 order by bucket END OF STMT PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503 BINDS #387319440: Bind#0 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000001 frm=00 csi=00 siz=72 off=0 kxsbbbfp=00eb2be0 bln=22 avl=02 flg=05 value=22 Bind#1 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24 kxsbbbfp=00eb2bf8 bln=22 avl=02 flg=01 value=2 Bind#2 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=48 kxsbbbfp=00eb2c10 bln=22 avl=01 flg=01 value=0 EXEC #387319440:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641652 WAIT #387319440: nam='db file sequential read' ela= 124 file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792 FETCH #387319440:c=0,e=173,p=1,cr=3,cu=0,mis=0,r=20,dep=5,og=3,plh=3312420081,tim=233428641834 STAT #387319440 id=1 cnt=20 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=3 pr=1 pw=0 time=169 us cost=0 size=0 card=0)' STAT #387319440 id=2 cnt=20 pid=1 pos=1 obj=66 op='TABLE ACCESS CLUSTER HISTGRM$ (cr=3 pr=1 pw=0 time=148 us)' STAT #387319440 id=3 cnt=1 pid=2 pos=1 obj=65 op='INDEX UNIQUE SCAN I_OBJ#_INTCOL# (cr=2 pr=0 pw=0 time=2 us)' CLOSE #387319440:c=0,e=36,dep=5,type=3,tim=233428641886
查看对应的trace文件
[TOC00000] Jump to table of contents Dump continued from file: C:\APP\XFF\diag\rdbms\ecp\ecp\trace\ecp_ora_40516.trc [TOC00001] ORA-00600: 内部错误代码, 参数: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], [] [TOC00001-END] [TOC00002] ========= Dump for incident 2601 (ORA 600 [6711]) ======== [TOC00003] ----- Beginning of Customized Incident Dump(s) ----- kdsDumpState: cdb: 0 dspdb: 0 type: 3 *** ENTER: kds state dump *** row 0x0043b1a5.28 continuation at: 0x0043b1a5.0 file# 1 block# 242085 slot 0 (dscnt: 0) KDSTABN_GET: 1 ..... ntab: 2 curSlot: 0 ..... nrows: 40 Dumping kcb descriptor: kcbds 0x0000000017100DF0 : tsn 0, rdba 0x0043b1a5, afn 1, objd 64, cls 1, tidflg 0x0 0x0 0x0 dsflg 0x00100000, dsflg2 0x00004000, lobid 00000000:00000000, cnt 0, addr 0x00007FFD55D1C014 dx 0x0000000000000000 env [0x0000000017178C7C]: (scn: 0x0000.54290647 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000660) kcb_dw_scan_dumpctx: not in DW scan kdsgrp1_dump database not fully open *** EXIT: kds state dump *** ----- End of Customized Incident Dump(s) ----- [TOC00003-END]
通过对相关rdba进行dump分析,确认对象id为64和trace中报的信息匹配
DUL> rdba 0x0043b1a5 rdba : 0x0043b1a5=4436389 rfile# : 1 block# : 242085 DUL> dump datafile 1 block 242085 header Block Header: block type=0x06 (table/index/cluster segment data block) block format=0xa2 (oracle 10) block rdba=0x0043b1a5 (file#=1, block#=242085) scn=0x0000.438d4a86, seq=1, tail=0x4a860601 block checksum value=0xd591=54673, flag=6 Data Block Header Dump: Object id on Block? Y seg/obj: 0x40=64 csc: 0x00.438d4a80 itc: 2 flg: - typ: 1 (data) fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0002.01f.00014b92 0x00c01897.6e20.07 C--- 0 scn 0x0000.438c5fca 0x02 0x000a.01a.0011bb8e 0x00c0292c.0317.42 --U- 22 fsc 0x0000.438d4a86 Data Block Dump: ================ flag=0x0 -------- ntab=2 nrow=41 frre=23 fsbo=0x68 ffeo=0xb90 avsp=0x1ce1 tosp=0x1ce1
进一步分析该id为什么对象,使用dul unload obj$
确认对对象为cluster C_OBJ#_INTCOL#,对应的表为HISTGRM$(统计信息中存储直方图信息表),明白这一些,处理起来就比较容易了,open数据库过程中绕过该对象访问,然后对该表进行处理即可