分类目录归档:Oracle ASM

ORA-15130: diskgroup “ORADATA” is being dismounted

磁盘组mount之后,立马又dismount

Sat Dec 25 17:48:45 2021
SQL> alter diskgroup ORADATA mount 
NOTE: cache registered group ORADATA number=5 incarn=0xd4b7ac6a
NOTE: cache began mount (first) of group ORADATA number=5 incarn=0xd4b7ac6a
NOTE: Assigning number (5,24) to disk (/dev/mapper/data31)
NOTE: Assigning number (5,26) to disk (/dev/mapper/data33)
NOTE: Assigning number (5,21) to disk (/dev/mapper/data29)
NOTE: Assigning number (5,23) to disk (/dev/mapper/data30)
NOTE: Assigning number (5,25) to disk (/dev/mapper/data32)
NOTE: Assigning number (5,19) to disk (/dev/mapper/data27)
NOTE: Assigning number (5,20) to disk (/dev/mapper/data28)
NOTE: Assigning number (5,18) to disk (/dev/mapper/data26)
NOTE: Assigning number (5,14) to disk (/dev/mapper/data22)
NOTE: Assigning number (5,17) to disk (/dev/mapper/data25)
NOTE: Assigning number (5,16) to disk (/dev/mapper/data24)
NOTE: Assigning number (5,15) to disk (/dev/mapper/data23)
NOTE: Assigning number (5,13) to disk (/dev/mapper/data21)
NOTE: Assigning number (5,12) to disk (/dev/mapper/data20)
NOTE: Assigning number (5,10) to disk (/dev/mapper/data19)
NOTE: Assigning number (5,9) to disk (/dev/mapper/data18)
NOTE: Assigning number (5,8) to disk (/dev/mapper/data17)
NOTE: Assigning number (5,3) to disk (/dev/mapper/data12)
NOTE: Assigning number (5,22) to disk (/dev/mapper/data3)
NOTE: Assigning number (5,2) to disk (/dev/mapper/data11)
NOTE: Assigning number (5,7) to disk (/dev/mapper/data16)
NOTE: Assigning number (5,28) to disk (/dev/mapper/data5)
NOTE: Assigning number (5,32) to disk (/dev/mapper/data9)
NOTE: Assigning number (5,6) to disk (/dev/mapper/data15)
NOTE: Assigning number (5,5) to disk (/dev/mapper/data14)
NOTE: Assigning number (5,4) to disk (/dev/mapper/data13)
NOTE: Assigning number (5,1) to disk (/dev/mapper/data10)
NOTE: Assigning number (5,30) to disk (/dev/mapper/data7)
NOTE: Assigning number (5,29) to disk (/dev/mapper/data6)
NOTE: Assigning number (5,31) to disk (/dev/mapper/data8)
NOTE: Assigning number (5,11) to disk (/dev/mapper/data2)
NOTE: Assigning number (5,27) to disk (/dev/mapper/data4)
NOTE: Assigning number (5,0) to disk (/dev/mapper/data1)
Sat Dec 25 17:48:52 2021
NOTE: GMON heartbeating for grp 5
GMON querying group 5 at 153 for pid 32, osid 68608
NOTE: cache opening disk 0 of grp 5: ORADATA_0000 path:/dev/mapper/data1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 5: ORADATA_0001 path:/dev/mapper/data10
NOTE: cache opening disk 2 of grp 5: ORADATA_0002 path:/dev/mapper/data11
NOTE: cache opening disk 3 of grp 5: ORADATA_0003 path:/dev/mapper/data12
NOTE: cache opening disk 4 of grp 5: ORADATA_0004 path:/dev/mapper/data13
NOTE: cache opening disk 5 of grp 5: ORADATA_0005 path:/dev/mapper/data14
NOTE: cache opening disk 6 of grp 5: ORADATA_0006 path:/dev/mapper/data15
NOTE: cache opening disk 7 of grp 5: ORADATA_0007 path:/dev/mapper/data16
NOTE: cache opening disk 8 of grp 5: ORADATA_0008 path:/dev/mapper/data17
NOTE: cache opening disk 9 of grp 5: ORADATA_0009 path:/dev/mapper/data18
NOTE: cache opening disk 10 of grp 5: ORADATA_0010 path:/dev/mapper/data19
NOTE: cache opening disk 11 of grp 5: ORADATA_0011 path:/dev/mapper/data2
NOTE: cache opening disk 12 of grp 5: ORADATA_0012 path:/dev/mapper/data20
NOTE: cache opening disk 13 of grp 5: ORADATA_0013 path:/dev/mapper/data21
NOTE: cache opening disk 14 of grp 5: ORADATA_0014 path:/dev/mapper/data22
NOTE: cache opening disk 15 of grp 5: ORADATA_0015 path:/dev/mapper/data23
NOTE: cache opening disk 16 of grp 5: ORADATA_0016 path:/dev/mapper/data24
NOTE: cache opening disk 17 of grp 5: ORADATA_0017 path:/dev/mapper/data25
NOTE: cache opening disk 18 of grp 5: ORADATA_0018 path:/dev/mapper/data26
NOTE: cache opening disk 19 of grp 5: ORADATA_0019 path:/dev/mapper/data27
NOTE: cache opening disk 20 of grp 5: ORADATA_0020 path:/dev/mapper/data28
NOTE: cache opening disk 21 of grp 5: ORADATA_0021 path:/dev/mapper/data29
NOTE: cache opening disk 22 of grp 5: ORADATA_0022 path:/dev/mapper/data3
NOTE: cache opening disk 23 of grp 5: ORADATA_0023 path:/dev/mapper/data30
NOTE: cache opening disk 24 of grp 5: ORADATA_0024 path:/dev/mapper/data31
NOTE: cache opening disk 25 of grp 5: ORADATA_0025 path:/dev/mapper/data32
NOTE: cache opening disk 26 of grp 5: ORADATA_0026 path:/dev/mapper/data33
NOTE: cache opening disk 27 of grp 5: ORADATA_0027 path:/dev/mapper/data4
NOTE: cache opening disk 28 of grp 5: ORADATA_0028 path:/dev/mapper/data5
NOTE: cache opening disk 29 of grp 5: ORADATA_0029 path:/dev/mapper/data6
NOTE: cache opening disk 30 of grp 5: ORADATA_0030 path:/dev/mapper/data7
NOTE: cache opening disk 31 of grp 5: ORADATA_0031 path:/dev/mapper/data8
NOTE: cache opening disk 32 of grp 5: ORADATA_0032 path:/dev/mapper/data9
NOTE: cache mounting (first) external redundancy group 5/0xD4B7AC6A (ORADATA)
Sat Dec 25 17:48:52 2021
* allocate domain 5, invalid = TRUE 
kjbdomatt send to inst 2
Sat Dec 25 17:48:52 2021
NOTE: attached to recovery domain 5
NOTE: starting recovery of thread=1 ckpt=92.6417 group=5 (ORADATA)
NOTE: advancing ckpt for group 5 (ORADATA) thread=1 ckpt=92.6418
NOTE: cache recovered group 5 to fcn 0.9502919
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Sat Dec 25 17:48:52 2021
NOTE: LGWR attempting to mount thread 1 for diskgroup 5 (ORADATA)
NOTE: LGWR found thread 1 closed at ABA 92.6417
NOTE: LGWR mounted thread 1 for diskgroup 5 (ORADATA)
NOTE: LGWR opening thread 1 at fcn 0.9502919 ABA 93.6418
NOTE: cache mounting group 5/0xD4B7AC6A (ORADATA) succeeded
NOTE: cache ending mount (success) of group ORADATA number=5 incarn=0xd4b7ac6a
Sat Dec 25 17:48:53 2021
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 5
SUCCESS: diskgroup ORADATA was mounted
SUCCESS: alter diskgroup ORADATA mount
Sat Dec 25 17:48:53 2021
NOTE: diskgroup resource ora.ORADATA.dg is online
WARNING:cache read  a corrupt block: group=5(ORADATA)dsk=5 blk=2 disk=5(ORADATA_0005)incarn=2406 au=0 blk=2 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
NOTE: a corrupted block from group ORADATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc
WARNING:cache read(retry)a corrupt block:group=5(ORADATA)dsk=5 blk=2 disk=5(ORADATA_0005)incarn=2406 au=0 blk=2 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ERROR: cache failed to read group=5(ORADATA) dsk=5 blk=2 from disk(s): 5(ORADATA_0005)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
NOTE: cache initiating offline of disk 5 group ORADATA
NOTE: process _rbal_+asm1 (48956) initiating offline of disk 5.240607694 (ORADATA_0005) with mask 0x7e in group 5
NOTE: initiating PST update: grp = 5, dsk = 5/0xe5761ce, mask = 0x6a, op = clear
GMON updating disk modes for group 5 at 155 for pid 18, osid 48956
ERROR: Disk 5 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 5)
Sat Dec 25 17:48:55 2021
NOTE: cache dismounting (not clean) group 5/0xD4B7AC6A (ORADATA) 
WARNING: Offline for disk ORADATA_0005 in mode 0x7f failed.
Sat Dec 25 17:48:55 2021
NOTE: halting all I/Os to diskgroup 5 (ORADATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 22744, image: oracle@wxzldb1 (B000)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc  (incident=1289754):
ORA-15335: ASM metadata corruption detected in disk group 'ORADATA'
ORA-15130: diskgroup "ORADATA" is being dismounted
ORA-15066: offlining disk "ORADATA_0005" in group "ORADATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1289754/+ASM1_rbal_48956_i1289754.trc
NOTE: LGWR doing non-clean dismount of group 5 (ORADATA)
NOTE: LGWR sync ABA=93.6418 last written ABA 93.6418
kjbdomdet send to inst 2
detach from dom 5, sending detach message to inst 2
Sat Dec 25 17:48:56 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 4)
Sat Dec 25 17:48:56 2021
Sweep [inc][1289754]: completed
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 5 invalid = TRUE 
 41 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 5
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1289754/+ASM1_rbal_48956_i1289754.trc
WARNING: dirty detached from domain 5
NOTE: cache dismounted group 5/0xD4B7AC6A (ORADATA) 

问题比较明显是由于disk=5 au=0 blk=2有问题导致磁盘组mount之后立马异常.通过kfed分析对应block情况

C:\Users\XFF>kfed read h:\temp\asmdisk\data14.dd|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483653 ; 0x008: disk=5
kfbh.check:                   314993330 ; 0x00c: 0x12c66ab2
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        5 ; 0x024: 0x0005
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:            ORADATA_0005 ; 0x028: length=12
kfdhdb.grpname:                 ORADATA ; 0x048: length=7
kfdhdb.fgname:             ORADATA_0005 ; 0x068: length=12

C:\Users\XFF>kfed read h:\temp\asmdisk\data14.dd aun=0 blkn=2|more
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
0066D8200 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

通过kfed分析,该block确实异常,该block主要记录au的分配信息,如果asm 磁盘组的空间不变化,不执行rebalance,一般不会主动访问该block,不访问该block磁盘组也就不会dismount,按照这个解决思路,通过patch解决,让oradata磁盘组不再执行rebalance和分配/回收空间即可一直稳定的mount
20211227210314


数据库直接open成功,实现数据0丢失
20211227210411

发表在 Oracle ASM, Oracle备份恢复 | 标签为 , , , , , | 评论关闭

ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

asm磁盘组增加磁盘进行扩容之后报ORA-15335: ASM metadata corruption detected in disk group ‘DATA’和ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479],磁盘组dismount,然后mount之后立马dismount掉.

Tue Jun 29 09:19:09 2021
SQL> ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,1) to disk (/dev/raw/raw5)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk DATA_0001
NOTE: requesting all-instance disk validation for group=2
Tue Jun 29 09:19:11 2021
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: initiating PST update: grp = 2
Tue Jun 29 09:19:16 2021
GMON updating group 2 at 7 for pid 27, osid 25020
NOTE: PST update grp = 2 completed successfully 
NOTE: membership refresh pending for group 2/0xb0c845ce (DATA)
GMON querying group 2 at 8 for pid 18, osid 3852
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/raw/raw5
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 2 at 9 for pid 18, osid 3852
SUCCESS: refreshed membership for 2/0xb0c845ce (DATA)
Tue Jun 29 09:19:20 2021
SUCCESS: ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */
NOTE: starting rebalance of group 2/0xb0c845ce (DATA) at power 1
Starting background process ARB0
Tue Jun 29 09:19:21 2021
ARB0 started with pid=33, OS id=25176 
NOTE: assigning ARB0 to group 2/0xb0c845ce (DATA) with 1 parallel I/O
cellip.ora not found.
Tue Jun 29 09:19:24 2021
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Jun 29 09:19:46 2021
WARNING: cache read  a corrupt block: group=2(DATA) dsk=0 blk=7 disk=0 (DATA_0000) incarn=3915953476 au=0 blk=7 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ERROR: cache failed to read group=2(DATA) dsk=0 blk=7 from disk(s): 0(DATA_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _arb0_+asm1 (25176) initiating offline of disk 0.3915953476 (DATA_0000) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 0/0xe968b544, mask = 0x6a, op = clear
Tue Jun 29 09:19:46 2021
GMON updating disk modes for group 2 at 10 for pid 33, osid 25176
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Jun 29 09:19:46 2021
NOTE: cache dismounting (not clean) group 2/0xB0C845CE (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25395, image: oracle@frsrac1 (B000)
Tue Jun 29 09:19:46 2021
NOTE: halting all I/Os to diskgroup 2 (DATA)
Tue Jun 29 09:19:46 2021
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=11.10715 last written ABA 11.10715
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc  (incident=54665):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_54665/+ASM1_arb0_25176_i54665.trc
Tue Jun 29 09:19:46 2021
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
Tue Jun 29 09:19:46 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 24)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 2 invalid = TRUE 
 796 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Jun 29 09:19:46 2021
WARNING: dirty detached from domain 2
NOTE: cache dismounted group 2/0xB0C845CE (DATA) 
SQL> alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */ 
Tue Jun 29 09:19:47 2021
ERROR: ORA-15130 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Tue Jun 29 09:19:47 2021
NOTE: stopping process ARB0
Tue Jun 29 09:19:47 2021
Sweep [inc][54665]: completed
Tue Jun 29 09:19:47 2021
Sweep [inc2][54665]: completed
NOTE: cache deleting context for group DATA 2/0xb0c845ce
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3852.trc:
ORA-15130: diskgroup "DATA" is being dismounted
GMON dismounting group 2 at 11 for pid 27, osid 25395
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */

通过kfed分析报错block,确认错误

kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       7 ; 0x004: blk=7
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2183628676 ; 0x00c: 0x82278784 <<======该值错误,应该为:686982479
kfbh.fcn.base:                     3430 ; 0x010: 0x00000d66
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdatb.aunum:                      2240 ; 0x000: 0x000008c0
kfdatb.shrink:                      448 ; 0x004: 0x01c0
kfdatb.ub2pad:                        0 ; 0x006: 0x0000

通过修复该错误,并且禁止reblance操作[增加磁盘数据需要重新分布],mount磁盘组,然后open库,发现redo已经被覆盖(非归档),强制打开库报错

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [2691201882], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [0], [2691201881], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [2691201879], [0],
[2691227745], [12583040], [], [], [], [], [], []
Process ID: 25110
Session ID: 287 Serial number: 3

通过对scn进行处理,数据库顺利open

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 5044088832 bytes
Fixed Size		    2261928 bytes
Variable Size		 1442843736 bytes
Database Buffers	 3590324224 bytes
Redo Buffers		    8658944 bytes
Database mounted.
SQL> alter database open;

Database altered.
发表在 Oracle ASM | 标签为 , , | 评论关闭

ORA-600 kffmLoad_1 kffmVerify_4

有朋友asm运行一段时间asm实例会报错导致数据库实例异常

Wed Dec 23 08:31:55 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_6729.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [4365], [1], [], [], [], [], []
Wed Dec 23 08:31:55 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_6729.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [4365], [1], [], [], [], [], []

Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_29743.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [670], [1], [], [], [], [], []
Wed Dec 23 09:10:22 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_29743.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [670], [1], [], [], [], [], []
Wed Dec 23 09:10:22 2020

Wed Dec 23 10:18:33 2020
Errors in file /u01/app/oracle/admin/+ASM/udump/+asm1_ora_25890.trc:
ORA-00600: internal error code, arguments: [kffmVerify_4], [0], [0], [887], [1005986561], [1352], [1], [0]

对应的trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db
System name:	Linux
Node name:	shb01
Release:	2.6.18-348.el5
Version:	#1 SMP Wed Nov 28 21:22:00 EST 2012
Machine:	x86_64
Instance name: +ASM1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 29
Unix process pid: 26337, image: oracle@xff01 (TNS V1-V3)

*** ACTION NAME:() 2020-12-22 19:03:41.272
*** MODULE NAME:(sp_ocap@xff01 (TNS V1-V3)) 2020-12-22 19:03:41.272
*** SERVICE NAME:() 2020-12-22 19:03:41.272
*** SESSION ID:(143.1) 2020-12-22 19:03:41.272
*** 2020-12-22 19:03:41.272
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kffmVerify_4], [0], [0], [1657], [1005987045], [152], [1], [0]
Current SQL statement for this session:
DECLARE 
fileType varchar2(16); 
fileName varchar2(1024); 
blkSz number; 
fileSz number; 
hdl number; 
plksz number;
BEGIN
fileName := '+DATA4/xifenfei/onlinelog/group_6.1657.1005987045'; 
BEGIN
dbms_diskgroup.getfileattr(fileName,fileType,fileSz, blkSz); 
dbms_diskgroup.open(fileName,'r',fileType,blkSz,hdl,plkSz,fileSz); 
EXCEPTION
WHEN OTHERS then
  :rc := SQLCODE;
  :err_msg := SQLERRM;
  return;
END;
:handle := hdl; 
:bsz := blkSz; 
:bcnt := fileSz; 
:rc := 0;
END;
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x15ce59360        96  package body SYS.X$DBMS_DISKGROUP
0x15cd88568        12  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFBDFC3450 ? 7FFFBDFC34B0 ?
                                                   7FFFBDFC33F0 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFBDFC3450 ? 7FFFBDFC34B0 ?
                                                   7FFFBDFC33F0 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FFFBDFC3450 ? 7FFFBDFC34B0 ?
                                                   7FFFBDFC33F0 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFFBDFC3450 ? 7FFFBDFC34B0 ?
                                                   7FFFBDFC33F0 ? 000000000 ?
kgeasnmierr()+163    call     kgerinv()            0068996E0 ? 009AA2670 ?
                                                   7FFFBDFC34B0 ? 7FFFBDFC33F0 ?
                                                   000000000 ? 000000000 ?
kffmVerify()+379     call     kgeasnmierr()        0068996E0 ? 009AA2670 ?
                                                   7FFFBDFC34B0 ? 7FFFBDFC33F0 ?
                                                   000000000 ? 000000000 ?
kfioIdentify()+1276  call     kffmVerify()         000000000 ? 00000000D ?
                                                   000000001 ?
                                                   927B814400000004 ?
                                                   3BF624E500000679 ?
                                                   000000000 ?
ksfd_osmopn()+1138   call     kfioIdentify()       7FFFBDFC4820 ? 15DB873F4 ?
                                                   15DB87556 ? 000000200 ?
                                                   7FFF00000003 ? 15DB873C8 ?
ksfdopn()+1014       call     ksfd_osmopn()        7FFFBDFC4820 ? 00000002D ?
                                                   000000200 ? 000000003 ?
                                                   2B3800020000 ? 15F3031F0 ?
kfpkgDGOpenFile()+2  call     ksfdopn()            7FFFBDFC4820 ? 00000002D ?
301                                                000000200 ? 000000003 ?
                                                   000020000 ? 15F3031F0 ?
pevm_icd_call_commo  call     kfpkgDGOpenFile()    2B383F459FA8 ? 00000002D ?
n()+1003                                           2B383F439070 ? 000000003 ?
                                                   000020000 ? 15F3031F0 ?
pfrinstr_ICAL()+228  call     pevm_icd_call_commo  7FFFBDFC5700 ? 000000000 ?
                              n()                  000000001 ? 000000001 ?
                                                   000000007 ? 7FFF00000000 ?
pfrrun_no_tool()+65  call     pfrinstr_ICAL()      2B383F459FA8 ? 005DBD8AA ?
                                                   2B383F45A010 ? 000000001 ?
                                                   000000007 ? 7FFF00000000 ?
pfrrun()+906         call     pfrrun_no_tool()     2B383F459FA8 ? 005DBD8AA ?
                                                   2B383F45A010 ? 000000001 ?
                                                   000000007 ? 7FFF00000000 ?
plsql_run()+841      call     pfrrun()             2B383F459FA8 ? 000000000 ?
                                                   2B383F45A010 ? 7FFFBDFC5700 ?
                                                   000000007 ? 15CD77BD6 ?
peicnt()+298         call     plsql_run()          2B383F459FA8 ? 000000001 ?
                                                   000000000 ? 7FFFBDFC5700 ?
                                                   000000007 ? 900000000 ?
kkxexe()+503         call     peicnt()             7FFFBDFC5700 ? 2B383F459FA8 ?
                                                   2B383F438830 ? 7FFFBDFC5700 ?
                                                   2B383F4367D8 ? 900000000 ?
opiexe()+4691        call     kkxexe()             2B383F4561D8 ? 2B383F459FA8 ?
                                                   2B383F438830 ? 15C160BD8 ?
                                                   0040D677F ? 900000000 ?
kpoal8()+2273        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFFBDFC6950 ? 000000001 ?
                                                   0040D677F ? 900000000 ?
opiodr()+984         call     kpoal8()             00000005E ? 000000017 ?
                                                   7FFFBDFC9830 ? 000000001 ?
                                                   000000001 ? 900000000 ?
ttcpip()+1012        call     opiodr()             00000005E ? 000000017 ?
                                                   7FFFBDFC9830 ? 000000000 ?
                                                   0059C35D0 ? 900000000 ?
opitsk()+1322        call     ttcpip()             0068A13B0 ? 7FFFBDFC75A0 ?
                                                   7FFFBDFC9830 ? 000000000 ?
                                                   7FFFBDFC9328 ? 7FFFBDFC9998 ?
opiino()+1026        call     opitsk()             000000003 ? 000000000 ?
                                                   7FFFBDFC9830 ? 000000001 ?
                                                   000000000 ? 4E6111C00000001 ?
opiodr()+984         call     opiino()             00000003C ? 000000004 ?
                                                   7FFFBDFCA9F8 ? 000000001 ?
                                                   000000000 ? 4E6111C00000001 ?
opidrv()+547         call     opiodr()             00000003C ? 000000004 ?
                                                   7FFFBDFCA9F8 ? 000000000 ?
                                                   0059C3080 ? 4E6111C00000001 ?
sou2o()+114          call     opidrv()             00000003C ? 000000004 ?
                                                   7FFFBDFCA9F8 ? 000000000 ?
                                                   0059C3080 ? 4E6111C00000001 ?
opimai_real()+163    call     sou2o()              7FFFBDFCA9D0 ? 00000003C ?
                                                   000000004 ? 7FFFBDFCA9F8 ?
                                                   0059C3080 ? 4E6111C00000001 ?
main()+116           call     opimai_real()        000000002 ? 7FFFBDFCAA60 ?
                                                   000000004 ? 7FFFBDFCA9F8 ?
                                                   0059C3080 ? 4E6111C00000001 ?
__libc_start_main()  call     main()               000000002 ? 7FFFBDFCAA60 ?
+244                                               000000004 ? 7FFFBDFCA9F8 ?
                                                   0059C3080 ? 4E6111C00000001 ?
_start()+41          call     __libc_start_main()  0007230B8 ? 000000002 ?
                                                   7FFFBDFCABB8 ? 000000000 ?
                                                   0059C3080 ? 000000002 ?
 
--------------------- Binary Stack Dump ---------------------

结合mos信息ORA-600[KFFMVERIFY_4] OR ORA-600 [kffmLoad_1], [131635] REPORTED ON THE ASMINSTANCE (Doc ID 794103.1)的描述,由于多个进程/现场使用dbms_diskgroup访问不同磁盘组之时可能触发
BUG:6377738 – ASMB ORA-00600 [KFFMVERIFY_4]
BUG:8328467 – ASM CRASHED WITH ORA-600[KFFMVERIFY_4] OR [KFFMVERIFY_4] AND [KFFMLOAD_1]
从而导致asm实例crash,引起数据库异常.结合客户这边的情况,确认他们是使用了多个SharePlex程序同步数据,而且redo放在多个磁盘组中,从而出现该问题.临时解决方案为把所有的redo和归档放一个磁盘组,这样多个SharePlex进程调用dbms_diskgroup访问redo/arch不会触发该bug.

发表在 Oracle ASM | 标签为 , | 评论关闭