标签归档:ORA-15196

删除asmlib磁盘导致磁盘组故障恢复

有客户执行drop disk磁盘组操作之后,然后立刻从oracle asmlib层面执行了oracleasm deletedisk,并且在操作系统层面delete partition(删除磁盘分区),导致磁盘组直接dismount

Tue Nov 26 16:44:04 2024
SQL> alter diskgroup data drop disk DATA_0008 
NOTE: GroupBlock outside rolling migration privileged region
Tue Nov 26 08:44:05 2024
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
Tue Nov 26 08:44:14 2024
GMON querying group 2 at 48 for pid 18, osid 27385
SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
SUCCESS: alter diskgroup data drop disk DATA_0008
NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
Starting background process ARB0
Tue Nov 26 08:44:14 2024
ARB0 started with pid=38, OS id=56987 
NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
Tue Nov 26 08:44:17 2024
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Nov 26 08:44:57 2024
cellip.ora not found.
Tue Nov 26 17:08:46 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 17:10:30 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 09:34:38 2024
WARNING: cache read  a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
Tue Nov 26 09:34:38 2024
GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Nov 26 09:34:38 2024
NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA) 
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
Tue Nov 26 09:34:38 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc  (incident=413105):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
Tue Nov 26 09:34:39 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ERROR: ORA-15335 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: stopping process ARB0
Tue Nov 26 09:34:40 2024
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684

通过重新分区,并且kfed repair修复磁盘头操作之后,重新mount磁盘组报错

SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=2 incarn=0x73bec220
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
Tue Nov 26 11:48:22 2024
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 83 for pid 27, osid 15781
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
Tue Nov 26 11:48:22 2024
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Tue Nov 26 11:48:22 2024
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
NOTE: recovery initiating offline of disk 8 group 2 (*)
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Tue Nov 26 11:48:23 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]

由于客户执行了oracleasm deletedisk,根据经验确认该操作是对asm磁盘头的前1M数据进行了清空,而客户这个asm刚好是drop disk触发了rebalance操作的时候干掉磁盘的,基于这样的情况,直接通过修复磁盘1M数据并且mount磁盘组继续使用该磁盘组的概率不大.因此处理建议:
1. 直接恢复出来该磁盘组数据然后打开该库
2. 直接提取客户需要的核心表数据
有过客户有类似操作是asmlib重新创建了磁盘信息恢复:分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例
删除分区信息之后数据库恢复案例:删除分区 oracle asm disk 恢复

发表在 Oracle ASM, Oracle备份恢复 | 标签为 , , | 留下评论

kfed修复ORA-15196

有朋友的asm磁盘组因为以前遗留问题(在另外一套机器上的asm disk被加入到了一个新的asm磁盘组中,导致老的dg直接dismount,新加入asm disk的磁盘组一直在使用,未听建议进行重建),昨天突然意外dismount了

Mon Dec 18 08:38:13 2023
NOTE: No asm libraries found in the system
ASM Health Checker found 1 new failures
Mon Dec 18 08:38:35 2023
NOTE: client his2:his registered, osid 3998514, mbr 0x1
Thu Jan 04 21:44:55 2024
WARNING: cache read  a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496145 au=3 blk=87 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc
WARNING: cache read (retry) a corrupt block: 
 group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496145 au=3 blk=87 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ERROR: cache failed to read group=2(DATA) fn=1 blk=6743 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user4915366_+asm4 (4915366) initiating offline of disk 8.1428496145 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0x55251f11, mask = 0x6a, op = clear
Thu Jan 04 21:44:55 2024
GMON updating disk modes for group 2 at 9 for pid 24, osid 4915366
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Thu Jan 04 21:44:55 2024
NOTE: cache dismounting (not clean) group 2/0x7F35EE0E (DATA)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Thu Jan 04 21:44:55 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 3473846, image: oracle@zzzx1 (B000)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc  (incident=4023553):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM4/incident/incdir_4023553/+ASM4_ora_4915366_i4023553.trc
Thu Jan 04 21:44:57 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x7f35ee0e (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_rbal_2228716.trc:
ORA-15130: diskgroup "DATA" is being dismounted

尝试重新mount 磁盘组,片刻之后自动dismount

Thu Jan 04 23:10:35 2024
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATA was mounted
SUCCESS: alter diskgroup data mount
Thu Jan 04 23:10:42 2024
NOTE: diskgroup resource ora.DATA.dg is online
Thu Jan 04 23:10:47 2024
NOTE: client his2:his registered, osid 3998052, mbr 0x1
Thu Jan 04 23:11:00 2024
WARNING: cache read  a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496181 au=3 blk=87 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc
WARNING: cache read (retry) a corrupt block: 
  group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496181 au=3 blk=87 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ERROR: cache failed to read group=2(DATA) fn=1 blk=6743 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user4129826_+asm4 (4129826) initiating offline of disk 8.1428496181 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0x55251f35, mask = 0x6a, op = clear
Thu Jan 04 23:11:01 2024
GMON updating disk modes for group 2 at 21 for pid 35, osid 4129826
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Thu Jan 04 23:11:01 2024
NOTE: cache dismounting (not clean) group 2/0x1CB5EE3B (DATA)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 5112822, image: oracle@zzzx1 (B000)

从报错信息看是DATA_0008磁盘的au 3 blkn 87的block异常,应该是block 6743被写成了6999导致了该问题

kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                    6999 ; 0x004: blk=6999
kfbh.block.obj:                       1 ; 0x008: file=1
kfbh.check:                  3317183844 ; 0x00c: 0xc5b83564
kfbh.fcn.base:                165670551 ; 0x010: 0x09dfee97
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfffdb.node.incarn:          1145623147 ; 0x000: A=1 NUMM=0x22246935
kfffdb.node.frlist.number:   4294967295 ; 0x004: 0xffffffff
kfffdb.node.frlist.incarn:            0 ; 0x008: A=0 NUMM=0x0
kfffdb.hibytes:                       0 ; 0x00c: 0x00000000
kfffdb.lobytes:                83482624 ; 0x010: 0x04f9d800

这个处理比较简单吧

kfbh.block.blk:                    6999 ; 0x004: blk=6999
修改为
kfbh.block.blk:                    6743; 0x004: blk=6743

然后kefd merge并且尝试mount磁盘组
20240105202425


通过检查确认磁盘组不再dismount,但是由于后续元数据还有问题,导致asm无法创建新的文件,后续建议:在数据库在mount状态下,rman备份,重建该磁盘组

发表在 Oracle备份恢复 | 标签为 , , , | 评论关闭

ORA-15335 ORA-15130 ORA-15066 ORA-15196

客户反馈,数据库无法正常启动,通过分析asm的alert日志发现,data磁盘组mount成功之后,没有一会儿自动dismount掉

Mon Sep 26 16:40:14 2022
SQL> /* ASMCMD */ALTER DISKGROUP data MOUNT  
NOTE: cache registered group DATA number=2 incarn=0x9dfa705f
NOTE: cache began mount (first) of group DATA number=2 incarn=0x9dfa705f
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
Mon Sep 26 16:40:20 2022
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 68 for pid 25, osid 14650
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache mounting (first) external redundancy group 2/0x9DFA705F (DATA)
Mon Sep 26 16:40:20 2022
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Mon Sep 26 16:40:20 2022
NOTE: attached to recovery domain 2
NOTE: cache recovered group 2 to fcn 0.321845
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Mon Sep 26 16:40:20 2022
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 20.3546
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
NOTE: LGWR opening thread 1 at fcn 0.321845 ABA 21.3547
NOTE: cache mounting group 2/0x9DFA705F (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0x9dfa705f
Mon Sep 26 16:40:20 2022
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATA was mounted
SUCCESS: /* ASMCMD */ALTER DISKGROUP data MOUNT 
Mon Sep 26 16:40:22 2022
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
Mon Sep 26 16:40:47 2022
NOTE: client xff1:xff registered, osid 14742, mbr 0x0
Mon Sep 26 16:40:57 2022
WARNING: cache read  a corrupt block: group=2(DATA) dsk=1 blk=257 disk=1 (DATA_0001) 
incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) dsk=1 blk=257 
disk=1 (DATA_0001) incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=1 blk=257 from disk(s): 1(DATA_0001)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process _user14778_+asm1 (14778) initiating offline of 
disk 1.3916071178 (DATA_0001) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 1/0xe96a810a, mask = 0x6a, op = clear
Mon Sep 26 16:40:58 2022
GMON updating disk modes for group 2 at 70 for pid 28, osid 14778
ERROR: Disk 1 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Mon Sep 26 16:40:58 2022
NOTE: cache dismounting (not clean) group 2/0x9DFA705F (DATA) 
WARNING: Offline for disk DATA_0001 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 14782, image: oracle@oracle11grac1 (B000)
Mon Sep 26 16:40:58 2022
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc  (incident=144548):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
Incident details in: /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
Sweep [inc][144548]: completed
System State dumped to trace file /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
NOTE: AMDU dump of disk group DATA created at /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548
Mon Sep 26 16:41:00 2022
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=21.3550 last written ABA 21.3550
Mon Sep 26 16:41:00 2022
Sweep [inc2][144548]: completed
Mon Sep 26 16:41:00 2022
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x9dfa705f (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5162.trc:
ORA-15130: diskgroup "DATA" is being dismounted

这里看主要是由于asm 磁盘组需要做COD recovery导致无法正常稳定的mount,主要原因是遭遇到asm disk的逻辑坏块(存储物理上看是ok的,但是实际数据在asm中看是异常的)

数据库alert日志报错

Mon Sep 26 16:40:52 2022
Successful mount of redo thread 1, with mount id 1097279951
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: alter database mount
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 16:40:56 2022
ARC0 started with pid=40, OS id=14761 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 16:40:57 2022
ARC1 started with pid=41, OS id=14764 
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 1 (???? 1) ???
Mon Sep 26 16:40:57 2022
ARC2 started with pid=42, OS id=14766 
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 2 (???? 1) ???
Mon Sep 26 16:40:57 2022
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Mon Sep 26 16:40:57 2022
ARC3 started with pid=44, OS id=14770 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc2_14766.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc1_14764.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc  (incident=180281):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: 无法打开日志组 1 (用于线程 1) 的成员
ORA-00312: 联机日志 1 线程 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn: 2 未能打开文件 +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Unable to create archive log file '+DATA'
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-19816: WARNING: Files may exist in db_recovery_file_dest that are not known to database.
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
*************************************************************
WARNING: A file of type ARCHIVED LOG may exist in
db_recovery_file_dest that is not known to the database.
Use the RMAN command CATALOG RECOVERY AREA to re-catalog
any such files. If files cannot be cataloged, then manually
delete them using OS command. This is most likely the
result of a crash during file creation.
*************************************************************
ARCH: Error 19504 Creating archive log file to '+DATA'
NOTE: Deferred communication with ASM instance
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-15130: diskgroup "DATA" is being dismounted
NOTE: deferred map free for map id 23
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-16038: log 1 sequence# 14235 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-00312: online log 1 thread 1: '+ARCH/xff/onlinelog/group_1.279.1025610217'
Mon Sep 26 16:40:58 2022
Sweep [inc][180281]: completed
Sweep [inc2][180281]: completed
USER (ospid: 14732): terminating the instance due to error 16038
Mon Sep 26 16:40:59 2022
System state dump requested by (instance=1, osid=14732), summary=[abnormal instance termination].
Instance terminated by USER, pid = 14732

对于这类故障处理相对比较容易,通过patch asm,让data磁盘组稳定mount,然后open库,迁移数据,实现数据0丢失,完美恢复

发表在 Oracle ASM | 标签为 , , , , , | 评论关闭