Category archive: Oracle ASM
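Recovering an ASM disk that was partitioned and formatted as ext4
A customer did not realize that a disk on their Linux host was in use by ASM, so they partitioned it and created an ext4 file system on it. The commands they ran, taken from the shell history: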
600 fdisk -l
601 fdisk /dev/sdb
602 mkfs ext4 /dev/sdb1
603 fdisk -l
604 mkfs -t ext4 /dev/sdb1
605 cd /
606 mkdir u01
607 mount /dev/sdb1 /u01
608 df -h
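Checking the devices confirms that sdb was used directly, as a whole disk, by ASM under the name asmdisk1 (both device nodes have major/minor numbers 8, 16):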
[grid@racdb3 trace]$ ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
[grid@racdb3 trace]$ ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 Jul 27 2021 /dev/sda
brw-rw---- 1 root disk 8, 1 Jul 27 2021 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jul 27 2021 /dev/sda2
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 27 2021 /dev/sdc
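The listing above can be checked mechanically: two device nodes with the same major/minor pair are the same underlying disk. The following is a minimal sketch, assuming `ls -l` output in the format shown; the function name `group_by_devnum` and the regular expression are illustrative, not part of any Oracle or Linux tool.

```python
import re
from collections import defaultdict

# Matches block-device lines of `ls -l`, e.g.
# "brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1"
LINE = re.compile(r"^b\S+\s+\d+\s+\S+\s+\S+\s+(\d+),\s*(\d+)\s+.*\s(/dev/\S+)$")

def group_by_devnum(ls_output):
    """Group device paths by their (major, minor) pair."""
    groups = defaultdict(list)
    for line in ls_output.splitlines():
        m = LINE.match(line.strip())
        if m:
            key = (int(m.group(1)), int(m.group(2)))
            groups[key].append(m.group(3))
    return dict(groups)

sample = """\
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
"""

# Keep only device numbers that appear under more than one name.
aliases = {k: v for k, v in group_by_devnum(sample).items() if len(v) > 1}
print(aliases)
```

Running a check like this before partitioning a disk would have shown that /dev/sdb and /dev/asmdisk1 are one device.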
ASM alert log errors:
Fri Sep 30 11:31:41 2022
NOTE: SMON starting instance recovery for group DATA domain 1 (mounted)
NOTE: SMON skipping disk 0 - no header
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _smon_+asm3 (2989) initiating offline of disk 0.3915953109 (DATA_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968b3d5, mask = 0x6a, op = clear
Fri Sep 30 11:31:41 2022
GMON updating disk modes for group 1 at 4 for pid 17, osid 2989
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Fri Sep 30 11:31:41 2022
NOTE: cache dismounting (not clean) group 1/0x34F84324 (DATA)
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Fri Sep 30 11:31:41 2022
NOTE: halting all I/Os to diskgroup 1 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 1 due to diskgroup dismount
NOTE: SMON skipping lock domain (1) validation because diskgroup being dismounted
Database alert log errors:
Fri Sep 30 11:31:44 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_lmon_26356.trc:
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
Fri Sep 30 11:31:45 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
CKPT (ospid: 26388): terminating the instance due to error 221
Using kfed to inspect the damage to the ASM disk:
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F4FAAD45400 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
7F4FAAD455B0 00000000 00000000 45C222C8 01000000 [.........".E....]
7F4FAAD455C0 FE830001 003FFFFF E9D60000 0000FFFF [......?.........]
7F4FAAD455D0 00000000 00000000 00000000 00000000 [................]
Repeat 1 times
7F4FAAD455F0 00000000 00000000 00000000 AA550000 [..............U.]
7F4FAAD45600 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=2
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F64E77A0400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F64E77A1200 000081F9 000181F9 000281F9 000381F9 [................]
7F64E77A1210 000481F9 000C81F9 000D81F9 001881F9 [................]
7F64E77A1220 002881F9 003E81F9 007981F9 00AB81F9 [..(...>...y.....]
7F64E77A1230 013881F9 016C81F9 044581F9 04B081F9 [..8...l...E.....]
7F64E77A1240 061A81F9 0CD081F9 1E8481F9 00000000 [................]
7F64E77A1250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=3
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F8D101FF400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F8D10200200 000082F9 000182F9 000282F9 000382F9 [................]
7F8D10200210 000482F9 000C82F9 000D82F9 001882F9 [................]
7F8D10200220 002882F9 003E82F9 007982F9 00AB82F9 [..(...>...y.....]
7F8D10200230 013882F9 016C82F9 044582F9 04B082F9 [..8...l...E.....]
7F8D10200240 061A82F9 0CD082F9 1E8482F9 00000000 [................]
7F8D10200250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=4
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F142949C400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F142949D200 000083F9 000183F9 000283F9 000383F9 [................]
7F142949D210 000483F9 000C83F9 000D83F9 001883F9 [................]
7F142949D220 002883F9 003E83F9 007983F9 00AB83F9 [..(...>...y.....]
7F142949D230 013883F9 016C83F9 044583F9 04B083F9 [..8...l...E.....]
7F142949D240 061A83F9 0CD083F9 1E8483F9 00000000 [................]
7F142949D250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=5
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F0615CF6400 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
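The first few AUs of the disk are badly damaged, and the corresponding backup blocks are corrupted as well. For this kind of situation, refer directly to these earlier articles:
Recovering an ASM disk damaged by dd
Recovering from a completely destroyed ASM disk header
Recovering an ASM disk that was partially wiped
The data files were extracted through low-level recovery and checked; they validated as normal.
The redo logs, control files, and other files were then recovered from the AU allocation list: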
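The first kfed dump shows why the header is unusable: block 0 now holds a partition table rather than an ASM header. The sketch below, a minimal illustration and not the tooling used in this recovery, rebuilds the non-zero words of that dump and decodes them as an MBR. It assumes kfed prints 32-bit words in little-endian host order (the ASCII column of the dump is consistent with that); `parse_mbr` is a hypothetical helper name.

```python
import struct

def parse_mbr(sector):
    """Decode the first partition entry of a 512-byte sector.

    Returns None when the 0x55 0xAA boot signature is missing.
    """
    if len(sector) < 512 or bytes(sector[510:512]) != b"\x55\xaa":
        return None
    # Entry 1 is 16 bytes at offset 0x1BE: status, CHS start (3 bytes),
    # partition type, CHS end (3 bytes), start LBA, sector count.
    status, ptype, lba_start, sectors = struct.unpack_from("<B3xB3xII", sector, 0x1BE)
    return {"status": status, "type": ptype, "lba_start": lba_start, "sectors": sectors}

# Non-zero 32-bit words from the first kfed dump, keyed by offset in block 0.
words = {
    0x1B8: 0x45C222C8, 0x1BC: 0x01000000,
    0x1C0: 0xFE830001, 0x1C4: 0x003FFFFF,
    0x1C8: 0xE9D60000, 0x1CC: 0x0000FFFF,
    0x1FC: 0xAA550000,
}
sector = bytearray(512)
for offset, word in words.items():
    struct.pack_into("<I", sector, offset, word)

entry = parse_mbr(sector)
print(entry)
```

Under these assumptions the entry decodes to partition type 0x83 (Linux) starting at sector 63, which matches the fdisk and mkfs commands in the customer's history.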
H:\TEMP\asm-ext4\other>dir
Volume in drive H is SSD-SX
Volume Serial Number is 84EB-F434
Directory of H:\TEMP\asm-ext4\other
2022-09-30 21:52 25,165,824 256.dd
2022-09-30 21:52 25,165,824 257.dd
2022-09-30 23:52 52,429,312 258.dd.1
2022-09-30 23:54 52,429,312 259.dd.1
2022-09-30 23:55 52,429,312 260.dd.1
2022-09-30 23:55 52,429,312 261.dd.1
2022-09-30 23:56 52,429,312 270.dd.1
2022-09-30 23:57 52,429,312 271.dd.1
2022-09-30 23:57 52,429,312 272.dd.1
2022-09-30 23:57 52,429,312 273.dd.1
2022-09-30 23:58 52,429,312 274.dd.1
2022-10-01 00:01 52,429,312 275.dd.1
2022-10-01 00:00 52,429,312 276.dd.1
2022-10-01 00:00 52,429,312 277.dd.1
2022-10-01 00:00 52,429,312 278.dd.1
2022-09-30 23:59 52,429,312 279.dd.1
2022-09-30 23:59 52,429,312 280.dd.1
2022-09-30 23:59 52,429,312 281.dd.1
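Rebuilding a file from an AU allocation list amounts to copying the listed allocation units out of the disk image in order. The sketch below illustrates only that idea and is not the procedure or tool used here. It assumes a single disk, external redundancy, coarse striping, and a 1 MiB AU size; `extract_file` and its parameters are hypothetical names, and the demonstration uses 16-byte "AUs" so that it runs anywhere.

```python
import os
import tempfile

AU_SIZE = 1024 * 1024  # assumption: 1 MiB allocation units

def extract_file(image_path, au_list, out_path, file_size, au_size=AU_SIZE):
    """Copy the listed AUs, in order, from a disk image into out_path.

    Output is truncated to file_size bytes. Returns the number of bytes copied.
    """
    remaining = file_size
    with open(image_path, "rb") as src, open(out_path, "wb") as dst:
        for au in au_list:
            if remaining <= 0:
                break
            src.seek(au * au_size)
            chunk = src.read(min(au_size, remaining))
            dst.write(chunk)
            remaining -= len(chunk)
    return file_size - remaining

# Demonstration on a synthetic image of four 16-byte "AUs".
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "disk.img")
    out = os.path.join(tmp, "258.dd")
    with open(image, "wb") as f:
        f.write(b"A" * 16 + b"B" * 16 + b"C" * 16 + b"D" * 16)
    copied = extract_file(image, [2, 0, 3], out, file_size=40, au_size=16)
    with open(out, "rb") as f:
        recovered = f.read()

print(copied, recovered)
```

As a sanity check on the listing above, 52,429,312 bytes is 50 MiB plus 512 bytes, which is consistent with a 50M online redo log plus one 512-byte header block, and 25,165,824 bytes is exactly 24 MiB.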
Attempting to recover the database on a separate, new machine:
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat Oct 1 10:18:58 2022
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
ORA-00227: corrupt block detected in control file: (block 8, # blocks 1)
ORA-00202: control file: '/oradata/256.dd'
The control file is corrupt, so it is re-created:
SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS NOARCHIVELOG
  2  MAXLOGFILES 50
  3  MAXLOGMEMBERS 5
  4  MAXDATAFILES 100
  5  MAXINSTANCES 8
  6  MAXLOGHISTORY 226
  7  LOGFILE
  8  group 7 '/oradata/270.dd.1' size 50M,
  9  group 8 '/oradata/272.dd.1' size 50M,
 10  group 5 '/oradata/274.dd.1' size 50M,
 11  group 6 '/oradata/276.dd.1' size 50M,
 12  group 3 '/oradata/278.dd.1' size 50M,
 13  group 4 '/oradata/280.dd.1' size 50M,
 14  group 1 '/oradata/258.dd.1' size 50M,
 15  group 2 '/oradata/260.dd.1' size 50M
 16  DATAFILE
 17  '/oradata/1',
 18  '/oradata/2',
 19  '/oradata/3',
 20  '/oradata/4',
 21  '/oradata/5',
 22  '/oradata/6',
 23  '/oradata/7',
 24  '/oradata/8',
 25  '/oradata/9',
 26  '/oradata/10',
 27  '/oradata/11'
 28  CHARACTER SET ZHS16GBK
 29  ;
Control file created.
Opening the database fails with errors such as ORA-600 kqfidps_update_stats:2 and ORA-600 4194:
SQL> recover database;
Media recovery complete.
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kqfidps_update_stats:2], [0x7FFCCBEB3EC0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [19319], [l.ok
After these errors are resolved, the database opens successfully:
SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
Database mounted.
SQL> alter database open;
Database altered.
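While exporting the database, a few tables hit the two errors shown below, ORA-08103 and ORA-01555. They occur because some blocks were overwritten when the file system was created, and those blocks were zero-filled during the low-level recovery. The affected tables just need to be handled individually.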
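. . exporting table ALBUM
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
. . exporting table M_PUSH_CONTENT
EXP-00056: ORACLE error 1555 encountered
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old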
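To find which tables need individual handling, the export log can be scanned for ORA- codes that follow a table line. This is a minimal sketch, assuming the classic exp log layout shown above, where each table line starts with ". ." and the table name is the first all-uppercase token. `tables_with_errors` is a hypothetical helper, and the matching does not depend on the language of the messages.

```python
import re

ORA_CODE = re.compile(r"\bORA-\d{5}\b")
IDENT = re.compile(r"[A-Z][A-Z0-9_$#]*")

def tables_with_errors(log_text):
    """Map each table name to the ORA- codes reported right after it."""
    failed = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith(". ."):
            # The first all-uppercase token is taken as the table name.
            current = next((t for t in line.split() if IDENT.fullmatch(t)), None)
            continue
        if current is not None:
            for code in ORA_CODE.findall(line):
                failed.setdefault(current, []).append(code)
    return failed

sample_log = """\
. . exporting table ALBUM
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
. . exporting table M_PUSH_CONTENT
EXP-00056: ORACLE error 1555 encountered
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old
. . exporting table HEALTHY_TABLE 10 rows exported
"""

failed_tables = tables_with_errors(sample_log)
print(failed_tables)
```

HEALTHY_TABLE is an invented name added only to show that tables without errors are left out of the result.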
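These steps recovered the customer's data and kept the loss to a minimum. A reminder once more: after any accidental operation on an ASM disk, preserve the scene immediately. Avoiding all further writes gives the best chance of recovering the data.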
ORA-15335 ORA-15130 ORA-15066 ORA-15196
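The customer reported that the database would not start. The ASM alert log shows that the DATA disk group mounts successfully and then dismounts on its own shortly afterwards: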
Mon Sep 26 16:40:14 2022
SQL> /* ASMCMD */ALTER DISKGROUP data MOUNT
NOTE: cache registered group DATA number=2 incarn=0x9dfa705f
NOTE: cache began mount (first) of group DATA number=2 incarn=0x9dfa705f
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
Mon Sep 26 16:40:20 2022
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 68 for pid 25, osid 14650
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache mounting (first) external redundancy group 2/0x9DFA705F (DATA)
Mon Sep 26 16:40:20 2022
* allocate domain 2, invalid = TRUE
kjbdomatt send to inst 2
Mon Sep 26 16:40:20 2022
NOTE: attached to recovery domain 2
NOTE: cache recovered group 2 to fcn 0.321845
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Mon Sep 26 16:40:20 2022
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 20.3546
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
NOTE: LGWR opening thread 1 at fcn 0.321845 ABA 21.3547
NOTE: cache mounting group 2/0x9DFA705F (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0x9dfa705f
Mon Sep 26 16:40:20 2022
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
SUCCESS: diskgroup DATA was mounted
SUCCESS: /* ASMCMD */ALTER DISKGROUP data MOUNT
Mon Sep 26 16:40:22 2022
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
Mon Sep 26 16:40:47 2022
NOTE: client xff1:xff registered, osid 14742, mbr 0x0
Mon Sep 26 16:40:57 2022
WARNING: cache read a corrupt block: group=2(DATA) dsk=1 blk=257 disk=1 (DATA_0001) incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) dsk=1 blk=257 disk=1 (DATA_0001) incarn=3916071178 au=113792 blk=1 count=1
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=1 blk=257 from disk(s): 1(DATA_0001)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process _user14778_+asm1 (14778) initiating offline of disk 1.3916071178 (DATA_0001) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 1/0xe96a810a, mask = 0x6a, op = clear
Mon Sep 26 16:40:58 2022
GMON updating disk modes for group 2 at 70 for pid 28, osid 14778
ERROR: Disk 1 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Mon Sep 26 16:40:58 2022
NOTE: cache dismounting (not clean) group 2/0x9DFA705F (DATA)
WARNING: Offline for disk DATA_0001 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 14782, image: oracle@oracle11grac1 (B000)
Mon Sep 26 16:40:58 2022
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14778.trc (incident=144548):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
Incident details in: /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
Sweep [inc][144548]: completed
System State dumped to trace file /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548/+ASM1_ora_14778_i144548.trc
Mon Sep 26 16:40:58 2022
NOTE: AMDU dump of disk group DATA created at /opt/grid/diag/asm/+asm/+ASM1/incident/incdir_144548
Mon Sep 26 16:41:00 2022
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=21.3550 last written ABA 21.3550
Mon Sep 26 16:41:00 2022
Sweep [inc2][144548]: completed
Mon Sep 26 16:41:00 2022
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x9dfa705f (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /opt/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5162.trc:
ORA-15130: diskgroup "DATA" is being dismounted
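The log shows that the disk group cannot stay mounted because ASM has to perform COD recovery and that recovery fails. The root cause is a logically corrupt block on an ASM disk: the storage is physically fine, but the data in that block is invalid as far as ASM is concerned.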
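The "cache read a corrupt block" warning in the ASM alert log already names the disk, AU, and block, so the byte offset of the bad block on the device can be computed for inspection with kfed or dd. The sketch below is a hedged illustration: the 1 MiB AU size and the 4096-byte metadata block size are assumptions that must be checked against the real disk group, and `corrupt_block_offset` is a hypothetical helper.

```python
import re

WARN = re.compile(
    r"cache read a corrupt block: group=(\d+)\((\w+)\) dsk=(\d+) blk=(\d+) "
    r".*? au=(\d+) blk=(\d+)"
)

def corrupt_block_offset(line, au_size=1024 * 1024, block_size=4096):
    """Extract disk/AU/block from the warning and compute the byte offset."""
    m = WARN.search(line)
    if m is None:
        return None
    au, blk = int(m.group(5)), int(m.group(6))
    offset = au * au_size + blk * block_size
    return {
        "group": m.group(2),
        "disk": int(m.group(3)),
        "au": au,
        "blk": blk,
        "offset": offset,
        # Block index usable as `skip=` with `dd bs=4096 count=1`.
        "dd_skip": offset // block_size,
    }

line = ("WARNING: cache read a corrupt block: group=2(DATA) dsk=1 blk=257 "
        "disk=1 (DATA_0001) incarn=3916071178 au=113792 blk=1 count=1")
info = corrupt_block_offset(line)
print(info)

# The third ORA-15196 argument, 2147483649, equals 0x80000001; its low
# 31 bits give 1, the same number as dsk=1 in the warning line.
obj = 2147483649
print(hex(obj), obj & 0x7FFFFFFF)
```

The mismatch `[0 != 1]` reported for `endian_kfbh` is the same field kfed prints as `kfbh.endian`.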
Database alert log errors:
Mon Sep 26 16:40:52 2022
Successful mount of redo thread 1, with mount id 1097279951
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: alter database mount
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 16:40:56 2022
ARC0 started with pid=40, OS id=14761
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 16:40:57 2022
ARC1 started with pid=41, OS id=14764
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 1 (???? 1) ???
Mon Sep 26 16:40:57 2022
ARC2 started with pid=42, OS id=14766
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_lgwr_14479.trc:
ORA-00313: ??????? 2 (???? 1) ???
Mon Sep 26 16:40:57 2022
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Mon Sep 26 16:40:57 2022
ARC3 started with pid=44, OS id=14770
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc2_14766.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc1_14764.trc:
ORA-00313: open failed for members of log group 1 of thread 1
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc (incident=180281):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn:2 Failed to open file +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn:2 Failed to open file +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc0_14761.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn:2 Failed to open file +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_arc3_14770.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-17503: ksfdopn:2 Failed to open file +DATA/xff/onlinelog/group_1.271.1025610215
ORA-15130: diskgroup "DATA" is being dismounted
Unable to create archive log file '+DATA'
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-19816: WARNING: Files may exist in db_recovery_file_dest that are not known to database.
ORA-17502: ksfdcre:4 Failed to create file +DATA
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483649] [257] [0 != 1]
*************************************************************
WARNING: A file of type ARCHIVED LOG may exist in db_recovery_file_dest that is not known to the database. Use the RMAN command CATALOG RECOVERY AREA to re-catalog any such files. If files cannot be cataloged, then manually delete them using OS command. This is most likely the result of a crash during file creation.
*************************************************************
ARCH: Error 19504 Creating archive log file to '+DATA'
NOTE: Deferred communication with ASM instance
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-15130: diskgroup "DATA" is being dismounted
NOTE: deferred map free for map id 23
Errors in file /opt/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_14732.trc:
ORA-16038: log 1 sequence# 14235 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 1 thread 1: '+DATA/xff/onlinelog/group_1.271.1025610215'
ORA-00312: online log 1 thread 1: '+ARCH/xff/onlinelog/group_1.279.1025610217'
Mon Sep 26 16:40:58 2022
Sweep [inc][180281]: completed
Sweep [inc2][180281]: completed
USER (ospid: 14732): terminating the instance due to error 16038
Mon Sep 26 16:40:59 2022
System state dump requested by (instance=1, osid=14732), summary=[abnormal instance termination].
Instance terminated by USER, pid = 14732
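This kind of failure is relatively easy to handle: patch ASM so that the DATA disk group mounts stably, then open the database and migrate the data. No data was lost, so this was a complete recovery.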
Posted in Oracle ASM
Tags: invalid ASM block header, kfc.c:26368, ORA-15066, ORA-15130, ORA-15196, ORA-15335
Recovering from ORA-15063: ASM discovered an insufficient number of disks for diskgroup
The customer reported that three disk groups could not be mounted, with errors like ORA-15032, ORA-15017, and ORA-15063:
SQL> ALTER DISKGROUP ASM_DATA MOUNT /* asm agent *//* {0:0:2} */
NOTE: cache registered group ASM_DATA number=1 incarn=0xffa85ccd
NOTE: cache began mount (first) of group ASM_DATA number=1 incarn=0xffa85ccd
ERROR: no read quorum in group: required 2, found 0 disks
NOTE: cache dismounting (clean) group 1/0xFFA85CCD (ASM_DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 5709, image: oracle@XFF (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xFFA85CCD (ASM_DATA)
NOTE: cache ending mount (fail) of group ASM_DATA number=1 incarn=0xffa85ccd
NOTE: cache deleting context for group ASM_DATA 1/0xffa85ccd
Tue Jun 21 12:24:38 2022
NOTE: No asm libraries found in the system
ASM Health Checker found 1 new failures
GMON dismounting group 1 at 16 for pid 19, osid 5709
ERROR: diskgroup ASM_DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "ASM_DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "ASM_DATA"
ERROR: ALTER DISKGROUP ASM_DATA MOUNT /* asm agent *//* {0:0:2} */
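The initial assessment was a problem with the ASM disks (for example, the disks cannot be discovered, are missing, or have damaged headers). The customer's udev configuration for the ASM disks: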
KERNEL=="sdd1", NAME="asm_grid", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sde1", NAME="asm_system", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sdf1", NAME="asm_data", OWNER="grid", GROUP="asmadmin", MODE="0660"
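The udev configuration shows that the customer had partitioned three disks and then used udev to map aliases for ASM to use. One of these disks was then examined.
Inspecting it in WinHex confirms that the header of this partition is abnormal: it contains what a freshly partitioned disk would contain, not an ASM disk header. This matches what kfed reports (the disk header is certainly damaged; the state of the rest of the disk is not yet known):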
H:\TEMP\dd>kfed read sdf_sdf1.dd
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
0064D8400 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
0064D85B0 00000000 00000000 00000000 02000000 [................]
0064D85C0 FE8E0001 003FFFFF DFFC0000 0000257F [......?......%..]
0064D85D0 00000000 00000000 00000000 00000000 [................]
Repeat 1 times
0064D85F0 00000000 00000000 00000000 AA550000 [..............U.]
0064D8600 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
Checking blocks at other positions shows that they look basically fine (a stroke of luck):
H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=2|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=3|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
H:\TEMP\dd>kfed read sdf_sdf1.dd blkn=1 aun=2|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
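The spot checks above read one block at a time. The same survey can be sketched over a range of blocks by reading the type byte that kfed reports at offset 0x002 of each block header. This is a minimal illustration and not a replacement for kfed: it assumes 4096-byte metadata blocks, `block_types` is a hypothetical helper, and the demonstration runs on a synthetic image. Type codes 0 and 3 appear in the kfed output in this article; codes 1 and 2 are added from general ASM knowledge.

```python
import os
import tempfile

# Type codes 0 and 3 are shown by kfed in this article; 1 and 2 are added
# here from general ASM knowledge and are not taken from the article.
KFBTYP = {
    0: "KFBTYP_INVALID",
    1: "KFBTYP_DISKHEAD",
    2: "KFBTYP_FREESPC",
    3: "KFBTYP_ALLOCTBL",
}

def block_types(image_path, first_block=0, count=8, block_size=4096):
    """Return (block number, type code, type name) for a range of blocks."""
    result = []
    with open(image_path, "rb") as f:
        for n in range(first_block, first_block + count):
            f.seek(n * block_size)
            header = f.read(4)
            if len(header) < 4:
                break  # reached the end of the image
            code = header[2]  # kfbh.type is at offset 0x002
            result.append((n, code, KFBTYP.get(code, "unknown")))
    return result

# Synthetic image: block 0 zeroed (like the damaged header), blocks 1-2
# carrying type byte 3.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dd")
    zero_block = bytes(4096)
    at_block = bytes([0, 0, 3, 0]) + bytes(4092)
    with open(path, "wb") as f:
        f.write(zero_block + at_block + at_block)
    survey = block_types(path, count=5)

for row in survey:
    print(row)
```

A type byte alone does not prove a block is intact, so anything flagged as valid here would still need a full kfed read.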
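Searching the partial disk images that had been backed up turned up a section containing the ORCLDISK information (the ASM disk header).
That section was used to repair the damaged disk header, which was then written back to the production environment with dd. The disk group mounted and the database opened successfully.
In the end this database was lucky: there was little damage, so this counts as a complete recovery. A logical export and an RMAN backup were then run, and both completed normally. For future safety, migrating the database is recommended.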
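Locating a surviving copy of the header comes down to scanning the image for the ORCLDISK marker. The sketch below is a hedged illustration, not the tool used in this case: `find_orcldisk` is a hypothetical helper, and the statement that the marker sits 32 bytes into a header block (right after the 32-byte kfbh structure) is an assumption to verify with kfed before anything is written back.

```python
import os
import tempfile

MARKER = b"ORCLDISK"

def find_orcldisk(image_path, chunk_size=1 << 20):
    """Return the byte offsets of every ORCLDISK marker in the image."""
    offsets = []
    overlap = len(MARKER) - 1
    tail = b""
    pos = 0  # file offset of the start of the next chunk
    with open(image_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Prepend the tail of the previous chunk so that a marker
            # straddling a chunk boundary is still found exactly once.
            buf = tail + chunk
            base = pos - len(tail)
            i = buf.find(MARKER)
            while i != -1:
                offsets.append(base + i)
                i = buf.find(MARKER, i + 1)
            tail = buf[-overlap:]
            pos += len(chunk)
    return offsets

# Synthetic demonstration: one marker straddles a 16-byte chunk boundary.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "backup.dd")
    data = bytearray(100)
    data[12:20] = MARKER
    data[64:72] = MARKER
    with open(path, "wb") as f:
        f.write(data)
    hits = find_orcldisk(path, chunk_size=16)

# Assumed layout: the candidate header block starts 32 bytes before a hit.
candidates = [h - 32 for h in hits if h >= 32]
print(hits, candidates)
```

Each candidate block should be read with kfed and compared against the expected disk name and disk group before it is used for repair.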