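Tag archive: KFED-00322
Recovering an ASM disk that was partitioned and formatted as ext4
A customer did not realize that a Linux disk was in use by ASM, partitioned it and created an ext4 file system on it. The commands the customer ran were taken from the shell history: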
  600  fdisk -l
  601  fdisk /dev/sdb
  602  mkfs ext4 /dev/sdb1
  603  fdisk -l
  604  mkfs -t ext4 /dev/sdb1
  605  cd /
  606  mkdir u01
  607  mount /dev/sdb1 /u01
  608  df -h
Check the disks. The device numbers confirm that sdb itself is the disk used by ASM (asmdisk1): both device nodes are 8, 16.
[grid@racdb3 trace]$ ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
[grid@racdb3 trace]$ ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 Jul 27 2021 /dev/sda
brw-rw---- 1 root disk 8, 1 Jul 27 2021 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jul 27 2021 /dev/sda2
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 27 2021 /dev/sdc
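The comparison of major/minor device numbers above can be automated. The following is a minimal Python sketch, not part of the original case: it parses `ls -l` output for block devices and reports device numbers that are reachable under more than one name. The function names and the regular expression are illustrative assumptions, and the sample lines are copied from the listing above.

```python
import re
from collections import defaultdict

# Matches `ls -l` lines for block devices, e.g.
# "brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1"
LS_BLOCK_DEV = re.compile(
    r"^b\S+\s+\d+\s+\S+\s+\S+\s+(\d+),\s*(\d+)\s+.*\s(/dev/\S+)$"
)

def group_by_device_number(ls_output):
    """Map (major, minor) -> list of device paths that share that number."""
    groups = defaultdict(list)
    for line in ls_output.splitlines():
        m = LS_BLOCK_DEV.match(line.strip())
        if m:
            key = (int(m.group(1)), int(m.group(2)))
            groups[key].append(m.group(3))
    return dict(groups)

def aliases(ls_output):
    """Keep only the device numbers that appear under more than one name."""
    return {k: v for k, v in group_by_device_number(ls_output).items() if len(v) > 1}

SAMPLE = """\
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 27 2021 /dev/sdc
"""

if __name__ == "__main__":
    # Any name that shares a device number with an ASM device must not be
    # partitioned or formatted.
    print(aliases(SAMPLE))
```

Running such a check before any fdisk or mkfs on a database host would have shown that /dev/sdb and /dev/asmdisk1 are the same device.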
Errors in the ASM alert log
Fri Sep 30 11:31:41 2022
NOTE: SMON starting instance recovery for group DATA domain 1 (mounted)
NOTE: SMON skipping disk 0 - no header
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _smon_+asm3 (2989) initiating offline of disk 0.3915953109 (DATA_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968b3d5, mask = 0x6a, op = clear
Fri Sep 30 11:31:41 2022
GMON updating disk modes for group 1 at 4 for pid 17, osid 2989
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Fri Sep 30 11:31:41 2022
NOTE: cache dismounting (not clean) group 1/0x34F84324 (DATA)
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Fri Sep 30 11:31:41 2022
NOTE: halting all I/Os to diskgroup 1 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 1 due to diskgroup dismount
NOTE: SMON skipping lock domain (1) validation because diskgroup being dismounted
Errors in the database alert log
Fri Sep 30 11:31:44 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_lmon_26356.trc:
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
Fri Sep 30 11:31:45 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
CKPT (ospid: 26388): terminating the instance due to error 221
Use kfed to examine the damage to the ASM disk
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F4FAAD45400 00000000 00000000 00000000 00000000 [................]
  Repeat 26 times
7F4FAAD455B0 00000000 00000000 45C222C8 01000000 [.........".E....]
7F4FAAD455C0 FE830001 003FFFFF E9D60000 0000FFFF [......?.........]
7F4FAAD455D0 00000000 00000000 00000000 00000000 [................]
  Repeat 1 times
7F4FAAD455F0 00000000 00000000 00000000 AA550000 [..............U.]
7F4FAAD45600 00000000 00000000 00000000 00000000 [................]
  Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=2
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F64E77A0400 00000000 00000000 00000000 00000000 [................]
  Repeat 223 times
7F64E77A1200 000081F9 000181F9 000281F9 000381F9 [................]
7F64E77A1210 000481F9 000C81F9 000D81F9 001881F9 [................]
7F64E77A1220 002881F9 003E81F9 007981F9 00AB81F9 [..(...>...y.....]
7F64E77A1230 013881F9 016C81F9 044581F9 04B081F9 [..8...l...E.....]
7F64E77A1240 061A81F9 0CD081F9 1E8481F9 00000000 [................]
7F64E77A1250 00000000 00000000 00000000 00000000 [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=3
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F8D101FF400 00000000 00000000 00000000 00000000 [................]
  Repeat 223 times
7F8D10200200 000082F9 000182F9 000282F9 000382F9 [................]
7F8D10200210 000482F9 000C82F9 000D82F9 001882F9 [................]
7F8D10200220 002882F9 003E82F9 007982F9 00AB82F9 [..(...>...y.....]
7F8D10200230 013882F9 016C82F9 044582F9 04B082F9 [..8...l...E.....]
7F8D10200240 061A82F9 0CD082F9 1E8482F9 00000000 [................]
7F8D10200250 00000000 00000000 00000000 00000000 [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=4
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F142949C400 00000000 00000000 00000000 00000000 [................]
  Repeat 223 times
7F142949D200 000083F9 000183F9 000283F9 000383F9 [................]
7F142949D210 000483F9 000C83F9 000D83F9 001883F9 [................]
7F142949D220 002883F9 003E83F9 007983F9 00AB83F9 [..(...>...y.....]
7F142949D230 013883F9 016C83F9 044583F9 04B083F9 [..8...l...E.....]
7F142949D240 061A83F9 0CD083F9 1E8483F9 00000000 [................]
7F142949D250 00000000 00000000 00000000 00000000 [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=5
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F0615CF6400 00000000 00000000 00000000 00000000 [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
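Running kfed against one AU at a time is tedious when many AUs have to be checked. As a rough sketch only (it is not the tool used in this case), the Python function below reads block 0 of each of the first few AUs from a disk image and reports the same two facts the kfed output shows: the value of kfbh.type at offset 0x002 and whether the block is zero-filled. The 4096-byte block size and 1 MiB AU size are assumptions that must match the disk group, and the type table lists only codes that appear in this article plus the disk-header code.

```python
import io

BLOCK_SIZE = 4096        # assumed ASM metadata block size
AU_SIZE = 1024 * 1024    # assumed allocation unit size (1 MiB)

# Deliberately incomplete: only codes seen in the kfed output of this article,
# plus the disk header type.
KNOWN_TYPES = {0: "KFBTYP_INVALID", 1: "KFBTYP_DISKHEAD", 3: "KFBTYP_ALLOCTBL"}

def survey_first_blocks(stream, au_count):
    """Summarise block 0 of each of the first au_count AUs of a disk image."""
    report = []
    for au in range(au_count):
        stream.seek(au * AU_SIZE)
        block = stream.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            break                      # image is shorter than requested
        type_code = block[2]           # kfbh.type sits at offset 0x002
        report.append({
            "au": au,
            "type": KNOWN_TYPES.get(type_code, "type %d" % type_code),
            # kfbh.* fields occupy offsets 0x000-0x01f
            "header_zeroed": block[:32] == bytes(32),
            "block_zeroed": block == bytes(BLOCK_SIZE),
        })
    return report

if __name__ == "__main__":
    # Synthetic 2-AU image: AU 0 has a zeroed header but an MBR signature,
    # AU 1 is completely empty.
    image = bytearray(2 * AU_SIZE)
    image[0x1FE:0x200] = b"\x55\xaa"
    for row in survey_first_blocks(io.BytesIO(bytes(image)), 2):
        print(row)
```

On a real system the stream would be a dd image of the damaged disk opened in binary read mode, never the live device opened for writing.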
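The first few AUs of the disk are badly damaged, and the related backup blocks are corrupted as well. In this situation, refer directly to:
Recovering an ASM disk damaged by dd
Recovering a completely corrupted ASM disk header
Recovering an ASM disk that was partially wiped
The data files were recovered at a low level and verified to be intact.
The redo logs, control files and other files were then recovered from the AU allocation lists.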
H:\TEMP\asm-ext4\other>dir
 Volume in drive H is SSD-SX
 Volume Serial Number is 84EB-F434

 Directory of H:\TEMP\asm-ext4\other

2022-09-30 21:52 25,165,824 256.dd
2022-09-30 21:52 25,165,824 257.dd
2022-09-30 23:52 52,429,312 258.dd.1
2022-09-30 23:54 52,429,312 259.dd.1
2022-09-30 23:55 52,429,312 260.dd.1
2022-09-30 23:55 52,429,312 261.dd.1
2022-09-30 23:56 52,429,312 270.dd.1
2022-09-30 23:57 52,429,312 271.dd.1
2022-09-30 23:57 52,429,312 272.dd.1
2022-09-30 23:57 52,429,312 273.dd.1
2022-09-30 23:58 52,429,312 274.dd.1
2022-10-01 00:01 52,429,312 275.dd.1
2022-10-01 00:00 52,429,312 276.dd.1
2022-10-01 00:00 52,429,312 277.dd.1
2022-10-01 00:00 52,429,312 278.dd.1
2022-09-30 23:59 52,429,312 279.dd.1
2022-09-30 23:59 52,429,312 280.dd.1
2022-09-30 23:59 52,429,312 281.dd.1
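The article does not show how the files were rebuilt from the AU allocation lists. As an illustration of the general idea only, the Python sketch below copies a list of AUs, in order, from a disk image into an output file and trims the result to the known file size. The AU size, the function name and the AU numbers are all hypothetical, and the sketch assumes every extent of the file lives on a single disk; with several disks in the group the extents would have to be gathered from each disk.

```python
import io

AU_SIZE = 1024 * 1024   # assumed 1 MiB allocation units

def extract_by_au_list(disk, au_numbers, file_size, out):
    """Copy the listed AUs, in order, from disk to out, trimmed to file_size bytes.

    disk and out are binary file objects. Returns the number of bytes written.
    """
    remaining = file_size
    for au in au_numbers:
        if remaining <= 0:
            break
        want = min(AU_SIZE, remaining)
        disk.seek(au * AU_SIZE)
        chunk = disk.read(want)
        if len(chunk) < want:
            raise IOError("short read at AU %d" % au)
        out.write(chunk)
        remaining -= want
    if remaining > 0:
        raise ValueError("AU list is too short for the requested file size")
    return file_size

if __name__ == "__main__":
    # Synthetic 4-AU disk where every AU is filled with its own number.
    disk = io.BytesIO(b"".join(bytes([n]) * AU_SIZE for n in range(4)))
    out = io.BytesIO()
    # Hypothetical extent list: the file occupies AU 2, then part of AU 0.
    written = extract_by_au_list(disk, [2, 0], AU_SIZE + 512, out)
    print(written, out.getvalue()[:1], out.getvalue()[-1:])
```

The recovered redo files listed above are 52,429,312 bytes each, that is 50 MiB plus one 512-byte block, which is why the sketch trims to an explicit size instead of writing whole AUs.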
Attempt to bring up the database on a separate, new machine
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat Oct 1 10:18:58 2022
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
ORA-00227: corrupt block detected in control file: (block 8, # blocks 1)
ORA-00202: control file: '/oradata/256.dd'
The control file is corrupted, so recreate it
SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS NOARCHIVELOG
  2  MAXLOGFILES 50
  3  MAXLOGMEMBERS 5
  4  MAXDATAFILES 100
  5  MAXINSTANCES 8
  6  MAXLOGHISTORY 226
  7  LOGFILE
  8  group 7 '/oradata/270.dd.1' size 50M,
  9  group 8 '/oradata/272.dd.1' size 50M,
 10  group 5 '/oradata/274.dd.1' size 50M,
 11  group 6 '/oradata/276.dd.1' size 50M,
 12  group 3 '/oradata/278.dd.1' size 50M,
 13  group 4 '/oradata/280.dd.1' size 50M,
 14  group 1 '/oradata/258.dd.1' size 50M,
 15  group 2 '/oradata/260.dd.1' size 50M
 16  DATAFILE
 17  '/oradata/1',
 18  '/oradata/2',
 19  '/oradata/3',
 20  '/oradata/4',
 21  '/oradata/5',
 22  '/oradata/6',
 23  '/oradata/7',
 24  '/oradata/8',
 25  '/oradata/9',
 26  '/oradata/10',
 27  '/oradata/11'
 28  CHARACTER SET ZHS16GBK
 29  ;
Control file created.
Opening the database fails with ORA-600 kqfidps_update_stats:2, ORA-600 4193 and related errors
SQL> recover database;
Media recovery complete.
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kqfidps_update_stats:2], [0x7FFCCBEB3EC0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [19319], [l.ok
After that error was resolved, the database opened successfully
SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
Database mounted.
SQL> alter database open;
Database altered.
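Exporting the database hit two kinds of errors, ORA-08103 and ORA-01555, on a few tables, as shown below. They occur because a few blocks were damaged when the file system was created, and the low-level recovery left those blocks empty. The affected tables only need to be handled individually.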
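. . exporting table ALBUM
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
. . exporting table M_PUSH_CONTENT
EXP-00056: ORACLE error 1555 encountered
ORA-01555: snapshot too old: rollback segment number  with name "" too small
ORA-22924: snapshot too old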
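With the steps above the customer's data was recovered and the loss kept to a minimum. A reminder once more: after an accidental operation on an ASM disk, protect the scene immediately. Avoiding any further writes gives the best chance of recovering the data.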
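One practical way to protect the scene is to take an image of the damaged disk and do all analysis on the copy. The following Python sketch is an illustration under stated assumptions, not the procedure used in this case: it opens the source strictly for reading, streams it to an image file and returns a SHA-256 digest so later copies can be verified. The function name and chunk size are arbitrary, and the destination must be on a different disk with enough free space.

```python
import hashlib

def image_device(src_path, dst_path, chunk_size=4 * 1024 * 1024):
    """Copy src_path to dst_path, opening the source read-only.

    Returns (bytes_copied, sha256_hex) of the data that was read.
    """
    digest = hashlib.sha256()
    copied = 0
    # "rb" guarantees this process never writes to the damaged disk.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()

# Hypothetical usage (paths are examples only):
#   size, sha = image_device("/dev/asmdisk1", "/backup/asmdisk1.img")
```

Tools such as dd do the same job; the point is only that the source is never opened for writing and the file system created by mistake is not mounted or used again.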
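Recovering from ORA-15196: invalid ASM block header [kfc.c:26368]
A customer added disks to the ASM DATA disk group to expand it and rebooted the host while the rebalance was still running. As a result, the DATA disk group dismounted automatically right after each mount.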
Fri Oct 09 20:48:06 2020
NOTE: PST enabling heartbeating (grp 1)
Fri Oct 09 20:48:06 2020
NOTE: ASM did background COD recovery for group 1/0x739536c (DATA)
NOTE: starting rebalance of group 1/0x739536c (DATA) at power 10
Starting background process ARB0
Fri Oct 09 20:48:07 2020
ARB0 started with pid=28, OS id=39278
NOTE: assigning ARB0 to group 1/0x739536c (DATA) with 10 parallel I/Os
cellip.ora not found.
WARNING: cache read a corrupt block: group=1(DATA) dsk=8 blk=7 disk=8 (DATA_0008) incarn=3916014506 au=0 blk=7 count=1
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_39278.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
NOTE: a corrupted block from group DATA was dumped to /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_39278.trc
WARNING: cache read (retry) a corrupt block: group=1(DATA) dsk=8 blk=7 disk=8 (DATA_0008) incarn=3916014506 au=0 blk=7 count=1
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_39278.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
ERROR: cache failed to read group=1(DATA) dsk=8 blk=7 from disk(s): 8(DATA_0008)
Fri Oct 09 20:48:13 2020
NOTE: GroupBlock outside rolling migration privileged region
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
NOTE: requesting all-instance membership refresh for group=1
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1 (39278) initiating offline of disk 8.3916014506 (DATA_0008) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 8/0xe969a3aa, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 7 for pid 28, osid 39278
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Fri Oct 09 20:48:13 2020
NOTE: cache dismounting (not clean) group 1/0x0739536C (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 39346, image: oracle@rac1 (B000)
Fri Oct 09 20:48:13 2020
NOTE: halting all I/Os to diskgroup 1 (DATA)
Fri Oct 09 20:48:13 2020
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=32.4749 last written ABA 32.4749
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Fri Oct 09 20:48:13 2020
kjbdomdet send to inst 2
detach from dom 1, sending detach message to inst 2
Fri Oct 09 20:48:13 2020
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 2, cluster inc 4)
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_39278.trc (incident=337185):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483656] [7] [2182009786 != 2190395015]
Incident details in: /u01/grid/diag/asm/+asm/+ASM1/incident/incdir_337185/+ASM1_arb0_39278_i337185.trc
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
 2341 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 1
Fri Oct 09 20:48:13 2020
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0x0739536C (DATA)
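The error message is fairly clear: the check value of block 7 in AU 0 of disk 8 (dsk=8 blk=7, au=0 blk=7) is wrong. It should be 2190395015 but is now 2182009786, and kfed confirms this: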
C:\Users\Administrator>kfed read f:/temp/xff/2.dd blkn=7|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 7 ; 0x004: blk=7
kfbh.block.obj: 2147483656 ; 0x008: disk=8
kfbh.check: 2182009786 ; 0x00c: 0x820ed3ba
kfbh.fcn.base: 2711248 ; 0x010: 0x00295ed0
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdatb.aunum: 2240 ; 0x000: 0x000008c0
kfdatb.shrink: 448 ; 0x004: 0x01c0
kfdatb.ub2pad: 0 ; 0x006: 0x0000
kfdatb.auinfo[0].link.next: 8 ; 0x008: 0x0008
kfdatb.auinfo[0].link.prev: 8 ; 0x00a: 0x0008
kfdatb.auinfo[1].link.next: 12 ; 0x00c: 0x000c
kfdatb.auinfo[1].link.prev: 12 ; 0x00e: 0x000c
kfdatb.auinfo[2].link.next: 16 ; 0x010: 0x0010
kfdatb.auinfo[2].link.prev: 16 ; 0x012: 0x0010
kfdatb.auinfo[3].link.next: 20 ; 0x014: 0x0014
kfdatb.auinfo[3].link.prev: 20 ; 0x016: 0x0014
kfdatb.auinfo[4].link.next: 24 ; 0x018: 0x0018
kfdatb.auinfo[4].link.prev: 24 ; 0x01a: 0x0018
kfdatb.auinfo[5].link.next: 28 ; 0x01c: 0x001c
kfdatb.auinfo[5].link.prev: 28 ; 0x01e: 0x001c
kfdatb.auinfo[6].link.next: 32 ; 0x020: 0x0020
kfdatb.auinfo[6].link.prev: 32 ; 0x022: 0x0020
kfdatb.spare: 0 ; 0x024: 0x00000000
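The bracketed arguments of ORA-15196 line up with the kfed fields above: 2147483656 is 0x80000008, which kfed prints as disk=8, and the last argument pairs the value found on disk with the value ASM expected. The Python sketch below pulls those fields out of an alert-log line. The field names are my own, and reading the high bit of the object number as a "disk object" marker is an inference from this kfed output, not something taken from Oracle documentation.

```python
import re

ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header \[([^\]]*)\] \[([^\]]*)\] "
    r"\[(\d+)\] \[(\d+)\] \[(\d+) != (\d+)\]"
)

def decode_ora_15196(line):
    """Split an ORA-15196 message into named fields, or return None if no match."""
    m = ORA_15196.search(line)
    if m is None:
        return None
    source, check, obj, blk, found, expected = m.groups()
    obj = int(obj)
    return {
        "source": source,              # e.g. kfc.c:26368
        "check": check,                # e.g. check_kfbh or endian_kfbh
        "obj": obj,                    # kfbh.block.obj as reported
        # Assumption: high bit set means the object is a disk, as kfed
        # prints 2147483656 (0x80000008) as disk=8.
        "is_disk_object": bool(obj & 0x80000000),
        "number": obj & 0x7FFFFFFF,
        "blk": int(blk),
        "found": int(found),           # value read from the block
        "expected": int(expected),     # value ASM expected
    }

if __name__ == "__main__":
    line = ("ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] "
            "[2147483656] [7] [2182009786 != 2190395015]")
    info = decode_ora_15196(line)
    print(info["number"], info["blk"], hex(info["found"]), hex(info["expected"]))
```

For the line above this yields disk 8, block 7 and a found value of 0x820ed3ba, matching kfbh.check in the kfed output.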
After that value was corrected, mounting the DATA disk group again failed with the following errors
Sat Oct 10 13:49:22 2020
ARB0 started with pid=28, OS id=10329
NOTE: assigning ARB0 to group 1/0x3759521c (DATA) with 10 parallel I/Os
cellip.ora not found.
Sat Oct 10 13:49:26 2020
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
WARNING: cache read a corrupt block: group=1(DATA) dsk=8 blk=8 disk=8 (DATA_0008) incarn=3916014011 au=0 blk=8 count=1
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_10329.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_10329.trc
WARNING: cache read (retry) a corrupt block: group=1(DATA) dsk=8 blk=8 disk=8 (DATA_0008) incarn=3916014011 au=0 blk=8 count=1
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_10329.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
ERROR: cache failed to read group=1(DATA) dsk=8 blk=8 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1 (10329) initiating offline of disk 8.3916014011 (DATA_0008) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 8/0xe969a1bb, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 64 for pid 28, osid 10329
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Sat Oct 10 13:49:28 2020
NOTE: cache dismounting (not clean) group 1/0x3759521C (DATA)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Sat Oct 10 13:49:28 2020
NOTE: halting all I/Os to diskgroup 1 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 10346, image: oracle@rac1 (B000)
Errors in file /u01/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_10329.trc (incident=363107):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [8] [0 != 1]
Incident details in: /u01/grid/diag/asm/+asm/+ASM1/incident/incdir_363107/+ASM1_arb0_10329_i363107.trc
This time the error points at block 8 in AU 0 of disk 8 (dsk=8 blk=8, au=0 blk=8). Reading that block with kfed shows:
C:\Users\Administrator>kfed read f:/temp/xff/2.dd blkn=8
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
006BE8C00 00000000 00000000 00000000 00000000 [................]
  Repeat 31 times
006BE8E00 012C0000 04AFFC07 003BFFCD 03F15BD0 [..,.......;..[..]
006BE8E10 012BFC30 00000000 00000002 00000002 [0.+.............]
006BE8E20 00008000 00008000 00002000 564F22AF [......... ..."OV]
006BE8E30 5F805293 FFFF0002 0001EF53 00000001 [.R._....S.......]
006BE8E40 545AE384 00000000 00000000 00000001 [..ZT............]
006BE8E50 00000000 0000000B 00000100 0000003C [............<...]
006BE8E60 00000242 0000007B 52438BA0 C44FFA90 [B...{.....CR..O.]
006BE8E70 33B6F381 919E2DBA 00000000 00000000 [...3.-..........]
006BE8E80 00000000 00000000 6361622F 0070756B [......../backup.]
006BE8E90 00000000 00000000 00000000 00000000 [................]
  Repeat 2 times
006BE8EC0 00000000 00000000 00000000 03ED0000 [................]
006BE8ED0 00000000 00000000 00000000 00000000 [................]
006BE8EE0 00000008 00000000 00000000 AC3C87D6 [..............<.]
006BE8EF0 F1401174 F4F036BD 274FB92F 00000101 [t.@..6../.O'....]
006BE8F00 0000000C 00000000 545AE384 0002F30A [..........ZT....]
006BE8F10 00000004 00000000 00000000 00007FFF [................]
006BE8F20 02508000 00007FFF 00000001 0250FFFF [..P...........P.]
006BE8F30 00000000 00000000 00000000 00000000 [................]
006BE8F40 00000000 00000000 00000000 08000000 [................]
006BE8F50 00000000 00000000 00000000 001C001C [................]
006BE8F60 00000001 00000000 00000000 00000000 [................]
006BE8F70 00000000 00000004 A9AF72B9 0000003B [.........r..;...]
006BE8F80 00000000 00000000 00000000 00000000 [................]
  Repeat 167 times
006BE9A00 00001CC4 00800101 00001CC9 00800101 [................]
006BE9A10 00001CCD 00800101 00001CD2 00800101 [................]
006BE9A20 00001CD7 00800101 00001CDE 00800101 [................]
006BE9A30 00001CE3 00800101 00001CE8 00800101 [................]
006BE9A40 00001CEC 00800101 00000000 00000000 [................]
006BE9A50 00000000 00000000 00000000 00000000 [................]
  Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
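This block is completely corrupted and there is essentially no way to repair it directly. Instead, the DATA disk group was patched so that it no longer dismounts after being mounted: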
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 76 for pid 27, osid 14466
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/emcpowere
NOTE: F1X0 found on disk 0 au 2 fcn 0.2708382
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/emcpowerf
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/emcpowerg
NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/emcpowerh
NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/emcpoweri
NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/emcpowerj
NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/emcpowerk
NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/emcpowerl
NOTE: cache opening disk 8 of grp 1: DATA_0008 path:/dev/emcpowerc
NOTE: cache mounting (first) external redundancy group 1/0x47495222 (DATA)
Sat Oct 10 13:59:38 2020
* allocate domain 1, invalid = TRUE
Sat Oct 10 13:59:38 2020
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=53.6778 group=1 (DATA)
NOTE: advancing ckpt for group 1 (DATA) thread=1 ckpt=53.6778
NOTE: cache recovered group 1 to fcn 0.2961429
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Sat Oct 10 13:59:38 2020
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA)
NOTE: LGWR found thread 1 closed at ABA 53.6777
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA)
NOTE: LGWR opening thread 1 at fcn 0.2961429 ABA 54.6778
NOTE: cache mounting group 1/0x47495222 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0x47495222
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATA was mounted
SUCCESS: alter diskgroup data mount
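The database was then backed up with RMAN, the old disk group was dropped, a new disk group was created and the data was restored, giving a complete recovery with zero data loss.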
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type]
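When Oracle ASM is in use, mistaken operations at the operating system level can damage an ASM disk. This post lists several kinds of damage and the kfed error they produce (KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type]).
Errors when mounting the ASM disk group (ORA-15040 ORA-15042)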
SQL> alter diskgroup DATA mount;
alter diskgroup DATA mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "2" is missing from group number "2"
Errors in the ASM alert log (ORA-15335, ORA-15066, ORA-15196 and others)
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0002" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483651] [48] [0 != 1]
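kfed errors when reading the disk header
The start of the disk may be completely destroyed, not only the 4 KB disk header but possibly several consecutive AUs or more. Reading it with kfed then usually ends with an error such as KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type].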
[oracle@fcomtaep2 disks]$ kfed read ASMRECO03
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7FC18D899400 00000000 00000000 00000000 00000000 [................]
  Repeat 27 times
7FC18D8995C0 FEEE0001 0001FFFF FFFF0000 00000FFF [................]
7FC18D8995D0 00000000 00000000 00000000 00000000 [................]
  Repeat 1 times
7FC18D8995F0 00000000 00000000 00000000 AA550000 [..............U.]
7FC18D899600 20494645 54524150 00010000 0000005C [EFI PART....\...] <==== **** Here ******
7FC18D899610 BD82BBB3 00000000 00000001 00000000 [................]
7FC18D899620 0FFFFFFF 00000000 00000022 00000000 [........".......]
7FC18D899630 0FFFFFDE 00000000 FD8857E5 42D7B49B [.........W.....B]
7FC18D899640 0901FA87 6B3DB5AA 00000002 00000000 [......=k........]
7FC18D899650 00000080 00000080 FE48EB77 00000000 [........w.H.....]
7FC18D899660 00000000 00000000 00000000 00000000 [................]
  Repeat 25 times
7FC18D899800 EBD0A0A2 4433B9E5 B668C087 C79926B7 [......3D..h..&..]
7FC18D899810 5381F6DF 4626F988 0E4F468D D78D3B28 [...S..&F.FO.(;..]
7FC18D899820 000007A1 00000000 0FFFF85F 00000000 [........_.......]
7FC18D899830 00000000 00000000 00720070 006D0069 [........p.r.i.m.]
7FC18D899840 00720061 00000079 00000000 00000000 [a.r.y...........]
7FC18D899850 00000000 00000000 00000000 00000000 [................]
  Repeat 186 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
"EFI PART" is partition-table metadata (the signature of a GPT header), so this ASM disk was most likely damaged by being partitioned.
[ebernal@dbaasm new2]$ kfed read emcpowerl | head -25
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
2ABD671E9400 00000000 00000000 00000000 00000000 [................]
  Repeat 31 times
2ABD671E9600 4542414C 454E4F4C 00000001 00000000 [LABELONE........]
2ABD671E9610 E4E1DDB1 00000020 324D564C 31303020 [.... ...LVM2 001] <==== **** Here ******
2ABD671E9620 50365A77 71327874 34303156 4B4E6136 [wZ6Ptx2qV1046aNK]
2ABD671E9630 35395159 5147634C 487A5A38 63575A37 [YQ95LcGQ8ZzH7ZWc]
2ABD671E9640 00000000 00000019 00030000 00000000 [................]
2ABD671E9650 00000000 00000000 00000000 00000000 [................]
2ABD671E9660 00000000 00000000 00001000 00000000 [................]
2ABD671E9670 0002F000 00000000 00000000 00000000 [................]
2ABD671E9680 00000000 00000000 00000000 00000000 [................]
  Repeat 215 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
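"LABELONE" and "LVM2 001" form the LVM2 physical volume label, not a logical volume name. This ASM disk was most likely initialized as an LVM physical volume, which destroyed it.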
[ebernal@dbaasm tars]$ kfed read rhdisk16
kfbh.endian: 65 ; 0x000: 0x41
kfbh.hard: 73 ; 0x001: 0x49
kfbh.type: 88 ; 0x002: *** Unknown Enum ***
kfbh.datfmt: 32 ; 0x003: 0x20
kfbh.block.blk: 1111709260 ; 0x004: blk=1111709260
kfbh.block.obj: 1634861056 ; 0x008: file=131072
kfbh.check: 119 ; 0x00c: 0x00000077
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
2B6FE2AC1400 20584941 4243564C 61720000 00000077 [AIX LVCB..raw...] <==== **** Here ******
2B6FE2AC1410 00000000 00000000 00000000 00000000 [................]
2B6FE2AC1420 00000000 00000000 30300000 38306430 [..........000d08]
2B6FE2AC1430 30306131 34643030 30303030 31303030 [1a0000d400000001]
2B6FE2AC1440 61006533 766C6D73 7461645F 00003161 [3e.asmlv_data1..]
2B6FE2AC1450 00000000 00000000 00000000 00000000 [................]
  Repeat 2 times
2B6FE2AC1480 54000000 4D206575 20207961 31312037 [...Tue May 7 11]
2B6FE2AC1490 3A33343A 32203633 0A333130 00000000 [:43:36 2013.....]
2B6FE2AC14A0 65755400 79614D20 20372020 343A3131 [.Tue May 7 11:4]
2B6FE2AC14B0 34323A38 31303220 00000A33 44000000 [8:24 2013......D]
2B6FE2AC14C0 41313830 30303444 6D6D7900 02007900 [081AD400.ymm.y..]
2B6FE2AC14D0 0100E40C 656E6F4E 00000000 00000000 [....None........]
2B6FE2AC14E0 00000000 00000000 00000000 00000000 [................]
  Repeat 14 times
2B6FE2AC15D0 00000000 00000000 65310000 61653934 [..........1e49ea]
2B6FE2AC15E0 342E3862 00000000 00000000 00000000 [b8.4............]
2B6FE2AC15F0 00000000 00000000 00000000 00000000 [................]
  Repeat 224 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][88]
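Here "AIX LVCB..raw" is AIX volume metadata (a logical volume control block). In other words, the ASM disk was damaged at the AIX OS level by being used as an AIX logical volume.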
[oracle@dbep2 disks]$ kfed read asm-disk3
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
06000000 00000000 00000000 00000000 00000000 [................]
  Repeat 25 times
0602100 51e2b7f6 00ed4e00 00000000 00000001 [...Q.N..........]
0602120 00000000 0000000b 00000100 0000003c [............<...]
0602140 00000242 0000007b 5d8468e7 6147782a [B...{....h.]*xGa]
0602160 d17851a2 327552e2 00000000 00000000 [.Qx..Ru2........]
0602200 00000000 00000000 3130752f 91a4f000 [......../u01....] <==== **** Here ******
0602220 ff8808e4 d5104cff 000000ac 00000100 [.....L..........]
0602240 00000000 00000000 00000000 09d18000 [................]
  Repeat 254 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][88]
The /u01 string here most likely means that the ASM disk was overwritten by a file system.
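The signatures pointed out in the dumps above can be searched for automatically. The Python sketch below is a simple illustration: it takes the first bytes of a disk and reports which of the foreign signatures seen in this article are present. The signature list contains only the strings shown above plus the 55 AA boot signature visible at the end of the first sector in the partitioned examples; the function name and descriptions are my own, and a plain substring search like this can in principle match ordinary data, so a hit is a hint rather than proof.

```python
# Signatures taken from the kfed dumps in this article.
SIGNATURES = [
    (b"EFI PART", "GPT partition table header"),
    (b"LABELONE", "LVM2 physical volume label"),
    (b"AIX LVCB", "AIX logical volume control block"),
]

def find_foreign_signatures(head):
    """Return sorted (offset, description) pairs for signatures found in head.

    head is a bytes object holding the first bytes of the disk (e.g. 4096).
    """
    hits = []
    # Boot signature in the last two bytes of sector 0 (MBR or protective MBR).
    if len(head) >= 512 and head[510:512] == b"\x55\xaa":
        hits.append((510, "MBR boot signature 55 AA"))
    for magic, label in SIGNATURES:
        pos = head.find(magic)
        if pos != -1:
            hits.append((pos, label))
    return sorted(hits)

if __name__ == "__main__":
    # Synthetic header resembling the GPT case: 55 AA at 0x1FE, "EFI PART" at 0x200.
    head = bytearray(4096)
    head[510:512] = b"\x55\xaa"
    head[512:520] = b"EFI PART"
    for offset, label in find_foreign_signatures(bytes(head)):
        print(hex(offset), label)
```

kfed prints the dump as little-endian 32-bit words, so "EFI PART" appears as 20494645 54524150 in the hex columns while the raw bytes on disk are in the order the search above expects.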
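How to handle a damaged ASM disk depends on the redundancy of the disk group. With normal or high redundancy the disk group itself is not affected, and you can consider dropping the damaged disk and adding it back. With external redundancy the disk group usually dismounts and cannot be mounted normally. If a backup of the disk header exists, you can try restoring the header, mounting the disk group and then migrating the data in read-only mode. If there is no header backup, or the disk group still cannot be mounted after the restore, additional methods may be needed, such as using tools to extract the data files while the disk group is dismounted, or even reassembling ASM block / Oracle block fragments. Related articles:
Index of Oracle ASM articles
pvid=yes prevents ASM from mounting
Recovering a completely corrupted ASM disk header
Unrecognized partition prevents the ASM disk group from mounting
Oracle ASM disk format recovery: formatted as an ext4 file system
Oracle ASM disk format recovery: formatted as an NTFS file system
Recovering an ASM disk group that cannot mount after a PVID was set on an ASM disk by mistake
Case study: zero data loss after oracleasm createdisk recreated an ASM disk
Recovering from ORA-15042: ASM disk "N" is missing from group number "M"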
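If you run into this kind of situation and cannot resolve it, please contact us for professional Oracle database recovery support.
Phone: 17813235971  QQ: 107644445  E-Mail: dba@xifenfei.com
Posted in Oracle ASM, Oracle Backup and Recovery
Tagged endian_kfbh, Invalid OSM block type, kfbtTraverseBlock, KFED-00322, ORA-15042, ORA-15196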