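Tag archive: asm recovery
Recovering datafiles lost through a mishandled asm disk group operation
I recently handled a database recovery case. The customer was replacing storage and, with the database in mount state, used rman copy to move datafiles from an asm disk group that stored them in omf form to a new asm disk group that stores them under aliases. Through operator carelessness, some of the target datafile aliases in the copy statements were duplicated, and after rename file was executed some datafiles were lost completely. We fully recovered the customer's data by scanning for fragments at the low level (see: recovery from a completely corrupted asm disk header).
Because reproducing the whole incident is cumbersome, this test uses a tablespace with two omf-managed datafiles in the data disk group. rman copy moves both files to the datanew disk group, but both are carelessly given the same alias, with the result that one datafile of the tablespace is lost completely.
Create the test tablespace
Create the omf-managed xifenfei tablespace in the data disk group (the statements below use '+DATA'), with two datafiles, file# 14 and 15

SQL> create tablespace xifenfei datafile '+DATA' SIZE 128m;

Tablespace created.

SQL> ALTER TABLESPACE XIFENFEI ADD DATAFILE '+DATA' SIZE 128m AUTOEXTEND ON;

Tablespace altered.

SQL> SELECT FILE_NAME,FILE_ID FROM DBA_DATA_FILES WHERE TABLESPACE_NAME='XIFENFEI';

FILE_NAME
--------------------------------------------------------------------------------
   FILE_ID
----------
+DATA/XFF/DATAFILE/xifenfei.276.961143809
        14

+DATA/XFF/DATAFILE/xifenfei.277.961143825
        15
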
rman copy datafile 14
Use rman copy to copy datafile 14 to the datanew disk group; the target is stored under an alias

RMAN> copy datafile 14 to '+datanew/xifenfei.dbf';

Starting backup at 27-NOV-17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00014 name=+DATA/XFF/DATAFILE/xifenfei.276.961143809
output file name=+DATANEW/xifenfei.dbf tag=TAG20171127T082643 RECID=4 STAMP=961144006
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
Finished backup at 27-NOV-17

[grid@localhost ~]$ asmcmd
ASMCMD> cd datanew
ASMCMD> ls
XFF/
xifenfei.dbf
ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
                                            Y    XFF/
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  N    xifenfei.dbf => +DATANEW/XFF/DATAFILE/XIFENFEI.256.961144003
ASMCMD>

The asmcmd ls output shows that although we stored the file under an alias in the datanew disk group, the alias is really a link to an omf-style file inside asm. In essence every file in asm is stored in omf form; the only difference is how the asm client refers to it, directly by the omf name or through an asm alias.
rman copy datafile 15
Use rman copy to copy datafile 15 to the same alias that was used for datafile 14

RMAN> copy datafile 15 to '+datanew/xifenfei.dbf';

Starting backup at 27-NOV-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00015 name=+DATA/XFF/DATAFILE/xifenfei.277.961143825
output file name=+DATANEW/xifenfei.dbf tag=TAG20171127T082731 RECID=5 STAMP=961144053
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 27-NOV-17

ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
                                            Y    XFF/
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  N    xifenfei.dbf => +DATANEW/XFF/DATAFILE/XIFENFEI.256.961144003
ASMCMD> cd xff
ASMCMD> ls
DATAFILE/
ASMCMD> cd datafile
ASMCMD> ls
XIFENFEI.256.961144003
ASMCMD>

This shows that in the datanew disk group the copy of file 14 has been overwritten by file 15: only one file remains behind the alias.
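The overwrite above comes from two copy statements sharing one target alias, which can be caught before any rman command runs. The following is a minimal sketch of such a pre-check, not part of the original procedure: the function name and the shape of the copy plan (file number mapped to target name) are hypothetical, and it assumes asm names compare case-insensitively, as the +datanew / +DATANEW output above suggests.

```python
from collections import defaultdict


def find_duplicate_targets(copy_plan):
    """Return {normalized target: [file numbers]} for targets used more than once.

    copy_plan maps a datafile number to the target name that would be
    passed to "copy datafile N to '<target>'". Names are upper-cased
    before comparison (assumption: asm treats names case-insensitively).
    """
    by_target = defaultdict(list)
    for file_no, target in copy_plan.items():
        by_target[target.strip().upper()].append(file_no)
    return {t: sorted(nums) for t, nums in by_target.items() if len(nums) > 1}


# The plan used in this test: both files point at the same alias.
plan = {14: "+datanew/xifenfei.dbf", 15: "+datanew/xifenfei.dbf"}
for target, files in find_duplicate_targets(plan).items():
    print(f"target {target} is used by datafiles {files}")
```

Running a check like this over the full list of copy targets before the migration would have flagged the duplicated alias while both source files still existed.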
rename file
Rename the datafiles from the data disk group to the datanew disk group

SQL> alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf';

Database altered.

SQL> alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+datanew/xifenfei.dbf';
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+datanew/xifenfei.dbf'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01523: cannot rename data file to '+data/xifenfei.dbf' - file already part of database

Here file 14 is renamed successfully, but the rename of file 15 fails because the database already contains a file with that alias (the datafile path).
omf automatically deletes the file
Checking the original data disk group shows that datafile 14 has been deleted automatically

ASMCMD> pwd
+DATA/XFF/DATAFILE
ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    SYSAUX.257.942061433
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    SYSTEM.256.942061393
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    UNDOTBS1.258.942061449
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    USERS.259.942061449
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    XIFENFEI.277.961143825
ASMCMD>

The alert log confirms that the datafile was deleted

2017-11-27T09:05:03.054741-05:00
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf'
2017-11-27T09:05:03.114947-05:00
NOTE: Under CF enqueue, no dependency request for disk group DATANEW
Deleted Oracle managed file +DATA/XFF/DATAFILE/xifenfei.276.961143809
Completed: alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf'
2017-11-27T09:05:21.471474-05:00
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+data/xifenfei.dbf'
ORA-1511 signalled during:alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to'+datanew/xifenfei.dbf'

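Since the file that was deleted in this test carried a system-generated name (+DATA/XFF/DATAFILE/xifenfei.276.961143809) while the alias did not, it can help to list in advance which source names are of that kind. The sketch below is only an illustration: the function name is hypothetical, and the pattern assumes that system-generated asm names end in tag.number.number under at least two directory levels, as every such name in this article does.

```python
import re

# Assumed shape of a system-generated asm name:
#   +<diskgroup>/<one or more directories>/<tag>.<file number>.<incarnation>
SYSTEM_NAME = re.compile(r"^\+[^/]+/.+/[^/]+\.\d+\.\d+$")


def is_system_generated(name):
    """Return True when the name looks like a system-generated asm file name."""
    return SYSTEM_NAME.match(name.strip()) is not None


sources = [
    "+DATA/XFF/DATAFILE/xifenfei.276.961143809",
    "+DATA/XFF/DATAFILE/xifenfei.277.961143825",
    "+datanew/xifenfei.dbf",
]
for name in sources:
    kind = "system-generated" if is_system_generated(name) else "alias or other"
    print(f"{name}: {kind}")
```

Any source reported as system-generated should be treated as a file that will no longer exist once its rename file statement completes, so the target copy needs to be verified first.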
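This confirms that with omf-managed datafiles, rename file automatically deletes the old datafile. The damage is now done: rman copy overwrote datafile 14 in the datanew disk group, and rename file caused datafile 14 in the data disk group to be deleted automatically, so datafile 14 is lost from both disk groups. By conventional means the file cannot be recovered without a suitable backup. If you run into lost or partially overwritten datafiles in oracle asm, preserve the scene and contact us (ORACLE database recovery technical support) for professional database support: Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com. We will do our utmost to rescue your data.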
Recovery with zero data loss after all asm disk headers were corrupted
A friend reported that the asm disk headers of his company's 10.2.0.4 system (no disk header backup) were all corrupted; it was established that someone had maliciously overwritten the first 1k of each disk header with dd. The results of the checks he sent over follow
Analyze the alert log to determine the disk group and disk information
asm reports an error when mounting the data disk group

Sun Apr 16 21:39:31 2017
NOTE: cache dismounting group 2/0x3F94036B (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted
Sun Apr 16 21:39:31 2017
ERROR: no PST quorum in group 3: required 2, found 0

data disk group and disk information

Mon Mar 20 16:21:59 2017
NOTE: Hbeat: instance not first (grp 3)
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/raw/raw21
Mon Mar 20 16:21:59 2017
NOTE: F1X0 found on disk 2 fcn 0.47624333
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/raw/raw22
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/raw/raw23
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/raw/raw24
NOTE: F1X0 found on disk 5 fcn 0.47624333
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/raw/raw26
NOTE: cache opening disk 7 of grp 2: DATA_0007 path:/dev/raw/raw25
NOTE: F1X0 found on disk 7 fcn 0.47624333
NOTE: cache mounting (not first) group 2/0x01B869DC (DATA)
Mon Mar 20 16:21:59 2017
kjbdomatt send to node 1
Mon Mar 20 16:21:59 2017
NOTE: attached to recovery domain 2
Mon Mar 20 16:21:59 2017
NOTE: opening chunk 2 at fcn 0.201560874 ABA
NOTE: seq=614 blk=4144
Mon Mar 20 16:21:59 2017
NOTE: cache mounting group 2/0x01B869DC (DATA) succeeded
SUCCESS: diskgroup DATA was mounted

The last successful mount used the raw devices raw21 to raw26 for the data disk group, but the disk names start at DATA_0002, which suggests that the first two asm disks were dropped at some point. Continuing through the alert log confirms this

Mon Oct 15 01:53:16 2012
CREATE DISKGROUP DATA Normal REDUNDANCY DISK '/dev/raw/raw6' SIZE 1144409M ,
'/dev/raw/raw7' SIZE 1144409M
Sat Dec 27 22:41:39 2014
alter diskgroup data add disk '/dev/raw/raw21'
Sat Dec 27 22:41:54 2014
alter diskgroup data add disk '/dev/raw/raw22'
Sat Dec 27 22:42:14 2014
alter diskgroup data add disk '/dev/raw/raw23'
Sat Dec 27 22:42:31 2014
alter diskgroup data add disk '/dev/raw/raw24'
Sat Dec 27 22:42:51 2014
alter diskgroup data add disk '/dev/raw/raw26'
Sat Dec 27 22:43:10 2014
alter diskgroup data add disk '/dev/raw/raw25'
Mon Dec 29 17:55:07 2014
alter diskgroup data drop disk 'DATA_0000'
Tue Dec 30 03:14:42 2014
alter diskgroup data drop disk 'DATA_0001'

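With every disk header gone, the "cache opening disk" lines from the last successful mount are the only record of which device carried which disk number and name. A minimal sketch of pulling that mapping out of the alert log text follows; the function name is hypothetical and the regular expression is written against the line format shown above, so other asm versions may need a different pattern.

```python
import re

# Matches lines such as:
#   NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/raw/raw21
DISK_LINE = re.compile(r"cache opening disk (\d+) of grp (\d+): (\S+) path:(\S+)")


def disks_from_alert(text):
    """Return {disk number: (disk name, device path)} found in alert log text."""
    disks = {}
    for num, _grp, name, path in DISK_LINE.findall(text):
        disks[int(num)] = (name, path)
    return disks


sample = """\
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/raw/raw21
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/raw/raw22
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/raw/raw26
NOTE: cache opening disk 7 of grp 2: DATA_0007 path:/dev/raw/raw25
"""
for num, (name, path) in sorted(disks_from_alert(sample).items()):
    print(num, name, path)
```

Note that the mapping is not in device order: disk 6 sits on raw26 and disk 7 on raw25, which matches the order of the add disk statements in the log.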
Confirm the disk header damage with kfed
Analyzing the dd-extracted disk headers with kfed shows that every disk header is in the same state: the first block is corrupted

[oracle@rac1 xifenfei]$ kfed read raw22
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F21AF427400 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[oracle@rac1 xifenfei]$ kfed read raw22 blkn=2|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2 ; 0x004: blk=2
kfbh.block.obj: 2147483651 ; 0x008: disk=3
kfbh.check: 2184525105 ; 0x00c: 0x82353531
kfbh.fcn.base: 47625389 ; 0x010: 0x02d6b4ad
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdatb10.aunum: 0 ; 0x000: 0x00000000
kfdatb10.shrink: 448 ; 0x004: 0x01c0
kfdatb10.ub2pad: 0 ; 0x006: 0x0000
kfdatb10.auinfo[0].link.next: 8 ; 0x008: 0x0008
kfdatb10.auinfo[0].link.prev: 8 ; 0x00a: 0x0008
kfdatb10.auinfo[0].free: 0 ; 0x00c: 0x0000
kfdatb10.auinfo[0].total: 448 ; 0x00e: 0x01c0
kfdatb10.auinfo[1].link.next: 16 ; 0x010: 0x0010
kfdatb10.auinfo[1].link.prev: 16 ; 0x012: 0x0010
kfdatb10.auinfo[1].free: 0 ; 0x014: 0x0000
kfdatb10.auinfo[1].total: 0 ; 0x016: 0x0000
kfdatb10.auinfo[2].link.next: 24 ; 0x018: 0x0018
kfdatb10.auinfo[2].link.prev: 24 ; 0x01a: 0x0018
kfdatb10.auinfo[2].free: 0 ; 0x01c: 0x0000
kfdatb10.auinfo[2].total: 0 ; 0x01e: 0x0000
kfdatb10.auinfo[3].link.next: 32 ; 0x020: 0x0020
kfdatb10.auinfo[3].link.prev: 32 ; 0x022: 0x0020

Recovery approach
It is confirmed that only the first block of each disk was wiped. Because this asm is 10.2.0.4, there is no backup of the asm disk header, so the only option is to repair it manually with kfed. In this case, however, every asm disk header is lost and there is nothing to use as a reference, so a complete repair would be laborious; the database is also fairly small. The plan is therefore to repair only the key parts of the asm disk header, copy the datafiles out directly with a tool, and open the database on a file system. The basic disk header information that has to be restored is diskname, disksize, disknum, ausize, blocksize, file directory and so on.
Locate the file directory with kfed

kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 4 ; 0x002: KFBTYP_FILEDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 2 ; 0x004: blk=2
kfbh.block.obj: 1 ; 0x008: file=1
kfbh.check: 2363360058 ; 0x00c: 0x8cde033a
kfbh.fcn.base: 48245591 ; 0x010: 0x02e02b57
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfffdb.node.incarn: 1 ; 0x000: A=1 NUMM=0x0
kfffdb.node.frlist.number: 4294967295 ; 0x004: 0xffffffff
kfffdb.node.frlist.incarn: 0 ; 0x008: A=0 NUMM=0x0
kfffdb.hibytes: 0 ; 0x00c: 0x00000000
kfffdb.lobytes: 1048576 ; 0x010: 0x00100000
kfffdb.xtntcnt: 3 ; 0x014: 0x00000003
kfffdb.xtnteof: 3 ; 0x018: 0x00000003
kfffdb.blkSize: 4096 ; 0x01c: 0x00001000
kfffdb.flags: 65 ; 0x020: O=1 S=0 S=0 D=0 C=0 I=0 R=1 A=0
kfffdb.fileType: 15 ; 0x021: 0x0f
kfffdb.dXrs: 19 ; 0x022: SCHE=0x1 NUMB=0x3

Locate the disk directory with kfed

kfffde[0].xptr.au: 4 ; 0x4a0: 0x00000004
kfffde[0].xptr.disk: 7 ; 0x4a4: 0x0007
kfffde[0].xptr.flags: 0 ; 0x4a6: L=0 E=0 D=0 S=0
kfffde[0].xptr.chk: 41 ; 0x4a7: 0x29
kfffde[1].xptr.au: 17405 ; 0x4a8: 0x000043fd
kfffde[1].xptr.disk: 6 ; 0x4ac: 0x0006
kfffde[1].xptr.flags: 0 ; 0x4ae: L=0 E=0 D=0 S=0
kfffde[1].xptr.chk: 146 ; 0x4af: 0x92
kfffde[2].xptr.au: 330031 ; 0x4b0: 0x0005092f
kfffde[2].xptr.disk: 4 ; 0x4b4: 0x0004
kfffde[2].xptr.flags: 0 ; 0x4b6: L=0 E=0 D=0 S=0
kfffde[2].xptr.chk: 13 ; 0x4b7: 0x0d

kfddde[2].entry.incarn: 1 ; 0x3a4: A=1 NUMM=0x0
kfddde[2].entry.hash: 2 ; 0x3a8: 0x00000002
kfddde[2].entry.refer.number:4294967295 ; 0x3ac: 0xffffffff
kfddde[2].entry.refer.incarn: 0 ; 0x3b0: A=0 NUMM=0x0
kfddde[2].dsknum: 2 ; 0x3b4: 0x0002
kfddde[2].state: 2 ; 0x3b6: KFDSTA_NORMAL
kfddde[2].ddchgfl: 0 ; 0x3b7: 0x00
kfddde[2].dskname: DATA_0002 ; 0x3b8: length=9
kfddde[2].fgname: DATA_0002 ; 0x3d8: length=9
kfddde[2].crestmp.hi: 33010550 ; 0x3f8: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[2].crestmp.lo: 2793310208 ; 0x3fc: USEC=0x0 MSEC=0x3a2 SECS=0x27 MINS=0x29
kfddde[2].failstmp.hi: 0 ; 0x400: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[2].failstmp.lo: 0 ; 0x404: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[2].timer: 0 ; 0x408: 0x00000000
kfddde[2].size: 1258291 ; 0x40c: 0x00133333
kfddde[2].srRloc.super.hiStart: 0 ; 0x410: 0x00000000
kfddde[2].srRloc.super.loStart: 0 ; 0x414: 0x00000000
kfddde[2].srRloc.super.length: 0 ; 0x418: 0x00000000
kfddde[2].srRloc.incarn: 0 ; 0x41c: 0x00000000
kfddde[2].dskrprtm: 0 ; 0x420: 0x00000000
kfddde[2].start0: 0 ; 0x424: 0x00000000
kfddde[2].size0: 1258291 ; 0x428: 0x00133333
kfddde[2].used0: 1258229 ; 0x42c: 0x001332f5
kfddde[2].slot: 0 ; 0x430: 0x00000000

kfddde[3].entry.incarn: 1 ; 0x564: A=1 NUMM=0x0
kfddde[3].entry.hash: 3 ; 0x568: 0x00000003
kfddde[3].entry.refer.number:4294967295 ; 0x56c: 0xffffffff
kfddde[3].entry.refer.incarn: 0 ; 0x570: A=0 NUMM=0x0
kfddde[3].dsknum: 3 ; 0x574: 0x0003
kfddde[3].state: 2 ; 0x576: KFDSTA_NORMAL
kfddde[3].ddchgfl: 0 ; 0x577: 0x00
kfddde[3].dskname: DATA_0003 ; 0x578: length=9
kfddde[3].fgname: DATA_0003 ; 0x598: length=9
kfddde[3].crestmp.hi: 33010550 ; 0x5b8: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[3].crestmp.lo: 2811397120 ; 0x5bc: USEC=0x0 MSEC=0xa1 SECS=0x39 MINS=0x29
kfddde[3].failstmp.hi: 0 ; 0x5c0: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[3].failstmp.lo: 0 ; 0x5c4: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[3].timer: 0 ; 0x5c8: 0x00000000
kfddde[3].size: 1258291 ; 0x5cc: 0x00133333
kfddde[3].srRloc.super.hiStart: 0 ; 0x5d0: 0x00000000
kfddde[3].srRloc.super.loStart: 0 ; 0x5d4: 0x00000000
kfddde[3].srRloc.super.length: 0 ; 0x5d8: 0x00000000
kfddde[3].srRloc.incarn: 0 ; 0x5dc: 0x00000000
kfddde[3].dskrprtm: 0 ; 0x5e0: 0x00000000
kfddde[3].start0: 0 ; 0x5e4: 0x00000000
kfddde[3].size0: 1258291 ; 0x5e8: 0x00133333
kfddde[3].used0: 1258128 ; 0x5ec: 0x00133290
kfddde[3].slot: 0 ; 0x5f0: 0x00000000

kfddde[4].entry.incarn: 1 ; 0x724: A=1 NUMM=0x0
kfddde[4].entry.hash: 4 ; 0x728: 0x00000004
kfddde[4].entry.refer.number:4294967295 ; 0x72c: 0xffffffff
kfddde[4].entry.refer.incarn: 0 ; 0x730: A=0 NUMM=0x0
kfddde[4].dsknum: 4 ; 0x734: 0x0004
kfddde[4].state: 2 ; 0x736: KFDSTA_NORMAL
kfddde[4].ddchgfl: 0 ; 0x737: 0x00
kfddde[4].dskname: DATA_0004 ; 0x738: length=9
kfddde[4].fgname: DATA_0004 ; 0x758: length=9
kfddde[4].crestmp.hi: 33010550 ; 0x778: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[4].crestmp.lo: 2834565120 ; 0x77c: USEC=0x0 MSEC=0x102 SECS=0xf MINS=0x2a
kfddde[4].failstmp.hi: 0 ; 0x780: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[4].failstmp.lo: 0 ; 0x784: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[4].timer: 0 ; 0x788: 0x00000000
kfddde[4].size: 1258291 ; 0x78c: 0x00133333
kfddde[4].srRloc.super.hiStart: 0 ; 0x790: 0x00000000
kfddde[4].srRloc.super.loStart: 0 ; 0x794: 0x00000000
kfddde[4].srRloc.super.length: 0 ; 0x798: 0x00000000
kfddde[4].srRloc.incarn: 0 ; 0x79c: 0x00000000
kfddde[4].dskrprtm: 0 ; 0x7a0: 0x00000000
kfddde[4].start0: 0 ; 0x7a4: 0x00000000
kfddde[4].size0: 1258291 ; 0x7a8: 0x00133333
kfddde[4].used0: 1258291 ; 0x7ac: 0x00133333
kfddde[4].slot: 0 ; 0x7b0: 0x00000000

kfddde[5].entry.incarn: 1 ; 0x8e4: A=1 NUMM=0x0
kfddde[5].entry.hash: 5 ; 0x8e8: 0x00000005
kfddde[5].entry.refer.number:4294967295 ; 0x8ec: 0xffffffff
kfddde[5].entry.refer.incarn: 0 ; 0x8f0: A=0 NUMM=0x0
kfddde[5].dsknum: 5 ; 0x8f4: 0x0005
kfddde[5].state: 2 ; 0x8f6: KFDSTA_NORMAL
kfddde[5].ddchgfl: 0 ; 0x8f7: 0x00
kfddde[5].dskname: DATA_0005 ; 0x8f8: length=9
kfddde[5].fgname: DATA_0005 ; 0x918: length=9
kfddde[5].crestmp.hi: 33010550 ; 0x938: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[5].crestmp.lo: 2853560320 ; 0x93c: USEC=0x0 MSEC=0x178 SECS=0x21 MINS=0x2a
kfddde[5].failstmp.hi: 0 ; 0x940: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[5].failstmp.lo: 0 ; 0x944: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[5].timer: 0 ; 0x948: 0x00000000
kfddde[5].size: 1258291 ; 0x94c: 0x00133333
kfddde[5].srRloc.super.hiStart: 0 ; 0x950: 0x00000000
kfddde[5].srRloc.super.loStart: 0 ; 0x954: 0x00000000
kfddde[5].srRloc.super.length: 0 ; 0x958: 0x00000000
kfddde[5].srRloc.incarn: 0 ; 0x95c: 0x00000000
kfddde[5].dskrprtm: 0 ; 0x960: 0x00000000
kfddde[5].start0: 0 ; 0x964: 0x00000000
kfddde[5].size0: 1258291 ; 0x968: 0x00133333
kfddde[5].used0: 1258255 ; 0x96c: 0x0013330f
kfddde[5].slot: 0 ; 0x970: 0x00000000

kfddde[6].entry.incarn: 1 ; 0xaa4: A=1 NUMM=0x0
kfddde[6].entry.hash: 6 ; 0xaa8: 0x00000006
kfddde[6].entry.refer.number:4294967295 ; 0xaac: 0xffffffff
kfddde[6].entry.refer.incarn: 0 ; 0xab0: A=0 NUMM=0x0
kfddde[6].dsknum: 6 ; 0xab4: 0x0006
kfddde[6].state: 2 ; 0xab6: KFDSTA_NORMAL
kfddde[6].ddchgfl: 0 ; 0xab7: 0x00
kfddde[6].dskname: DATA_0006 ; 0xab8: length=9
kfddde[6].fgname: DATA_0006 ; 0xad8: length=9
kfddde[6].crestmp.hi: 33010550 ; 0xaf8: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[6].crestmp.lo: 2875645952 ; 0xafc: USEC=0x0 MSEC=0x1b8 SECS=0x36 MINS=0x2a
kfddde[6].failstmp.hi: 0 ; 0xb00: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[6].failstmp.lo: 0 ; 0xb04: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[6].timer: 0 ; 0xb08: 0x00000000
kfddde[6].size: 1258291 ; 0xb0c: 0x00133333
kfddde[6].srRloc.super.hiStart: 0 ; 0xb10: 0x00000000
kfddde[6].srRloc.super.loStart: 0 ; 0xb14: 0x00000000
kfddde[6].srRloc.super.length: 0 ; 0xb18: 0x00000000
kfddde[6].srRloc.incarn: 0 ; 0xb1c: 0x00000000
kfddde[6].dskrprtm: 0 ; 0xb20: 0x00000000
kfddde[6].start0: 0 ; 0xb24: 0x00000000
kfddde[6].size0: 1258291 ; 0xb28: 0x00133333
kfddde[6].used0: 1258247 ; 0xb2c: 0x00133307
kfddde[6].slot: 0 ; 0xb30: 0x00000000

kfddde[7].entry.incarn: 1 ; 0xc64: A=1 NUMM=0x0
kfddde[7].entry.hash: 7 ; 0xc68: 0x00000007
kfddde[7].entry.refer.number:4294967295 ; 0xc6c: 0xffffffff
kfddde[7].entry.refer.incarn: 0 ; 0xc70: A=0 NUMM=0x0
kfddde[7].dsknum: 7 ; 0xc74: 0x0007
kfddde[7].state: 2 ; 0xc76: KFDSTA_NORMAL
kfddde[7].ddchgfl: 0 ; 0xc77: 0x00
kfddde[7].dskname: DATA_0007 ; 0xc78: length=9
kfddde[7].fgname: DATA_0007 ; 0xc98: length=9
kfddde[7].crestmp.hi: 33010550 ; 0xcb8: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfddde[7].crestmp.lo: 2898849792 ; 0xcbc: USEC=0x0 MSEC=0x23c SECS=0xc MINS=0x2b
kfddde[7].failstmp.hi: 0 ; 0xcc0: HOUR=0x0 DAYS=0x0 MNTH=0x0 YEAR=0x0
kfddde[7].failstmp.lo: 0 ; 0xcc4: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfddde[7].timer: 0 ; 0xcc8: 0x00000000
kfddde[7].size: 1258291 ; 0xccc: 0x00133333
kfddde[7].srRloc.super.hiStart: 0 ; 0xcd0: 0x00000000
kfddde[7].srRloc.super.loStart: 0 ; 0xcd4: 0x00000000
kfddde[7].srRloc.super.length: 0 ; 0xcd8: 0x00000000
kfddde[7].srRloc.incarn: 0 ; 0xcdc: 0x00000000
kfddde[7].dskrprtm: 0 ; 0xce0: 0x00000000
kfddde[7].start0: 0 ; 0xce4: 0x00000000
kfddde[7].size0: 1258291 ; 0xce8: 0x00133333
kfddde[7].used0: 1258209 ; 0xcec: 0x001332e1
kfddde[7].slot: 0 ; 0xcf0: 0x00000000

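The crestmp.hi and crestmp.lo values in each disk directory entry are packed timestamps, and decoding them gives a way to check that an entry really belongs to the disk it is being assigned to. The sketch below is an illustration only: the function name is hypothetical and the bit layout is inferred from the decoded HOUR/DAYS/MNTH/YEAR and USEC/MSEC/SECS/MINS fields that kfed prints next to the raw numbers above, not taken from Oracle documentation.

```python
def decode_kfed_timestamp(hi, lo):
    """Split kfed crestmp.hi / crestmp.lo values into calendar fields.

    Inferred layout (an assumption, checked against the kfed output above):
      hi: year in bits 14 and up, month in bits 10-13, day in bits 5-9, hour in bits 0-4
      lo: minute in bits 26-31, second in bits 20-25, msec in bits 10-19, usec in bits 0-9
    """
    return {
        "year": hi >> 14,
        "month": (hi >> 10) & 0xF,
        "day": (hi >> 5) & 0x1F,
        "hour": hi & 0x1F,
        "minute": lo >> 26,
        "second": (lo >> 20) & 0x3F,
        "msec": (lo >> 10) & 0x3FF,
        "usec": lo & 0x3FF,
    }


# kfddde[2] (DATA_0002) from the disk directory dump
ts = decode_kfed_timestamp(33010550, 2793310208)
print("{year}-{month:02d}-{day:02d} {hour:02d}:{minute:02d}:{second:02d}".format(**ts))
```

For DATA_0002 this decodes to 2014-12-27 22:41:39, the same second at which the alert log records alter diskgroup data add disk '/dev/raw/raw21', so the directory entry, the disk name and the device path agree with each other.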
Combine the information above to construct a file resembling a disk header

kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 3123334821 ; 0x00c: 0xba2a4ea5
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKVOL2 ; 0x000: length=12
kfdhdb.driver.reserved[0]: 827084630 ; 0x008: 0x314c4f56
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 168820736 ; 0x020: 0x0a100000
kfdhdb.dsknum: 2 ; 0x024: 0x0002
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATA_0002 ; 0x028: length=9
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DATA_0002 ; 0x068: length=9
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33010550 ; 0x3f8: HOUR=0x16 DAYS=0x1b MNTH=0xc YEAR=0x7de
kfdhdb.crestmp.lo: 2793310208 ; 0x3fc: USEC=0x0 MSEC=0x3a2 SECS=0x27 MINS=0x29
kfdhdb.mntstmp.hi: 33049840 ; 0x0b0: HOUR=0x10 DAYS=0x7 MNTH=0x3 YEAR=0x7e1
kfdhdb.mntstmp.lo: 1588567040 ; 0x0b4: USEC=0x0 MSEC=0x3e7 SECS=0x2a MINS=0x17
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 1258291 ; 0x40c: 0x00133333
kfdhdb.pmcnt: 19 ; 0x0c8: 0x00000013
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002

Then rebuild the header of every disk in turn with kfed merge, and use dul to identify and copy out the files

+DATA/XFF/spfileXFF.ora.265.869858859
+DATA/XFF/CONTROLFILE/Current.256.796701475
+DATA/XFF/ONLINELOG/group_1.257.796701475
+DATA/XFF/ONLINELOG/group_2.258.796701485
+DATA/XFF/ONLINELOG/group_3.266.796704261
+DATA/XFF/ONLINELOG/group_4.267.796704277
+DATA/XFF/ONLINELOG/group_5.1235.824131117
+DATA/XFF/ONLINELOG/group_6.1115.824200515
+DATA/XFF/ONLINELOG/group_7.1113.824200587
+DATA/XFF/ONLINELOG/group_8.1112.824200627
+DATA/XFF/ONLINELOG/group_9.1066.824201189
+DATA/XFF/ONLINELOG/group_10.1063.824201207
+DATA/XFF/ONLINELOG/group_11.1062.824201287
+DATA/XFF/ONLINELOG/group_12.1061.824201301
File 259 datafile size 2147491840, block size 8192
File 260 datafile size 32186048512, block size 8192
File 261 datafile size 6897541120, block size 8192
File 263 datafile size 27409784832, block size 8192
File 264 datafile size 34359730176, block size 8192
File 280 datafile size 31457288192, block size 8192
File 281 datafile size 31457288192, block size 8192
File 330 datafile size 5242888192, block size 8192
File 334 datafile size 20971528192, block size 8192
File 382 datafile size 20971528192, block size 8192
File 383 datafile size 20971528192, block size 8192
File 384 datafile size 31457288192, block size 8192
File 385 datafile size 31457288192, block size 8192
File 386 datafile size 31457288192, block size 8192
File 387 datafile size 4294975488, block size 8192
File 388 datafile size 4294975488, block size 8192
File 389 datafile size 4294975488, block size 8192
File 390 datafile size 4294975488, block size 8192
File 391 datafile size 4294975488, block size 8192
File 392 datafile size 4294975488, block size 8192
File 394 datafile size 31457288192, block size 8192
File 491 datafile size 20971528192, block size 8192
File 494 datafile size 20971528192, block size 8192
File 578 datafile size 31457288192, block size 8192
File 597 datafile size 20971528192, block size 8192
File 613 datafile size 4294975488, block size 8192
File 638 datafile size 31457288192, block size 8192
File 668 datafile size 16988184576, block size 8192
File 688 datafile size 20971528192, block size 8192
File 740 datafile size 31457288192, block size 8192
File 787 datafile size 31457288192, block size 8192
File 798 datafile size 31457288192, block size 8192
File 806 datafile size 31457288192, block size 8192
File 810 datafile size 31457288192, block size 8192
File 845 datafile size 31457288192, block size 8192
File 886 datafile size 31457288192, block size 8192
File 887 datafile size 31457288192, block size 8192
File 889 datafile size 31457288192, block size 8192
File 892 datafile size 31457288192, block size 8192
File 903 datafile size 31457288192, block size 8192
File 932 datafile size 31457288192, block size 8192
File 933 datafile size 3145736192, block size 8192
File 951 datafile size 20971528192, block size 8192
File 953 datafile size 31457288192, block size 8192
File 955 datafile size 31457288192, block size 8192
File 963 datafile size 31457288192, block size 8192
File 1000 datafile size 31457288192, block size 8192
File 1001 datafile size 12035563520, block size 8192
File 1031 datafile size 31457288192, block size 8192
File 1033 datafile size 31457288192, block size 8192
File 1035 datafile size 31457288192, block size 8192
File 1037 datafile size 31457288192, block size 8192
File 1045 datafile size 31457288192, block size 8192
File 1073 datafile size 4294975488, block size 8192
File 1074 datafile size 4294975488, block size 8192
File 1075 datafile size 4294975488, block size 8192
File 1076 datafile size 8589942784, block size 8192
File 1077 datafile size 31457288192, block size 8192
File 1078 datafile size 8589942784, block size 8192
File 1079 datafile size 8589942784, block size 8192
File 1080 datafile size 4294975488, block size 8192
File 1081 datafile size 8589942784, block size 8192
File 1082 datafile size 8589942784, block size 8192
File 1083 datafile size 8589942784, block size 8192
File 1084 datafile size 8589942784, block size 8192
File 1085 datafile size 32365355008, block size 8192
File 1086 datafile size 9071239168, block size 8192
File 1116 datafile size 8589942784, block size 8192
File 1133 datafile size 8589942784, block size 8192
File 1219 datafile size 31457288192, block size 8192
File 1245 datafile size 31457288192, block size 8192
File 1249 datafile size 31457288192, block size 8192
File 1251 datafile size 31457288192, block size 8192
File 1322 datafile size 4294975488, block size 8192
File 1442 datafile size 31457288192, block size 8192
File 1468 datafile size 1048584192, block size 8192
File 1508 datafile size 31457288192, block size 8192
File 1554 datafile size 4294975488, block size 8192
File 1570 datafile size 31457288192, block size 8192
File 2004 datafile size 31457288192, block size 8192
File 2005 datafile size 31457288192, block size 8192
File 2344 datafile size 31457288192, block size 8192
File 2345 datafile size 31457288192, block size 8192
File 2348 datafile size 31457288192, block size 8192
File 2617 datafile size 10737426432, block size 8192
File 2618 datafile size 21474844672, block size 8192
File 2766 datafile size 33554440192, block size 8192
File 2782 datafile size 31457288192, block size 8192
File 2784 datafile size 31457288192, block size 8192
File 2893 datafile size 31457288192, block size 8192
File 2924 datafile size 31457288192, block size 8192
File 2925 datafile size 31457288192, block size 8192
File 2926 datafile size 31457288192, block size 8192
File 2983 datafile size 31457288192, block size 8192
File 2984 datafile size 31457288192, block size 8192
File 3634 datafile size 31457288192, block size 8192
File 3909 datafile size 31457288192, block size 8192
File 3917 datafile size 31457288192, block size 8192
File 3920 datafile size 31457288192, block size 8192
File 3922 datafile size 31457288192, block size 8192

The rest is fairly straightforward: copy out the spfile, controlfile and datafiles, start the database locally, and the recovery succeeds
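Before copying files out, the listing produced by the tool can be used to estimate how much file system space is needed and to spot entries whose size looks wrong. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the pattern matches only the "File N datafile size S, block size B" lines shown above, and the check that every size is a whole number of blocks is a plausibility test chosen for this example rather than something the tool itself reports.

```python
import re

# Matches lines such as: File 259 datafile size 2147491840, block size 8192
FILE_LINE = re.compile(r"File (\d+) datafile size (\d+), block size (\d+)")


def summarize(listing):
    """Return (file count, total bytes, file numbers whose size is not block-aligned)."""
    entries = [(int(n), int(size), int(bs)) for n, size, bs in FILE_LINE.findall(listing)]
    total = sum(size for _, size, _ in entries)
    misaligned = [n for n, size, bs in entries if size % bs != 0]
    return len(entries), total, misaligned


sample = """\
File 259 datafile size 2147491840, block size 8192
File 260 datafile size 32186048512, block size 8192
File 261 datafile size 6897541120, block size 8192
"""
count, total, misaligned = summarize(sample)
print(f"{count} datafiles, {total / 1024**3:.1f} GiB to copy, misaligned: {misaligned}")
```

Feeding the complete listing through the same function gives the total space the target file system must provide before the copy starts.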
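If you run into a situation like this and cannot resolve it, please contact us; we provide professional ORACLE database recovery technical support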
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com
Case study: zero-data-loss recovery after asm disks were recreated with oracleasm createdisk
A customer reported that after a system reboot the asm disks created with asmlib were missing. They then recreated the asm disks with oracleasm deletedisk and createdisk, and finally found that the asm diskgroup could not be mounted. I asked the customer to back up 5m of data with dd and then analyzed it with kfed
kfed analysis results

E:\OneDrive\ORACLE\recover\no_backup\asm\kfedwin>kfed read H:\temp\asmlib\xx.img
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 3760689243 ; 0x00c: 0xe027905b
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000

E:\OneDrive\ORACLE\recover\no_backup\asm\kfedwin>kfed read H:\temp\asmlib\xx.img blkn=1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000

E:\OneDrive\ORACLE\recover\no_backup\asm\kfedwin>kfed read H:\temp\asmlib\xx.img blkn=10
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000

E:\OneDrive\ORACLE\recover\no_backup\asm\kfedwin>kfed read H:\temp\asmlib\xx.img blkn=255
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000

E:\OneDrive\ORACLE\recover\no_backup\asm\kfedwin>kfed read H:\temp\asmlib\xx.img blkn=256|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 17 ; 0x002: KFBTYP_PST_META
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 256 ; 0x004: T=0 NUMB=0x100
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 3925268785 ; 0x00c: 0xe9f6d931
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdpHdrPairBv1.first.super.time.hi:32994098 ; 0x000: HOUR=0x12 DAYS=0x19 MNTH=0xc YEAR=0x7dd
kfdpHdrPairBv1.first.super.time.lo:1614030848 ; 0x004: USEC=0x0 MSEC=0x10a SECS=0x3 MINS=0x18
kfdpHdrPairBv1.first.super.last: 2 ; 0x008: 0x00000002
kfdpHdrPairBv1.first.super.next: 2 ; 0x00c: 0x00000002
kfdpHdrPairBv1.first.super.copyCnt: 1 ; 0x010: 0x01
kfdpHdrPairBv1.first.super.version: 1 ; 0x011: 0x01
kfdpHdrPairBv1.first.super.ub2spare: 0 ; 0x012: 0x0000
kfdpHdrPairBv1.first.super.incarn: 1 ; 0x014: 0x00000001
kfdpHdrPairBv1.first.super.copy[0]: 0 ; 0x018: 0x0000
kfdpHdrPairBv1.first.super.copy[1]: 0 ; 0x01a: 0x0000
kfdpHdrPairBv1.first.super.copy[2]: 0 ; 0x01c: 0x0000
……

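kfed treats each block as 4k by default. Block 256 reads fine while block 255 is corrupted, which suggests that oracleasm createdisk most likely destroyed the first 1M of data (256 blocks of 4k each). Since the default au is 1m, the database version is 11.2.0.3, and nothing from blkn 256 onward is damaged, the preliminary judgment is that the backup copy of the asm disk header can be used to restore the disk header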
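The reasoning above is simple block arithmetic, sketched below. The block size, au size and first readable block come from the text; the location of the backup header (the second-to-last block of allocation unit 1) is not stated in this article and is included here as an assumption about asm 11.1.0.7 and later that should be verified with kfed on the actual disk.

```python
BLOCK_SIZE = 4096          # kfed default block size, as stated in the text
AU_SIZE = 1024 * 1024      # default allocation unit size, as stated in the text
FIRST_GOOD_BLOCK = 256     # first block that kfed could read on the damaged disk

blocks_per_au = AU_SIZE // BLOCK_SIZE
damaged_bytes = FIRST_GOOD_BLOCK * BLOCK_SIZE
damaged_aus = damaged_bytes / AU_SIZE

# Assumption (not from this article): the header backup lives in the
# second-to-last block of allocation unit 1.
backup_header_block = 2 * blocks_per_au - 2

print(f"blocks per au: {blocks_per_au}")
print(f"damaged area: {damaged_bytes} bytes = {damaged_aus:g} au")
print(f"expected backup header at blkn={backup_header_block}")
print(f"backup header inside damaged area: {backup_header_block < FIRST_GOOD_BLOCK}")
```

Under these assumptions the damage covers exactly allocation unit 0, and the backup header block lies beyond the damaged range, which is consistent with the conclusion that the header can be restored from its backup.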
Check the asm disk after its header has been restored

[grid@xifenfei1 disks]$ kfed read DATA1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 2776451033 ; 0x00c: 0xa57d47d9
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKDATA1 ; 0x000: length=13
kfdhdb.driver.reserved[0]: 1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]: 49 ; 0x00c: 0x00000031
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DATA_0000 ; 0x028: length=9
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DATA_0000 ; 0x068: length=9
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 32994099 ; 0x0a8: HOUR=0x13 DAYS=0x19 MNTH=0xc YEAR=0x7dd
kfdhdb.crestmp.lo: 2797442048 ; 0x0ac: USEC=0x0 MSEC=0x365 SECS=0x2b MINS=0x29
kfdhdb.mntstmp.hi: 33022061 ; 0x0b0: HOUR=0xd DAYS=0x3 MNTH=0x8 YEAR=0x7df
kfdhdb.mntstmp.lo: 816879616 ; 0x0b4: USEC=0x0 MSEC=0x26 SECS=0xb MINS=0xc
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
…………

This shows that the disk header was repaired cleanly. The next task is to try mounting the disk group.
Mounting the disk group
[grid@xifenfei1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 6 20:54:53 2015
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
The ASM disk group mounted normally. Next, use asmcmd to check whether the files are intact.
Checking whether the disk group data is intact
[grid@xifenfei1 ~]$ asmcmd
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 1622060 636493 0 636493 0 N DATA/
ASMCMD> cd data
ASMCMD> ls
ORCL/
ASMCMD> cd orcl
ASMCMD> ls
CONTROLFILE/
DATAFILE/
ONLINELOG/
PARAMETERFILE/
TEMPFILE/
spfileorcl.ora
ASMCMD> cd datafile
ASMCMD> ls
XIFENFEI20130801.314.835191517
XIFENFEI20140101.321.835191571
XIFENFEI20140201.322.835191573
XIFENFEI20140301.323.835191573
…………
SYSAUX.270.835182535
SYSAUX.838.874669369
SYSTEM.271.835182533
SYSTEM.823.873555791
SYSTEM.945.883146947
…………
The datafiles in this disk group all look normal. The remaining disk groups were mounted using the same method.
Trying to start the database
SQL> startup
ORACLE instance started.
Total System Global Area 5010685952 bytes
Fixed Size 2236968 bytes
Variable Size 2013269464 bytes
Database Buffers 2986344448 bytes
Redo Buffers 8835072 bytes
Database mounted.
ORA-16038: log 14 sequence# 21145 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 14 thread 2: '+DATA/orcl/onlinelog/group_14.284.835184569'
ORA-00312: online log 14 thread 2: '+ARCH/orcl/onlinelog/group_14.287.835184569'
Checking the database alert log
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Thu Aug 06 21:04:06 2015
Thread 2 advanced to log sequence 21146 (thread recovery)
Picked broadcast on commit scheme to generate SCNs
Thread 2 advanced to log sequence 21147 (before internal thread enable)
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_27402.trc:
ORA-19816: WARNING: Files may exist in db_recovery_file_dest that are not known to database.
ORA-17502: ksfdcre:4 Failed to create file +ARCH
ORA-15196: invalid ASM block header [kfc.c:19572] [check_kfbh] [1] [47962] [1344818371 != 630731762]
ORA-15130: diskgroup "ARCH" is being dismounted
ORA-15066: offlining disk "ARCH_0000" in group "ARCH" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
ARCH: Error 19504 Creating archive log file to '+ARCH'
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_27402.trc:
ORA-15130: diskgroup "ARCH" is being dismounted
NOTE: deferred map free for map id 754
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_27402.trc:
ORA-16038: log 14 sequence# 21145 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 14 thread 2: '+DATA/orcl/onlinelog/group_14.284.835184569'
ORA-00312: online log 14 thread 2: '+ARCH/orcl/onlinelog/group_14.287.835184569'
ORA-16038 signalled during: ALTER DATABASE OPEN...
Thu Aug 06 21:04:10 2015
SUCCESS: diskgroup ARCH was dismounted
SUCCESS: diskgroup ARCH was dismounted
Thu Aug 06 21:04:10 2015
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ckpt_27353.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '+ARCH/orcl/controlfile/current.256.835182531'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ckpt_27353.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '+ARCH/orcl/controlfile/current.256.835182531'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Thu Aug 06 21:04:10 2015
System state dump requested by (instance=1, osid=27353 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_27318.trc
CKPT (ospid: 27353): terminating the instance due to error 221
Instance terminated by CKPT, pid = 27353
Checking the ASM alert log
Thu Aug 06 21:04:07 2015
WARNING: cache read a corrupt block: group=2(ARCH) dsk=0 blk=1 disk=0 (ARCH_0000) incarn=3942486752 au=0 blk=1 count=1
Errors in file /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_27462.trc:
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
NOTE: a corrupted block from group ARCH was dumped to /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_27462.trc
WARNING: cache read (retry) a corrupt block: group=2(ARCH) dsk=0 blk=1 disk=0 (ARCH_0000) incarn=3942486752 au=0 blk=1 count=1
Errors in file /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_27462.trc:
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
ERROR: cache failed to read group=2(ARCH) dsk=0 blk=1 from disk(s): 0(ARCH_0000)
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26076] [endian_kfbh] [2147483648] [1] [0 != 1]
NOTE: cache initiating offline of disk 0 group ARCH
NOTE: process _user27462_+asm1 (27462) initiating offline of disk 0.3942486752 (ARCH_0000) with mask 0x7e in group 2
WARNING: Disk 0 (ARCH_0000) in group 2 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 2, dsk = 0/0xeafd92e0, mask = 0x6a, op = clear
Thu Aug 06 21:04:07 2015
GMON updating disk modes for group 2 at 17 for pid 35, osid 27462
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Thu Aug 06 21:04:07 2015
NOTE: cache dismounting (not clean) group 2/0x723D6245 (ARCH)
NOTE: messaging CKPT to quiesce pins Unix process pid: 27089, image: oracle@xifenfei1 (B000)
WARNING: Offline of disk 0 (ARCH_0000) in group 2 and mode 0x7f failed on ASM inst 1
Thu Aug 06 21:04:07 2015
NOTE: halting all I/Os to diskgroup 2 (ARCH)
System State dumped to trace file /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_ora_27462.trc
NOTE: AMDU dump of disk group ARCH created at /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace
Thu Aug 06 21:04:09 2015
NOTE: LGWR doing non-clean dismount of group 2 (ARCH)
NOTE: LGWR sync ABA=126.806 last written ABA 126.806
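The log shows that the failing block is the second block (blk=1) of the first AU (au=0) on the first disk of the ARCH disk group. As analysed at the start, the first AU of each ASM disk was completely destroyed, and restoring the header from its backup copy was only just enough to let the disk group mount. During startup the database has to archive redo, and archiving writes to the ARCH disk group, which requires reading au=0 blk=1. That block is still corrupt, so ASM tried to take the disk offline. Because the disk group uses external redundancy, the disk cannot be offlined on its own and the whole disk group was dismounted, so the database could not start.
Analysing what the first AU actually contains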
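To pull the disk, AU and block out of such a warning without reading the whole log by eye, a small parser is enough. This is a minimal sketch based only on the message format seen in this ASM alert log. The regular expression, the function name and the returned field names are my own, and the byte offset assumes a 1 MB AU and 4 KB metadata blocks.

```python
import re

# Pattern written against the WARNING lines in this alert log only; other
# ASM versions may word the message differently.
CORRUPT_RE = re.compile(
    r"cache read(?: \(retry\))? a corrupt block: "
    r"group=(?P<group>\d+)\((?P<name>\w+)\) dsk=(?P<dsk>\d+) blk=\d+"
    r".*? au=(?P<au>\d+) blk=(?P<au_blk>\d+)"
)


def parse_corrupt_block(line, au_size=1024 * 1024, block_size=4096):
    """Return disk group, disk, AU, block and byte offset, or None if no match."""
    m = CORRUPT_RE.search(line)
    if m is None:
        return None
    au = int(m.group("au"))
    blk = int(m.group("au_blk"))
    return {
        "group": m.group("name"),
        "disk": int(m.group("dsk")),
        "au": au,
        "blk": blk,
        # offset of the block from the start of the disk, under the assumed sizes
        "offset": au * au_size + blk * block_size,
    }


if __name__ == "__main__":
    line = ("WARNING: cache read a corrupt block: group=2(ARCH) dsk=0 blk=1 "
            "disk=0 (ARCH_0000) incarn=3942486752 au=0 blk=1 count=1")
    print(parse_corrupt_block(line))
```

For the warning above this gives AU 0, block 1, that is byte offset 4096 on disk 0 of ARCH, which is inside the 1 MB region that was overwritten.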
SQL> select DISK_NUMBER,path from v$asm_disk;
DISK_NUMBER PATH
----------- --------------------------------------------------
          0 /dev/raw/raw1
          2 /dev/raw/raw3
          1 /dev/raw/raw2
[oracle@xifenfei raw]$ kfed read raw1 blkn=1|grep kfbh.type
kfbh.type: 2 ; 0x002: KFBTYP_FREESPC
[oracle@xifenfei raw]$ kfed read raw1 blkn=2|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw1 blkn=3|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw1 blkn=255|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw2 blkn=1|grep kfbh.type
kfbh.type: 2 ; 0x002: KFBTYP_FREESPC
[oracle@xifenfei raw]$ kfed read raw2 blkn=2|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw2 blkn=255|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw3 blkn=1|grep kfbh.type
kfbh.type: 2 ; 0x002: KFBTYP_FREESPC
[oracle@xifenfei raw]$ kfed read raw3 blkn=2|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
[oracle@xifenfei raw]$ kfed read raw3 blkn=255|grep kfbh.type
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
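Analysing a disk group on a test machine shows that, apart from the KFBTYP_DISKHEAD block that holds the ASM disk header, the first AU consists mainly of KFBTYP_FREESPC (Free Space Table) and KFBTYP_ALLOCTBL (Allocation Table) blocks, which record how AUs are allocated in ASM. In other words, as long as nothing makes ASM allocate or free AUs, the disk group should not dismount even though blocks 1-255 of the first AU are missing. Following this idea, the database was set to archive to the local file system and the test continued.
Opening the database again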
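The spot checks above only sample a few blocks. If the `kfbh.type` line of every block in AU 0 is collected the same way, the result can be tallied with a few lines of code. The sketch below is only an illustration: it parses lines in the format kfed printed above, and the helper names and the `ALLOCATION_ONLY` set are my own. Treating FREESPC and ALLOCTBL as "allocation-only" metadata reflects the reasoning in this article, not an official classification.

```python
import re
from collections import Counter

TYPE_RE = re.compile(r"kfbh\.type:\s*(\d+)\s*;\s*0x002:\s*(\w+)")

# Block types that, per the analysis above, only describe AU allocation
ALLOCATION_ONLY = {"KFBTYP_FREESPC", "KFBTYP_ALLOCTBL"}


def tally_block_types(kfed_lines):
    """Count block types from 'kfed read ... | grep kfbh.type' output lines."""
    counts = Counter()
    for line in kfed_lines:
        m = TYPE_RE.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts


def only_allocation_metadata(counts):
    """True if every block type seen is free-space or allocation-table metadata."""
    return bool(counts) and set(counts) <= ALLOCATION_ONLY


if __name__ == "__main__":
    # The four lines kfed returned for raw1 (blkn=1, 2, 3, 255)
    sample = [
        "kfbh.type: 2 ; 0x002: KFBTYP_FREESPC",
        "kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL",
        "kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL",
        "kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL",
    ]
    counts = tally_block_types(sample)
    print(dict(counts), only_allocation_metadata(counts))
```

If any other block type turned up among blocks 1-255, the assumption that only allocation bookkeeping was lost would need to be revisited before opening the database.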
SQL> startup
ORACLE instance started.
Total System Global Area 5010685952 bytes
Fixed Size 2236968 bytes
Variable Size 2013269464 bytes
Database Buffers 2986344448 bytes
Redo Buffers 8835072 bytes
Database mounted.
SQL> alter database open;
Database altered.
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Fri Aug 07 02:43:13 2015
ARC1 started with pid=34, OS id=22778
Fri Aug 07 02:43:13 2015
ARC2 started with pid=35, OS id=22780
Fri Aug 07 02:43:13 2015
ARC3 started with pid=36, OS id=22782
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Fri Aug 07 02:43:24 2015
Thread 1 opened at log sequence 18604
Current log# 10 seq# 18604 mem# 0: /tmp/xifenfei/otherfile/group_10.273.835182533
Current log# 10 seq# 18604 mem# 1: /tmp/xifenfei/otherfile/group_10.263.835182533
Successful open of redo thread 1
Fri Aug 07 02:43:24 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Aug 07 02:43:25 2015
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Fri Aug 07 02:43:26 2015
minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:21328 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Fri Aug 07 02:43:26 2015
Redo thread 2 internally disabled at seq 21147 (CKPT)
[21341] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:96999124 end:97000624 diff:1500 (15 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Starting background process GTX0
Fri Aug 07 02:43:31 2015
GTX0 started with pid=37, OS id=22803
Starting background process RCBG
Fri Aug 07 02:43:31 2015
RCBG started with pid=38, OS id=22805
replication_dependency_tracking turned off (no async multimaster replication found)
Fri Aug 07 02:43:34 2015
Archived Log entry 73876 added for thread 2 sequence 21145 ID 0x513c613f dest 1: <--- archiving does take place here, as expected
Starting background process QMNC
Fri Aug 07 02:43:34 2015
QMNC started with pid=39, OS id=22812
Fri Aug 07 02:43:35 2015
Archived Log entry 73877 added for thread 2 sequence 21146 ID 0x513c613f dest 1:
Fri Aug 07 02:43:35 2015
ARC0: Archiving disabled thread 2 sequence 21147
Archived Log entry 73878 added for thread 2 sequence 21147 ID 0x513c613f dest 1:
Fri Aug 07 02:43:37 2015
Completed: alter database open
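At this point it is essentially certain that the database has been recovered with zero data loss. Because so much of the first AU has been lost, repairing it fully in place is difficult. The plan is to start the database in mount or read only mode, back up the data with RMAN, rebuild the ASM disk groups, and then migrate the data back. This completes the recovery of ASMLib disks that had been overwritten by oracleasm, with no data lost. The actual recovery was not as simple as described here: it also involved mounting disk groups while the voting disk was damaged, repairing the damaged voting disk group, and the differences between 10g/11.1 and 11.2 in restoring the disk header backup.
In this case, running oracleasm createdisk to recreate the disks simply because the ASMLib ASM disks were not visible was a serious mistake. However, once the disk groups were found to be still broken after the recreation, no further operations (such as recreating the disk groups) were attempted. That restraint made full recovery possible, and the customer lost no data. Any further overwriting of the disk groups could have caused permanent data loss. The lesson is this: when you run into something you are not sure how to handle, stop in time and do not let the mistake compound.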