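Tag archive: ORA-15040
Recovering an Oracle Exadata diskgroup that could not mount after a disk failure
A friend reached out for help: a customer's Oracle Exadata machine had an ASM diskgroup that could not be mounted, and they asked us to provide recovery support.
After analysis, the problem was roughly this: disk space had been used beyond capacity (some data could no longer be fully mirrored by ASM), and then another disk failed, so the ASM diskgroup could not mount. We rebuilt the Exadata celldisk data, mounted the ASM diskgroup successfully, and opened all five databases (because some data on the underlying disks was damaged, some data access reported errors that had to be handled at the Oracle level).
The detailed analysis and handling were as follows:
The diskgroup holding the database files cannot be mounted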
Wed Dec 12 21:29:04 2018
SQL> alter diskgroup DATA_XFF mount force
NOTE: cache registered group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: cache began mount (first) of group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: Assigning number (1,36) to disk (o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.5/DATA_XFF_CD_10_XFFCEL03)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03)
NOTE: Assigning number (1,39) to disk (o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03)
NOTE: Assigning number (1,43) to disk (o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03)
NOTE: Assigning number (1,22) to disk (o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02)
NOTE: Assigning number (1,18) to disk (o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02)
NOTE: Assigning number (1,19) to disk (o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02)
NOTE: Assigning number (1,15) to disk (o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02)
NOTE: Assigning number (1,20) to disk (o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02)
NOTE: Assigning number (1,17) to disk (o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02)
NOTE: Assigning number (1,16) to disk (o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02)
NOTE: Assigning number (1,23) to disk (o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02)
NOTE: Assigning number (1,12) to disk (o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02)
NOTE: Assigning number (1,13) to disk (o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02)
NOTE: Assigning number (1,14) to disk (o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02)
NOTE: Assigning number (1,1) to disk (o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01)
NOTE: Assigning number (1,3) to disk (o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01)
NOTE: Assigning number (1,4) to disk (o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01)
NOTE: Assigning number (1,5) to disk (o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01)
NOTE: Assigning number (1,6) to disk (o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01)
NOTE: Assigning number (1,7) to disk (o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01)
NOTE: Assigning number (1,8) to disk (o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01)
NOTE: Assigning number (1,9) to disk (o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01)
NOTE: Assigning number (1,10) to disk (o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01)
NOTE: Assigning number (1,11) to disk (o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01)
Wed Dec 12 21:29:10 2018
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 101 for pid 27, osid 62541
NOTE: Assigning number (1,0) to disk ()
GMON querying group 1 at 102 for pid 27, osid 62541
NOTE: process _user62541_+asm2 (62541) initiating offline of disk 0.3915937355 () with mask 0x7e[0x7f] in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968764b, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 103 for pid 27, osid 62541
ERROR: Disk 0 cannot be offlined, since all the disks [0, 25] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline of disk 0 () in group 1 and mode 0x7f failed on ASM inst 2
NOTE: cache dismounting (not clean) group 1/0x5FE882CB (DATA_XFF)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x5FE882CB (DATA_XFF)
NOTE: cache ending mount (fail) of group DATA_XFF number=1 incarn=0x5fe882cb
NOTE: cache deleting context for group DATA_XFF 1/0x5fe882cb
GMON dismounting group 1 at 104 for pid 27, osid 62541
ERROR: diskgroup DATA_XFF was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15066: offlining disk "0" in group "DATA_XFF" may result in a data loss
ORA-15042: ASM disk "0" is missing from group number "1"
ERROR: alter diskgroup DATA_XFF mount force
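In the alert log above, the disk that ASM expects but cannot discover shows up as an "Assigning number" line with an empty path. As a rough sketch (the function name `missing_disks` and the regular expression are illustrative assumptions, not part of any Oracle tool), such lines can be picked out of a log excerpt like this:

```python
import re

# Matches alert-log lines such as:
#   NOTE: Assigning number (1,0) to disk ()
# The line format is taken from the log excerpt in this post.
ASSIGN_RE = re.compile(r"Assigning number \((\d+),(\d+)\) to disk \(([^)]*)\)")

def missing_disks(alert_text):
    """Return (group_number, disk_number) pairs whose discovered path is empty."""
    return [
        (int(group), int(disk))
        for group, disk, path in ASSIGN_RE.findall(alert_text)
        if not path
    ]

sample = """\
NOTE: Assigning number (1,11) to disk (o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01)
NOTE: Assigning number (1,0) to disk ()
"""

print(missing_disks(sample))  # [(1, 0)]
```

An empty path only says the disk was not discovered. Whether that blocks the mount depends on the redundancy level and on which partner disks are already offline, as the ORA-15066 message in the log indicates.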
Checking the damage at the storage layer
CellCLI> list physicaldisk
20:0    KN3VZL    normal
20:1    KNAWLL    normal
20:2    KN4E4L    warning - predictive failure, poor performance
20:3    KNAN5L    normal
20:4    KMJKYL    normal
20:5    KN5DGL    normal
20:6    KMDLWL    normal
20:7    KMDKPL    normal
20:8    KMDA7L    normal
20:9    KN1YJL    normal
20:10   KMH1YL    normal
20:11   KMVHAL    normal
CellCLI> list griddisk
DATA_XFF_CD_00_XFFCEL01    active
DATA_XFF_CD_01_XFFCEL01    active
DATA_XFF_CD_02_XFFCEL01    proactive failure
DATA_XFF_CD_03_XFFCEL01    active
DATA_XFF_CD_04_XFFCEL01    active
DATA_XFF_CD_05_XFFCEL01    active
DATA_XFF_CD_06_XFFCEL01    active
DATA_XFF_CD_07_XFFCEL01    active
DATA_XFF_CD_08_XFFCEL01    active
DATA_XFF_CD_09_XFFCEL01    active
DATA_XFF_CD_10_XFFCEL01    active
DATA_XFF_CD_11_XFFCEL01    active
The ASM disk on the faulty drive cannot be discovered from the DB node
[grid@ycdwdb01 grid]$ kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:     433152 Mb o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01   <unknown> <unknown>
   2:     433152 Mb o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01   <unknown> <unknown>
   3:     433152 Mb o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01   <unknown> <unknown>
   4:     433152 Mb o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01   <unknown> <unknown>
   5:     433152 Mb o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01   <unknown> <unknown>
   6:     433152 Mb o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01   <unknown> <unknown>
   7:     433152 Mb o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01   <unknown> <unknown>
   8:     433152 Mb o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01   <unknown> <unknown>
   9:     433152 Mb o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01   <unknown> <unknown>
  10:     433152 Mb o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01   <unknown> <unknown>
  11:     433152 Mb o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01   <unknown> <unknown>
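The kfod listing above shows 11 disks for cell 192.168.10.3, while the cell has 12 grid disks. A minimal sketch of that comparison follows; the expected name pattern `DATA_XFF_CD_NN_XFFCEL01` is inferred from the listings in this post, and the parsing regex is an illustrative assumption about the kfod output layout shown here:

```python
import re

# Grid disk names expected on the cell, per the "list griddisk" output (CD_00..CD_11).
expected = {f"DATA_XFF_CD_{i:02d}_XFFCEL01" for i in range(12)}

# Paths as reported by "kfod disk=all" on the DB node before the repair.
kfod_output = """\
 1: 433152 Mb o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01 <unknown> <unknown>
 2: 433152 Mb o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01 <unknown> <unknown>
 3: 433152 Mb o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01 <unknown> <unknown>
 4: 433152 Mb o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01 <unknown> <unknown>
 5: 433152 Mb o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01 <unknown> <unknown>
 6: 433152 Mb o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01 <unknown> <unknown>
 7: 433152 Mb o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01 <unknown> <unknown>
 8: 433152 Mb o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01 <unknown> <unknown>
 9: 433152 Mb o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01 <unknown> <unknown>
10: 433152 Mb o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01 <unknown> <unknown>
11: 433152 Mb o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01 <unknown> <unknown>
"""

# Take the last path component of every Exadata-style "o/<cell ip>/<griddisk>" path.
discovered = set(re.findall(r"o/[\d.]+/(\S+)", kfod_output))

missing = sorted(expected - discovered)
print(missing)  # ['DATA_XFF_CD_02_XFFCEL01']
```

The missing name matches the grid disk that CellCLI reports as "proactive failure".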
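According to the customer, the diskgroup was almost completely full, and asmcmd lsdg showed that Usable_file_MB had already gone negative. This indicates that the NORMAL-redundancy diskgroup was no longer storing two complete copies of all data. In that state, a further disk failure leaves part of the data without a usable mirror copy, which is why the diskgroup could not mount here.
After repairing the celldisk at the storage layer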
CellCLI> list griddisk
DATA_XFF_CD_00_XFFCEL01    active
DATA_XFF_CD_01_XFFCEL01    active
DATA_XFF_CD_02_XFFCEL01    active
DATA_XFF_CD_03_XFFCEL01    active
DATA_XFF_CD_04_XFFCEL01    active
DATA_XFF_CD_05_XFFCEL01    active
DATA_XFF_CD_06_XFFCEL01    active
DATA_XFF_CD_07_XFFCEL01    active
DATA_XFF_CD_08_XFFCEL01    active
DATA_XFF_CD_09_XFFCEL01    active
DATA_XFF_CD_10_XFFCEL01    active
DATA_XFF_CD_11_XFFCEL01    active
[grid@ycdwdb01 grid]$ kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:     433152 Mb o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01   <unknown> <unknown>
   2:     433152 Mb o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01   <unknown> <unknown>
   3:     433152 Mb o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01   <unknown> <unknown>
   4:     433152 Mb o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01   <unknown> <unknown>
   5:     433152 Mb o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01   <unknown> <unknown>
   6:     433152 Mb o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01   <unknown> <unknown>
   7:     433152 Mb o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01   <unknown> <unknown>
   8:     433152 Mb o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01   <unknown> <unknown>
   9:     433152 Mb o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01   <unknown> <unknown>
  10:     433152 Mb o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01   <unknown> <unknown>
  11:     433152 Mb o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01   <unknown> <unknown>
  12:     433152 Mb o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01   <unknown> <unknown>
The DATA diskgroup then mounted directly
Fri Dec 14 14:04:59 2018
SQL> alter diskgroup DATA_XFF mount
NOTE: cache registered group DATA_XFF number=1 incarn=0x78a886e7
NOTE: cache began mount (not first) of group DATA_XFF number=1 incarn=0x78a886e7
NOTE: Assigning number (1,36) to disk (o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.5/DATA_XFF_CD_10_XFFCEL03)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03)
NOTE: Assigning number (1,39) to disk (o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03)
NOTE: Assigning number (1,43) to disk (o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03)
NOTE: Assigning number (1,22) to disk (o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02)
NOTE: Assigning number (1,18) to disk (o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02)
NOTE: Assigning number (1,19) to disk (o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02)
NOTE: Assigning number (1,15) to disk (o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02)
NOTE: Assigning number (1,20) to disk (o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02)
NOTE: Assigning number (1,17) to disk (o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02)
NOTE: Assigning number (1,16) to disk (o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02)
NOTE: Assigning number (1,23) to disk (o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02)
NOTE: Assigning number (1,12) to disk (o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02)
NOTE: Assigning number (1,13) to disk (o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02)
NOTE: Assigning number (1,14) to disk (o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02)
NOTE: Assigning number (1,1) to disk (o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01)
NOTE: Assigning number (1,3) to disk (o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01)
NOTE: Assigning number (1,4) to disk (o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01)
NOTE: Assigning number (1,5) to disk (o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01)
NOTE: Assigning number (1,6) to disk (o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01)
NOTE: Assigning number (1,7) to disk (o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01)
NOTE: Assigning number (1,8) to disk (o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01)
NOTE: Assigning number (1,9) to disk (o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01)
NOTE: Assigning number (1,10) to disk (o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01)
NOTE: Assigning number (1,11) to disk (o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01)
NOTE: Assigning number (1,0) to disk (o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01)
Fri Dec 14 14:04:59 2018
GMON querying group 1 at 78 for pid 28, osid 76016
NOTE: Assigning number (1,24) to disk ()
NOTE: Assigning number (1,25) to disk ()
NOTE: Assigning number (1,26) to disk ()
NOTE: Assigning number (1,27) to disk ()
NOTE: Assigning number (1,28) to disk ()
NOTE: Assigning number (1,29) to disk ()
NOTE: Assigning number (1,30) to disk ()
NOTE: Assigning number (1,31) to disk ()
NOTE: Assigning number (1,32) to disk ()
NOTE: Assigning number (1,33) to disk ()
NOTE: Assigning number (1,35) to disk ()
GMON querying group 1 at 79 for pid 28, osid 76016
NOTE: cache opening disk 0 of grp 1: DATA_XFF_CD_02_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_02_XFFCEL01
NOTE: cache opening disk 1 of grp 1: DATA_XFF_CD_05_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_05_XFFCEL01
NOTE: cache opening disk 2 of grp 1: DATA_XFF_CD_03_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_03_XFFCEL01
NOTE: F1X0 found on disk 2 au 5 fcn 0.15948262
NOTE: cache opening disk 3 of grp 1: DATA_XFF_CD_06_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_06_XFFCEL01
NOTE: cache opening disk 4 of grp 1: DATA_XFF_CD_09_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_09_XFFCEL01
NOTE: cache opening disk 5 of grp 1: DATA_XFF_CD_04_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_04_XFFCEL01
NOTE: cache opening disk 6 of grp 1: DATA_XFF_CD_07_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_07_XFFCEL01
NOTE: cache opening disk 7 of grp 1: DATA_XFF_CD_11_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_11_XFFCEL01
NOTE: cache opening disk 8 of grp 1: DATA_XFF_CD_01_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_01_XFFCEL01
NOTE: cache opening disk 9 of grp 1: DATA_XFF_CD_00_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_00_XFFCEL01
NOTE: cache opening disk 10 of grp 1: DATA_XFF_CD_10_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_10_XFFCEL01
NOTE: cache opening disk 11 of grp 1: DATA_XFF_CD_08_XFFCEL01 path:o/192.168.10.3/DATA_XFF_CD_08_XFFCEL01
NOTE: cache opening disk 12 of grp 1: DATA_XFF_CD_00_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_00_XFFCEL02
NOTE: cache opening disk 13 of grp 1: DATA_XFF_CD_01_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_01_XFFCEL02
NOTE: cache opening disk 14 of grp 1: DATA_XFF_CD_02_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_02_XFFCEL02
NOTE: cache opening disk 15 of grp 1: DATA_XFF_CD_03_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_03_XFFCEL02
NOTE: cache opening disk 16 of grp 1: DATA_XFF_CD_04_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_04_XFFCEL02
NOTE: cache opening disk 17 of grp 1: DATA_XFF_CD_05_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_05_XFFCEL02
NOTE: cache opening disk 18 of grp 1: DATA_XFF_CD_06_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_06_XFFCEL02
NOTE: cache opening disk 19 of grp 1: DATA_XFF_CD_07_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_07_XFFCEL02
NOTE: cache opening disk 20 of grp 1: DATA_XFF_CD_08_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_08_XFFCEL02
NOTE: cache opening disk 21 of grp 1: DATA_XFF_CD_09_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_09_XFFCEL02
NOTE: F1X0 found on disk 21 au 2 fcn 0.15948262
NOTE: cache opening disk 22 of grp 1: DATA_XFF_CD_10_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_10_XFFCEL02
NOTE: cache opening disk 23 of grp 1: DATA_XFF_CD_11_XFFCEL02 path:o/192.168.10.4/DATA_XFF_CD_11_XFFCEL02
NOTE: cache opening disk 36 of grp 1: DATA_XFF_CD_11_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_11_XFFCEL03
NOTE: cache opening disk 37 of grp 1: DATA_XFF_CD_04_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_04_XFFCEL03
NOTE: cache opening disk 38 of grp 1: DATA_XFF_CD_00_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_00_XFFCEL03
NOTE: cache opening disk 39 of grp 1: DATA_XFF_CD_03_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_03_XFFCEL03
NOTE: cache opening disk 40 of grp 1: DATA_XFF_CD_05_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_05_XFFCEL03
NOTE: cache opening disk 41 of grp 1: DATA_XFF_CD_08_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_08_XFFCEL03
NOTE: cache opening disk 42 of grp 1: DATA_XFF_CD_01_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_01_XFFCEL03
NOTE: cache opening disk 43 of grp 1: DATA_XFF_CD_09_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_09_XFFCEL03
NOTE: cache opening disk 44 of grp 1: DATA_XFF_CD_06_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_06_XFFCEL03
NOTE: F1X0 found on disk 44 au 2 fcn 0.15948262
NOTE: cache opening disk 45 of grp 1: DATA_XFF_CD_07_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_07_XFFCEL03
NOTE: cache opening disk 46 of grp 1: DATA_XFF_CD_02_XFFCEL03 path:o/192.168.10.5/DATA_XFF_CD_02_XFFCEL03
NOTE: cache mounting (not first) normal redundancy group 1/0x78A886E7 (DATA_XFF)
Fri Dec 14 14:04:59 2018
kjbdomatt send to inst 2
Fri Dec 14 14:04:59 2018
NOTE: attached to recovery domain 1
NOTE: redo buffer size is 512 blocks (2101760 bytes)
Fri Dec 14 14:04:59 2018
NOTE: LGWR attempting to mount thread 2 for diskgroup 1 (DATA_XFF)
NOTE: LGWR found thread 2 closed at ABA 98.4672
NOTE: LGWR mounted thread 2 for diskgroup 1 (DATA_XFF)
NOTE: LGWR opening thread 2 at fcn 0.18931129 ABA 99.4673
NOTE: cache mounting group 1/0x78A886E7 (DATA_XFF) succeeded
NOTE: cache ending mount (success) of group DATA_XFF number=1 incarn=0x78a886e7
GMON querying group 1 at 80 for pid 19, osid 9805
Fri Dec 14 14:04:59 2018
NOTE: Instance updated compatible.asm to 11.2.0.3.0 for grp 1
SUCCESS: diskgroup DATA_XFF was mounted
SUCCESS: alter diskgroup DATA_XFF mount
ASM diskgroup state after recovery
ASMCMD> lsdg
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  Y      512     4096   4194304  15160320  4776184  5197824          -210820         12             N             DATA_XFF/
MOUNTED  NORMAL  N      512     4096   4194304  864896    863400   298240           282580          0              Y             DBFS_DG/
MOUNTED  NORMAL  N      512     4096   4194304  3787840   2157232  1298688          429272          0              N             RECO_XFF/
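The negative Usable_file_MB for DATA_XFF can be reproduced from the other columns of the lsdg output. For a NORMAL-redundancy diskgroup, the usable space is the free space minus the space that must be kept free to restore redundancy after a failure, divided by the two mirror copies. A small sketch (the function name is illustrative; the figures are taken from the lsdg output above):

```python
def usable_file_mb(free_mb, req_mir_free_mb, copies=2):
    """Usable space for new files; copies=2 corresponds to NORMAL redundancy."""
    return (free_mb - req_mir_free_mb) / copies

# (Free_MB, Req_mir_free_MB) per diskgroup, from the lsdg output.
diskgroups = {
    "DATA_XFF": (4776184, 5197824),
    "DBFS_DG": (863400, 298240),
    "RECO_XFF": (2157232, 1298688),
}

for name, (free_mb, req_mb) in diskgroups.items():
    print(name, int(usable_file_mb(free_mb, req_mb)))
# DATA_XFF -210820
# DBFS_DG 282580
# RECO_XFF 429272
```

A negative result means Free_MB is already below Req_mir_free_MB, so the diskgroup cannot restore full redundancy after losing a failure group's worth of space.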
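The databases were then opened successfully, and the few corrupt blocks were handled afterwards by technical means. That completed the recovery, rescuing the vast majority of the customer's data on Oracle Exadata. If you hit a similar xd (Exadata) failure that you cannot resolve yourself and need recovery support, please contact us.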
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com
Posted in: 非常规恢复 (unconventional recovery)
Tags: exadata mount, exadata坏盘恢复, exadata恢复, exadata磁盘组恢复, ORA-15040, ORA-15042, ORA-15066, xd坏盘恢复, xd恢复
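ORA-15042: ASM disk "N" is missing from group number "M" failure recovery
We received a recovery request from a friend concerning an ASM diskgroup of 19 LUNs. One LUN had problems, so they added a new LUN and dropped the old one, but the operation hung midway (the bad LUN was damaged at the storage layer, so the rebalance could not complete). The storage engineer then kept working on the faulty LUN and, very luckily, repaired it. In the excitement, however, they deleted the newly added LUN directly from the storage, even though part of the data had already been rebalanced onto it. At that point the ASM diskgroup went down completely and could not be mounted, and they asked for recovery support. For certain reasons the LUN could not be recovered at the storage level, so we had to provide recovery at the database level.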
Mon Sep 21 19:52:35 2015
SQL> alter diskgroup dg_XFF add disk '/dev/rhdisk116' size 716800M drop disk dg_XFF_0012
NOTE: Assigning number (1,20) to disk (/dev/rhdisk116)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DG_XFF_0020
NOTE: requesting all-instance disk validation for group=1
Mon Sep 21 19:52:44 2015
NOTE: skipping rediscovery for group 1/0xb94738f1 (DG_XFF) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0xb94738f1 (DG_XFF) on local instance.
NOTE: initiating PST update: grp = 1
Mon Sep 21 19:52:44 2015
GMON updating group 1 at 25 for pid 27, osid 12124486
NOTE: PST update grp = 1 completed successfully
NOTE: membership refresh pending for group 1/0xb94738f1 (DG_XFF)
GMON querying group 1 at 26 for pid 18, osid 10092734
NOTE: cache opening disk 20 of grp 1: DG_XFF_0020 path:/dev/rhdisk116
GMON querying group 1 at 27 for pid 18, osid 10092734
SUCCESS: refreshed membership for 1/0xb94738f1 (DG_XFF)
Mon Sep 21 19:52:47 2015
SUCCESS: alter diskgroup dg_XFF add disk '/dev/rhdisk116' size 716800M drop disk dg_XFF_0012
NOTE: starting rebalance of group 1/0xb94738f1 (DG_XFF) at power 1
Starting background process ARB0
Mon Sep 21 19:52:47 2015
ARB0 started with pid=28, OS id=10944804
NOTE: assigning ARB0 to group 1/0xb94738f1 (DG_XFF) with 1 parallel I/O
NOTE: Attempting voting file refresh on diskgroup DG_XFF
Mon Sep 21 20:35:06 2015
SQL> ALTER DISKGROUP DG_XFF MOUNT /* asm agent *//* {1:51107:7083} */
NOTE: cache registered group DG_XFF number=1 incarn=0xdd6f975a
NOTE: cache began mount (first) of group DG_XFF number=1 incarn=0xdd6f975a
NOTE: Assigning number (1,0) to disk (/dev/rhdisk10)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk11)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk16)
NOTE: Assigning number (1,3) to disk (/dev/rhdisk17)
NOTE: Assigning number (1,4) to disk (/dev/rhdisk22)
NOTE: Assigning number (1,5) to disk (/dev/rhdisk23)
NOTE: Assigning number (1,6) to disk (/dev/rhdisk28)
NOTE: Assigning number (1,7) to disk (/dev/rhdisk29)
NOTE: Assigning number (1,8) to disk (/dev/rhdisk33)
NOTE: Assigning number (1,9) to disk (/dev/rhdisk34)
NOTE: Assigning number (1,10) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,11) to disk (/dev/rhdisk40)
NOTE: Assigning number (1,12) to disk (/dev/rhdisk41)
NOTE: Assigning number (1,13) to disk (/dev/rhdisk45)
NOTE: Assigning number (1,14) to disk (/dev/rhdisk46)
NOTE: Assigning number (1,15) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,16) to disk (/dev/rhdisk52)
NOTE: Assigning number (1,17) to disk (/dev/rhdisk53)
NOTE: Assigning number (1,18) to disk (/dev/rhdisk57)
NOTE: Assigning number (1,19) to disk (/dev/rhdisk58)
Wed Sep 30 11:08:07 2015
NOTE: start heartbeating (grp 1)
GMON querying group 1 at 33 for pid 35, osid 4194488
NOTE: Assigning number (1,20) to disk ()
GMON querying group 1 at 34 for pid 35, osid 4194488
NOTE: cache dismounting (clean) group 1/0xDD6F975A (DG_XFF)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xDD6F975A (DG_XFF)
NOTE: cache ending mount (fail) of group DG_XFF number=1 incarn=0xdd6f975a
NOTE: cache deleting context for group DG_XFF 1/0xdd6f975a
GMON dismounting group 1 at 35 for pid 35, osid 4194488
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
NOTE: Disk in mode 0x8 marked for de-assignment
ERROR: diskgroup DG_XFF was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "20" is missing from group number "1"
ERROR: ALTER DISKGROUP DG_XFF MOUNT /* asm agent *//* {1:51107:7083} */
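The cause is fairly clear here: because the storage engineer deleted the LUN directly, diskgroup DG_XFF lost ASM disk 20 and could no longer be mounted. The diskgroup had already been rebalancing for a long time, so the lost disk held a large amount of data, including metadata. Even if the diskgroup could be brought up by modifying the PST (which is not guaranteed to work), a large amount of data would be lost, and the data inside might still not be directly retrievable. Had the disk only been added, with no rebalance performed for whatever reason, we could simply have modified the PST to get the diskgroup mounted. In a case like this one, the only option is to scan the disks at the low level and generate the datafiles, then unload the data from them to complete the recovery. Copying files or unloading data directly with the surviving metadata would lose a large amount of data, because the metadata for some files was on the lost LUN.
Disks to be recovered from: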
grp# dsk# bsize ausize disksize diskname        groupname       path
---- ---- ----- ------ -------- --------------- --------------- -------------
   1    0  4096  4096K   179200 DG_XFF_0000     DG_XFF          /dev/rhdisk10
   1    1  4096  4096K   179200 DG_XFF_0001     DG_XFF          /dev/rhdisk11
   1    2  4096  4096K   179200 DG_XFF_0002     DG_XFF          /dev/rhdisk16
   1    3  4096  4096K   179200 DG_XFF_0003     DG_XFF          /dev/rhdisk17
   1    4  4096  4096K   179200 DG_XFF_0004     DG_XFF          /dev/rhdisk22
   1    5  4096  4096K   179200 DG_XFF_0005     DG_XFF          /dev/rhdisk23
   1    6  4096  4096K   179200 DG_XFF_0006     DG_XFF          /dev/rhdisk28
   1    7  4096  4096K   179200 DG_XFF_0007     DG_XFF          /dev/rhdisk29
   1    8  4096  4096K   179200 DG_XFF_0008     DG_XFF          /dev/rhdisk33
   1    9  4096  4096K   179200 DG_XFF_0009     DG_XFF          /dev/rhdisk34
   1   10  4096  4096K   179200 DG_XFF_0010     DG_XFF          /dev/rhdisk4
   1   11  4096  4096K   179200 DG_XFF_0011     DG_XFF          /dev/rhdisk40
   1   12  4096  4096K   179200 DG_XFF_0012     DG_XFF          /dev/rhdisk41
   1   13  4096  4096K   179200 DG_XFF_0013     DG_XFF          /dev/rhdisk45
   1   14  4096  4096K   179200 DG_XFF_0014     DG_XFF          /dev/rhdisk46
   1   15  4096  4096K   179200 DG_XFF_0015     DG_XFF          /dev/rhdisk5
   1   16  4096  4096K   179200 DG_XFF_0016     DG_XFF          /dev/rhdisk52
   1   17  4096  4096K   179200 DG_XFF_0017     DG_XFF          /dev/rhdisk53
   1   18  4096  4096K   179200 DG_XFF_0018     DG_XFF          /dev/rhdisk57
   1   19  4096  4096K   179200 DG_XFF_0019     DG_XFF          /dev/rhdisk58
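We were fairly lucky this time: the lost diskgroup was only a business-data diskgroup, and it contained just 19 tablespaces and 10 partitioned tables. With the data dictionary intact, we recovered the data of the 10 partitioned tables (6443 partitions in total). The overall recovery result is as follows:
By total data volume, the recovery ratio is 6003.26953/6027.26935*100% = 99.6018127%. For a case in which a LUN that had already received most of a rebalance was lost, still recovering this much data is a very good result overall.
If you run into a situation like this and cannot resolve it, please contact us for professional Oracle database recovery support.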
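The ratio above is straightforward to check. The two volumes are used exactly as quoted in the post; their unit is not stated in the source, so the variable names below are deliberately unit-free:

```python
recovered_volume = 6003.26953  # data volume recovered, as quoted above
total_volume = 6027.26935      # total data volume, as quoted above

ratio = recovered_volume / total_volume * 100
lost_volume = total_volume - recovered_volume

print(f"recovered: {ratio:.5f}%")
print(f"not recovered: {lost_volume:.5f}")
```

The unrecovered portion comes to roughly 24 units out of about 6027.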