Exadata磁盘损坏导致磁盘组无法mount恢复(oracle一体机磁盘组异常恢复)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Exadata磁盘损坏导致磁盘组无法mount恢复(oracle一体机磁盘组异常恢复)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

Oracle Exadata客户,在换盘过程中,cell节点又一块磁盘损坏,导致datac1磁盘组(该磁盘组是normal方式冗余)无法mount

Thu Jul 20 22:01:21 2023
SQL> alter diskgroup datac1 mount force 
NOTE: cache registered group DATAC1 number=1 incarn=0x0728ad12
NOTE: cache began mount (first) of group DATAC1 number=1 incarn=0x0728ad12
NOTE: Assigning number (1,35) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_11_dm01celadm03)
NOTE: Assigning number (1,31) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_07_dm01celadm03)
NOTE: Assigning number (1,24) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_00_dm01celadm03)
NOTE: Assigning number (1,25) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_01_dm01celadm03)
NOTE: Assigning number (1,27) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_03_dm01celadm03)
NOTE: Assigning number (1,33) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_09_dm01celadm03)
NOTE: Assigning number (1,30) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_06_dm01celadm03)
NOTE: Assigning number (1,28) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_04_dm01celadm03)
NOTE: Assigning number (1,26) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_02_dm01celadm03)
NOTE: Assigning number (1,1) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_08_dm01celadm03)
NOTE: Assigning number (1,34) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_10_dm01celadm03)
NOTE: Assigning number (1,29) to disk (o/192.168.10.9;192.168.10.10/DATAC1_CD_05_dm01celadm03)
NOTE: Assigning number (1,3) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_07_dm01celadm02)
NOTE: Assigning number (1,4) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_06_dm01celadm02)
NOTE: Assigning number (1,5) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_00_dm01celadm02)
NOTE: Assigning number (1,6) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_10_dm01celadm02)
NOTE: Assigning number (1,7) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_08_dm01celadm02)
NOTE: Assigning number (1,8) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_03_dm01celadm02)
NOTE: Assigning number (1,9) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_11_dm01celadm02)
NOTE: Assigning number (1,10) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_01_dm01celadm02)
NOTE: Assigning number (1,11) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_04_dm01celadm02)
NOTE: Assigning number (1,21) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_05_dm01celadm02)
NOTE: Assigning number (1,43) to disk (o/192.168.10.7;192.168.10.8/DATAC1_CD_02_dm01celadm02)
NOTE: Assigning number (1,36) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_07_dm01celadm01)
NOTE: Assigning number (1,37) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_09_dm01celadm01)
NOTE: Assigning number (1,38) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_11_dm01celadm01)
NOTE: Assigning number (1,0) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_08_dm01celadm01)
NOTE: Assigning number (1,40) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_00_dm01celadm01)
NOTE: Assigning number (1,41) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_03_dm01celadm01)
NOTE: Assigning number (1,42) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_06_dm01celadm01)
NOTE: Assigning number (1,44) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_05_dm01celadm01)
NOTE: Assigning number (1,45) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_01_dm01celadm01)
NOTE: Assigning number (1,46) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_02_dm01celadm01)
NOTE: Assigning number (1,47) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_10_dm01celadm01)
NOTE: Assigning number (1,2) to disk (o/192.168.10.5;192.168.10.6/DATAC1_CD_04_dm01celadm01)
Thu Jul 20 22:01:28 2023
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 450 for pid 30, osid 171838
NOTE: Assigning number (1,32) to disk ()
NOTE: Assigning number (1,39) to disk ()
GMON querying group 1 at 451 for pid 30, osid 171838
NOTE: cache closing disk 32 of grp 1: (not open) 
NOTE: process _user171838_+asm1 (171838) 
     initiating offline of disk 39.3915945266 () with mask 0x7e[0x7f] in group 1
NOTE: initiating PST update: grp = 1, dsk = 39/0xe9689532, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 452 for pid 30, osid 171838
NOTE: cache closing disk 32 of grp 1: (not open) 
ERROR: Disk 39 cannot be offlined, since all the disks [39, 32] with mirrored data would be offline.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk  in mode 0x7f failed.
NOTE: cache dismounting (not clean) group 1/0x0728AD12 (DATAC1) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 171838, image: oracle@dm01dbadm01.gyzq.cn (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x0728AD12 (DATAC1) 
NOTE: cache ending mount (fail) of group DATAC1 number=1 incarn=0x0728ad12
NOTE: cache deleting context for group DATAC1 1/0x0728ad12
NOTE: cache closing disk 32 of grp 1: (not open) 
GMON dismounting group 1 at 453 for pid 30, osid 171838
NOTE: Disk DATAC1_CD_08_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_08_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_08_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_04_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x1 marked for de-assignment
NOTE: Disk DATAC1_CD_09_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM03 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_07_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_09_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_11_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_00_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_03_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_06_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM02 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_05_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_01_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_02_DM01CELADM01 in mode 0x7f marked for de-assignment
NOTE: Disk DATAC1_CD_10_DM01CELADM01 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATAC1 was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15066: offlining disk "39" in group "DATAC1" may result in a data loss
ORA-15042: ASM disk "39" is missing from group number "1" 
ORA-15042: ASM disk "32" is missing from group number "1" 
ERROR: alter diskgroup datac1 mount force

故障原因是由于asm disk 32还已经损坏在换盘过程中(数据没有reblance完成),又损坏了asm disk 39,而这两份磁盘中有数据互为镜像,因此磁盘组无法正常mount起来.

检查cell节点celldisk和griddisk情况,确认底层磁盘损坏
cellcli


对于这种情况,因为normal冗余的两份数据都有部分丢失,无法直接恢复数据,通过底层磁盘级别恢复(参考以前一次的Oracle exadata故障恢复:Oracle Exadata坏盘导致磁盘组无法mount恢复),然后比较顺利恢复数据,实现业务数据0丢失

SQL> alter datac1 mount;

Diskgroup altered.

SQL> alter diskgroup datac1 check all;

Diskgroup altered.

多套库顺利open成功
20230728124241


在实际恢复过程中由于客户进行了各种尝试,直接新镜像盘然后插入新盘,强制拉磁盘组drop异常disk操作等,导致第一现场发生一些破坏,增加了恢复难道,但是最终通过各种方法弥补,实现了预期的恢复效果(业务数据0丢失)

此条目发表在 Oracle备份恢复 分类目录,贴了 , , , , , , , , , 标签。将固定链接加入收藏夹。

评论功能已关闭。