Manually changing ownership of multipath devices causes ASM disk group mount to fail with ORA-15032 and ORA-15131

Author: 惜分飞 © All rights reserved. [No reproduction in any form without my consent; I reserve the right to pursue legal action.]

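The customer's storage hardware had its RAID rebuilt at a low level, and the LUNs were then presented to the ASM host. Mounting the data_dg disk group failed with ORA-15032 and ORA-15131, and the disk group could not be mounted. This error is not very common: a failed mount usually either reports directly that a particular block cannot be accessed, or reports that an ASM disk is missing.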

After connecting remotely to analyze the problem, the alert log showed the following:

Wed Jul 31 04:55:17 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 04:55:17 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
WARNING: cache failed reading from group=1(DATA_DG) fn=1 blk=3 count=1 from disk= 0 
  (DATA_DG_0000) kfkist=0x20 status=0x02 osderr=0x0 file=kfc.c line=11596
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
ORA-15080: synchronous I/O operation to a disk failed
ERROR: cache failed to read group=1(DATA_DG) fn=1 blk=3 from disk(s): 0(DATA_DG_0000)
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ERROR: ORA-15131 signalled during mount of diskgroup DATA_DG
NOTE: cache dismounting (clean) group 1/0xA868BD55 (DATA_DG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 16915, image: oracle@xffdb2 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
Wed Jul 31 04:55:18 2024
List of instances:
 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 9)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
 2 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 1
WARNING: dirty detached from domain 1
WARNING: thread recovery enqueue was not held for domain 1 when doing a dirty detach
NOTE: cache dismounted group 1/0xA868BD55 (DATA_DG)
NOTE: cache ending mount (fail) of group DATA_DG number=1 incarn=0xa868bd55
NOTE: cache deleting context for group DATA_DG 1/0xa868bd55
GMON dismounting group 1 at 43 for pid 29, osid 16915
NOTE: Disk DATA_DG_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0005 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA_DG was not mounted
ORA-15032: not all alterations performed
ORA-15131: block  of file  in diskgroup  could not be read
ERROR: alter diskgroup data_dg mount

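It is basically certain that the read of disk 0, fn=1 blk=3 failed because the permissions on /dev/mapper/xffdb_data01_new were wrong. (It is somewhat odd that the permission error appeared only when this block was read, rather than earlier when the disk header was read.) Further analysis confirmed that the permissions on xffdb_data01_new were wrong.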
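When the alert log is long, the affected device paths can be pulled out mechanically. The following is a minimal sketch, not part of the original diagnosis: the function name `denied_disks` is hypothetical, and it assumes the log keeps the layout shown above, where an ORA-15025 line naming the disk is followed by the OS error line.

```shell
# Print each distinct disk path that ASM failed to open with OS error 13.
# Assumes the ORA-15025 line (with the quoted path) precedes the
# "Error: 13: Permission denied" line, as in the excerpt above.
denied_disks() {
    awk '
        # Remember the quoted path from the most recent ORA-15025 line.
        /ORA-15025: could not open disk/ { split($0, q, "\""); disk = q[2] }
        # Emit it only when the OS error that follows is EACCES (13).
        /Error: 13: Permission denied/ && disk != "" { print disk; disk = "" }
    ' "$@" | sort -u
}
```

Called with the ASM alert log as its argument, it prints each path once; for the excerpt above that is /dev/mapper/xffdb_data01_new.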

xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root      8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:13 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk     253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 04:55 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 04:55 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 04:55 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:13 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 04:55 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 04:55 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:13 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 04:28 /dev/dm-9
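
The two listings above have to be cross-referenced by hand, because the /dev/mapper entries are symlinks and the ownership that matters is on the dm-N node each one points to. A small helper can print both on one line. This is a sketch that assumes GNU coreutils (`readlink -f`, `stat -c`); the function name `show_target_owner` is hypothetical.

```shell
# For each path, print "<path> -> <resolved node> <owner>:<group> <octal mode>".
# The permission check applies to the resolved dm-N node, not to the symlink.
show_target_owner() {
    for link in "$@"; do
        # readlink -f follows the symlink chain to the real device node.
        target=$(readlink -f "$link") || continue
        # GNU stat format: %U owner name, %G group name, %a octal mode.
        printf '%s -> %s %s\n' "$link" "$target" "$(stat -c '%U:%G %a' "$target")"
    done
}
```

For example, `show_target_owner /dev/mapper/xffdb_*` lists every ASM candidate together with the owner of its underlying device, so a disk still owned by root:disk stands out immediately.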

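Further checking confirmed that the three xffdb_*_new disks had been mirrored over after the hardware recovery. The on-site engineer then changed the permissions on /dev/dm-12 through /dev/dm-14 by hand and tried to mount the disk group, which produced this error. Querying v$asm_disk again showed that none of the xffdb_*_new disks were in the list: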

GROUP_NUMBER DISK_NUMBER HEADER_STATUS         STATE          PATH
------------ ----------- --------------------- -------------- --------------------------
           0           2 MEMBER                NORMAL         /dev/mapper/xffdb_data03
           0           3 MEMBER                NORMAL         /dev/mapper/xffdb_data06
           0           4 MEMBER                NORMAL         /dev/mapper/xffdb_data04
           3           1 MEMBER                NORMAL         /dev/mapper/xffdb_vote2
           2           0 MEMBER                NORMAL         /dev/mapper/xffdb_log1
           3           2 MEMBER                NORMAL         /dev/mapper/xffdb_vote3
           2           1 MEMBER                NORMAL         /dev/mapper/xffdb_log2

7 rows selected.

Checking the disk permissions again:

xffdb2:/dev/mapper$ls -ltr
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
lrwxrwxrwx 1 root root      8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      8 Jul 31 05:15 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 05:15 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data03 -> ../dm-2
xffdb2:/dev/mapper$ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:22 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk     253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 root disk     253, 13 Jul 31 05:15 /dev/dm-13
brw-rw---- 1 root disk     253, 14 Jul 31 05:15 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 05:15 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:22 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 05:15 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 05:15 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:22 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 04:28 /dev/dm-9

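This later check showed that the devices had reverted to root:disk, so grid could no longer access them. At this point it was fairly clear that the ownership of the three restored multipath devices changes when they are accessed. This usually happens because the devices are not bound by udev rules. After udev rules were added to set the owner, group, and permissions for these three disks, the permissions stopped changing and v$asm_disk showed the disks correctly: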

[root@xffdb2 rules.d]# ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 31 05:26 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 31 05:26 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:26 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 05:26 /dev/dm-11
brw-rw---- 1 grid asmadmin 253, 12 Jul 31 05:26 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 05:26 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 05:26 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 31 05:26 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 31 05:26 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 05:26 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 05:26 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 05:26 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:26 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 05:26 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 05:26 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 05:26 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:26 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 05:26 /dev/dm-9
[root@xffdb2 rules.d]# ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 31 05:26 control
lrwxrwxrwx 1 root root      8 Jul 31 05:26 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 05:26 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      8 Jul 31 05:26 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 31 05:26 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      7 Jul 31 05:26 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 31 05:26 vg_xffdb2-LogVol02 -> ../dm-16
[root@xffdb2 rules.d]# 
SQL> /

GROUP_NUMBER DISK_NUMBER HEADER_STATUS                        STATE                    PATH
------------ ----------- ------------------------------------ ------------------------ -----------------------------
           0           0 MEMBER                               NORMAL                   /dev/mapper/xffdb_data01_new
           0           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_data05_new
           0           2 MEMBER                               NORMAL                   /dev/mapper/xffdb_data03
           0           3 MEMBER                               NORMAL                   /dev/mapper/xffdb_data06
           0           4 MEMBER                               NORMAL                   /dev/mapper/xffdb_data04
           0           5 MEMBER                               NORMAL                   /dev/mapper/xffdb_data02_new
           3           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_vote2
           2           0 MEMBER                               NORMAL                   /dev/mapper/xffdb_log1
           3           2 MEMBER                               NORMAL                   /dev/mapper/xffdb_vote3
           2           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_log2

10 rows selected.

The disk group then mounted successfully:

SQL>  alter diskgroup data_dg mount 
NOTE: cache registered group DATA_DG number=1 incarn=0x4178bd5e
NOTE: cache began mount (first) of group DATA_DG number=1 incarn=0x4178bd5e
NOTE: Assigning number (1,0) to disk (/dev/mapper/xffdb_data01_new)
NOTE: Assigning number (1,4) to disk (/dev/mapper/xffdb_data05_new)
NOTE: Assigning number (1,2) to disk (/dev/mapper/xffdb_data03)
NOTE: Assigning number (1,5) to disk (/dev/mapper/xffdb_data06)
NOTE: Assigning number (1,3) to disk (/dev/mapper/xffdb_data04)
NOTE: Assigning number (1,1) to disk (/dev/mapper/xffdb_data02_new)
Wed Jul 31 05:27:47 2024
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 46 for pid 29, osid 26738
NOTE: cache opening disk 0 of grp 1: DATA_DG_0000 path:/dev/mapper/xffdb_data01_new
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA_DG_0001 path:/dev/mapper/xffdb_data02_new
NOTE: cache opening disk 2 of grp 1: DATA_DG_0002 path:/dev/mapper/xffdb_data03
NOTE: cache opening disk 3 of grp 1: DATA_DG_0003 path:/dev/mapper/xffdb_data04
NOTE: cache opening disk 4 of grp 1: DATA_DG_0004 path:/dev/mapper/xffdb_data05_new
NOTE: cache opening disk 5 of grp 1: DATA_DG_0005 path:/dev/mapper/xffdb_data06
NOTE: cache mounting (first) external redundancy group 1/0x4178BD5E (DATA_DG)
Wed Jul 31 05:27:47 2024
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 1
Wed Jul 31 05:27:47 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 05:27:47 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR found thread 1 closed at ABA 12401.4517
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR opening thread 1 at fcn 0.1814063801 ABA 12402.4518
NOTE: cache mounting group 1/0x4178BD5E (DATA_DG) succeeded
NOTE: cache ending mount (success) of group DATA_DG number=1 incarn=0x4178bd5e
Wed Jul 31 05:27:47 2024
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATA_DG was mounted
SUCCESS:  alter diskgroup data_dg mount

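Important: if the owner or permissions of a multipath device are changed by hand, they may revert to the default root:disk when the device is accessed. For such devices, set the owner, group, and permissions through udev instead.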
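
The udev rules used in this case are not shown, so the following is only a sketch of what such a binding commonly looks like. It assumes a udev version that exports the device-mapper name as `DM_NAME` (the entry name under /dev/mapper) and the grid:asmadmin ownership seen in the listings; the function name `asm_udev_rule` and the rule file name mentioned below are hypothetical.

```shell
# Emit one udev rule line that pins owner, group, and mode for a multipath alias.
# $1 = device-mapper name (assumed to match ENV{DM_NAME}),
# $2 = owner (default grid), $3 = group (default asmadmin).
asm_udev_rule() {
    name=$1
    owner=${2:-grid}
    group=${3:-asmadmin}
    printf 'KERNEL=="dm-*", ENV{DM_NAME}=="%s", OWNER="%s", GROUP="%s", MODE="0660"\n' \
        "$name" "$owner" "$group"
}
```

The output for the three restored disks could be redirected to a file such as /etc/udev/rules.d/99-oracle-asm.rules (hypothetical name), followed by `udevadm control --reload-rules` and `udevadm trigger --type=devices --action=change` to apply it. Because udev applies the rule each time it processes an event for the device, the ownership no longer depends on a manual chown. Running `udevadm info --query=property --name=/dev/dm-14` shows whether `DM_NAME` is actually set on a given system.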
