分类目录归档:Oracle ASM

ORA-600 kfrValAcd30 恢复

有客户由于存储控制器损坏,修复控制器之后,asm无法正常mount,报ORA-600 kfrValAcd30错误,让我们提供技术支持
kfrValAcd30


asm alert日志报错

Wed Apr 03 16:50:57 2019
SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=1 incarn=0x14248741
NOTE: cache began mount (first) of group DATA number=1 incarn=0x14248741
NOTE: Assigning number (1,0) to disk (ORCL:DATAVOL1)
Wed Apr 03 16:51:03 2019
NOTE: start heartbeating (grp 1)
kfdp_query(DATA): 15 
kfdp_queryBg(): 15 
NOTE: cache opening disk 0 of grp 1: DATAVOL1 label:DATAVOL1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 1/0x14248741 (DATA)
Wed Apr 03 16:51:04 2019
* allocate domain 1, invalid = TRUE 
Wed Apr 03 16:51:04 2019
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=27.2697 group=1 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_15951.trc  (incident=23394):
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM2/incident/incdir_23394/+ASM2_ora_15951_i23394.trc
Abort recovery for domain 1
NOTE: crash recovery signalled OER-600
ERROR: ORA-600 signalled during mount of diskgroup DATA
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
ERROR: alter diskgroup data mount
NOTE: cache dismounting (clean) group 1/0x14248741 (DATA) 
NOTE: lgwr not being msg'd to dismount
freeing rdom 1
Wed Apr 03 16:51:05 2019
Sweep [inc][23394]: completed
Sweep [inc2][23394]: completed
Wed Apr 03 16:51:05 2019
Trace dumping is performing id=[cdmp_20190403165105]
NOTE: detached from domain 1
NOTE: cache dismounted group 1/0x14248741 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x14248741
kfdp_dismount(): 16 
kfdp_dismountBg(): 16 
NOTE: De-assigning number (1,0) from disk (ORCL:DATAVOL1)
ERROR: diskgroup DATA was not mounted
NOTE: cache deleting context for group DATA 1/337938241

mos相关记录
参考:ORA-600 [KFRVALACD30] in ASM (Doc ID 2123013.1)
kfrValAcd30-mos


ORA-00600: internal error code, arguments: [kfrValAcd30]可能是bug或者硬件故障导致.基于客户的情况,最大可能就是由于硬件故障导致asm 磁盘组的acd无法正常进行,从而无法mount成功.如果运气好,通过kfed相关修复可以正常mount成功,运气不好可以通过底层进行恢复数据文件,从而最大限度恢复数据.

发表在 Oracle ASM, 非常规恢复 | 标签为 , , | 评论关闭

又一例asm格式化文件系统恢复

又一个客户把win rac中的asm disk给格式化为ntfs了(data磁盘组由三个500G的磁盘组成,被格式化掉前面两个还剩下一个),而且格式化之后,还进行了一系列恢复(比如修复磁盘头,又进行分区等一些磁盘操作),导致恢复难度增加,也增加了一些数据覆盖
asm alert日志报错

Thu Aug 23 11:20:14 2018
NOTE: ASM client orcl1:orcl disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Process state recorded in trace file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_ora_2260.trc
Thu Aug 23 11:20:28 2018
Errors in file d:\app\administrator\diag\asm\+asm\+asm1\trace\+asm1_lgwr_3820.trc:
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 87) 参数错误。
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:26 disk_offset(bytes):27566080 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:3820
NOTE: unable to write any mirror side for diskgroup DATA
NOTE: cache initiating offline of disk 1 group DATA
NOTE: process 3268:3820 initiating offline of disk 1.4042301899 (DATA_0001) with mask 0x7e in group 2
WARNING: Disk DATA_0001 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 2, dsk = 1/0xf0f0a1cb, mode = 0x15
kfdp_updateDsk(): 22 
Thu Aug 23 11:20:28 2018
kfdp_updateDskBg(): 22 
ERROR: too many offline disks in PST (grp 2)
WARNING: Disk DATA_0001 in mode 0x7f offline aborted

数据库alert日志报错

WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:11992
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in 
group [2.1859146063] from disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-15080: 与磁盘的同步 I/O 操作失败
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 
in group 2 on disk 1 allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_11992.trc:
ORA-00202: 控制文件: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: 无法将 I/O 操作提交到磁盘
Thu Aug 23 11:20:13 2018
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: 异步读取/写入失败
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:841 disk_offset(bytes):882532352 io_size:131072 operation:Write type:asynchronous
	 result:I/O error process_id:3224
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: 与磁盘的同步 I/O 操作失败
WARNING: failed to write mirror side 1 of virtual extent 240 logical extent 0 of file 259 in group 2 on disk 1 
allocation unit 841 KCF: read, write or open error, block=0x7853 online=1
        file=4 '+DATA/orcl/datafile/users.259.944422883'
        error=15081 txt: ''
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 87) 参数错误。
WARNING: IO Failed. group:2 disk(number.incarnation):1.0xf0f0a1cb disk_path:\\.\ORCLDISKDATA1
	 AU:422 disk_offset(bytes):442515456 io_size:16384 operation:Read type:synchronous
	 result:I/O error process_id:3224
WARNING: failed to read mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group [2.1859146063] from 
disk DATA_0001  allocation unit 422 reason error; if possible,will try another mirror side 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-15080: 与磁盘的同步 I/O 操作失败
WARNING: failed to write mirror side 1 of virtual extent 5 logical extent 0 of file 260 in group 2 on disk 1 
allocation unit 422 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00202: 控制文件: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: 无法将 I/O 操作提交到磁盘
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_dbw1_3224.trc:
ORA-00204: 读取控制文件时出错 (块 41, # 块 1)
ORA-00202: 控制文件: ''+DATA/orcl/controlfile/current.260.944422981''
ORA-15081: 无法将 I/O 操作提交到磁盘
DBW1 (ospid: 3224): terminating the instance due to error 204

由于客户进行了一系列恢复恢复操作导致查看磁盘都不全

D:\>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
---这里还有一个磁盘没有正常显示
ORCLDISKDATA10                   \Device\Harddisk4\Partition1           511997M--客户尝试修复的磁盘
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

通过主机层面激活卷,删除分区等一系列操作,然后通过kfed构造磁盘头,让这些磁盘在os层面可以正常显示

C:\Users\Administrator>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              100M
NTFS                             \Device\Harddisk0\Partition2           102298M
NTFS                             \Device\Harddisk1\Partition1           102397M
NTFS                             \Device\Harddisk2\Partition1           204797M
------需要处理的磁盘------
ORCLDISKDATA0                    \Device\Harddisk3\Partition1           511997M
ORCLDISKDATA1                    \Device\Harddisk4\Partition1           511997M
ORCLDISKDATA2                    \Device\Harddisk5\Partition1           511997M
-----------------------
ORCLDISKRECOVERY0                \Device\Harddisk6\Partition1            51197M
ORCLDISKRECOVERY1                \Device\Harddisk7\Partition1            51197M
ORCLDISKRECOVERY2                \Device\Harddisk8\Partition1            51197M
ORCLDISKCRS0                     \Device\Harddisk9\Partition1            10237M
ORCLDISKCRS1                     \Device\Harddisk10\Partition1           10237M
ORCLDISKCRS2                     \Device\Harddisk11\Partition1           10237M
NTFS                             \Device\Harddisk12\Partition2         4194174M

由于asm磁盘组内部目录au被彻底损坏,导致无法通过asm直接拷贝出来数据,通过底层扫描,按照au恢复出来相关数据,由于格式化ntfs和后续的误操作导致部分数据au被覆盖.其余数据均恢复,抢救了绝大部分数据.
数据文件恢复参考:asm disk header 彻底损坏恢复
另外有一次win平台类似恢复经历:asm disk格式化为ntfs恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 Oracle ASM, 非常规恢复 | 标签为 , , , , | 评论关闭

asm disk 大小限制

这个问题在12C之前争议很小,基本共识非XD环境不能超过2T,但是到了后面的版本中,发生了一些改变,主要是COMPATIBLE.ASM and COMPATIBLE.RDBMS disk group attributes are set to 12.1 or greater的时候asm disk 大小限制依赖au size,
1M ausize asm disk limit为4 PB
2M ausize asm disk limit为8 PB
4M ausize asm disk limit为16 PB
8M ausize asm disk limit为32 PB

asm-limit-1
asm-limit-2


参见:Oracle ASM Storage Limits
18C中COMPATIBLE.ASM和COMPATIBLE.RDBMS默认值(COMPATIBLE.RDBMS为10.1,也就是说默认情况下非XD情况还是只能支持不超过2T的asm disk)
18c-asm

发表在 Oracle ASM | 标签为 , | 评论关闭