Category archive: Oracle ASM
ORA-15096: lost disk write detected
Another recovery request caused by a storage power loss: the ASM disk group could not be mounted because of ORA-15096: lost disk write detected.
SQL> ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:45277:148} */
NOTE: cache registered group DATA number=2 incarn=0x73886b6a
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73886b6a
NOTE: Assigning number (2,2) to disk (/dev/asm-data3)
NOTE: Assigning number (2,1) to disk (/dev/asm-data2)
NOTE: Assigning number (2,0) to disk (/dev/asm-data1)
Fri Nov 06 19:06:56 2020
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 94 for pid 30, osid 11596
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/asm-data1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/asm-data2
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/asm-data3
NOTE: cache mounting (first) external redundancy group 2/0x73886B6A (DATA)
Fri Nov 06 19:06:57 2020
* allocate domain 2, invalid = TRUE
kjbdomatt send to inst 2
Fri Nov 06 19:06:57 2020
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=25.7986 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=33.364 group=2 (DATA)
NOTE: BWR validation signaled ORA-15096
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_11596.trc:
ORA-15096: lost disk write detected
NOTE: crash recovery signalled OER-15096
ERROR: ORA-15096 signalled during mount of diskgroup DATA
NOTE: cache dismounting (clean) group 2/0x73886B6A (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 11596, image: oracle@db1 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
freeing rdom 2
NOTE: detached from domain 2
NOTE: cache dismounted group 2/0x73886B6A (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x73886b6a
NOTE: cache deleting context for group DATA 2/0x73886b6a
GMON dismounting group 2 at 95 for pid 30, osid 11596
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15096: lost disk write detected
ERROR: ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:45277:148} */
After analysis and a series of repairs, the database was mounted, but recovery then failed with ORA-600 [2130]:
Fri Nov 06 17:03:27 2020
ALTER DATABASE RECOVER database
Media Recovery Start
started logmerger process
Parallel Media Recovery started with 40 slaves
Fri Nov 06 17:03:29 2020
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_pr00_7393.trc (incident=195869):
ORA-00600: internal error code, arguments: [2130], [2], [1], [2], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_195869/ynhis1_pr00_7393_i195869.trc
Fri Nov 06 17:03:30 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-10877 signalled during: ALTER DATABASE RECOVER database ...
The redo was judged to be damaged, so the database was opened with resetlogs, which failed with ORA-00600 [2662]:
Fri Nov 06 18:21:32 2020
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 8670753264
Resetting resetlogs activation ID 306909514 (0x124b114a)
Redo thread 2 enabled by open resetlogs or standby activation
Fri Nov 06 18:21:39 2020
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 8670753267, threshold SCN value is 0
Fri Nov 06 18:21:39 2020
Assigning activation ID 408224320 (0x18550240)
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: /orabak/data/group_1.289.954514319
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 06 18:21:40 2020
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc (incident=231847):
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365], [4194545], [], [], [], [], [],[]
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_231847/ynhis1_ora_24310_i231847.trc
Fri Nov 06 18:21:42 2020
Dumping diagnostic data in directory=[cdmp_20201106182142],requested by(instance=1,osid=24310),summary=[incident=231847]
Fri Nov 06 18:21:43 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Error 704 happened during db open, shutting down database
USER (ospid: 24310): terminating the instance due to error 704
Instance terminated by USER, pid = 24310
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (24310) as a result of ORA-1092
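To make the ORA-00600 [2662] arguments in the log above easier to read, here is a minimal Python sketch. It assumes the commonly cited reading of the arguments (current SCN wrap and base, dependent SCN wrap and base, then a data block address) and the classic smallfile DBA layout of 10 bits of relative file number plus 22 bits of block number. Neither assumption comes from the post itself, and the helper names are made up for illustration.

```python
# Sketch only: the argument meanings below are an assumption (the commonly
# cited interpretation of ORA-600 [2662]), not something stated in the post.
SCN_WRAP_SIZE = 1 << 32  # one SCN "wrap" covers 2**32 base values


def scn(wrap, base):
    """Combine an SCN wrap/base pair into a single number."""
    return wrap * SCN_WRAP_SIZE + base


def decode_dba(dba):
    """Split a smallfile data block address into (relative file#, block#)."""
    return dba >> 22, dba & ((1 << 22) - 1)


# Arguments copied from the alert log: [2], [80818679], [2], [93545365], [4194545]
args = [2, 80818679, 2, 93545365, 4194545]

current_scn = scn(args[0], args[1])
dependent_scn = scn(args[2], args[3])
scn_gap = dependent_scn - current_scn
file_no, block_no = decode_dba(args[4])

print("current SCN  :", current_scn)
print("dependent SCN:", dependent_scn)
print("gap          :", scn_gap)
print("block        : file", file_no, "block", block_no)
```

Under these assumptions the current SCN works out to 8670753271, just above the 8670753267 reported in the "High Database SCN" warning, while the block being read expects an SCN about 12.7 million higher.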
After that error was dealt with and the database was opened with resetlogs again, the open succeeded but ORA-00600 [4137] was reported:
Database Characterset is ZHS16GBK
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc (incident=255799):
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_255799/ynhis1_smon_26195_i255799.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
No Resource Manager plan active
Fri Nov 06 18:30:46 2020
replication_dependency_tracking turned off (no async multimaster replication found)
ORACLE Instance ynhis1 (pid = 23) - Error 600 encountered while recovering transaction (25, 33).
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc:
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []
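After the damaged undo was dealt with, the database could be started and shut down normally. The data was then exported and imported into a new database, which completed the recovery.
Recovering corrupted ASM disk headers on Windows
A friend reported that a RAC on Windows had failed and ASM could not mount. The log showed: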
Fri Jul 03 03:55:46 2020
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7004.trc:
ORA-15025: could not open disk "\\.\ORCLDISKDATA1"
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7004.trc:
ORA-15025: could not open disk "\\.\ORCLDISKDATA1"
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 267 in group [2.2254399778] from disk DATA_0000 allocation unit 3502 reason error; if possible, will try another mirror side
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7004.trc:
ORA-15081: failed to submit an I/O operation to a disk
Fri Jul 03 03:59:46 2020
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7328.trc:
ORA-15025: could not open disk "\\.\ORCLDISKDATA1"
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7328.trc:
ORA-15025: could not open disk "\\.\ORCLDISKDATA1"
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 267 in group [2.2254399778] from disk DATA_0000 allocation unit 3502 reason error; if possible, will try another mirror side
Errors in file C:\APP\ADMINISTRATOR\diag\asm\+asm\+asm2\trace\+asm2_ora_7328.trc:
ORA-15081: failed to submit an I/O operation to a disk
The errors make it fairly clear that the failure is caused by the disk \\.\ORCLDISKDATA1 not being found, so the disks were listed with asmtool:
C:\app\11.2.0\grid>asmtool -list
NTFS                             \Device\Harddisk0\Partition3            81920M
NTFS                             \Device\Harddisk0\Partition4           200000M
NTFS                             \Device\Harddisk0\Partition5          4293849M
                                 \Device\Harddisk1\Partition2             4062M
                                 \Device\Harddisk2\Partition2          2097022M
ORCLDISKFRA0                     \Device\Harddisk3\Partition2           511870M
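The check done by eye on the asmtool listing can be sketched in Python as follows. This is only an illustration: the parsing assumes each row is an optional label, a \Device\... path and a size, as in the listing above, and the list of expected labels is an assumption built from the two disk names that appear in this post (the function names are hypothetical as well).

```python
# Sketch only: assumes the row layout seen above (optional label, device, size).
ASMTOOL_OUTPUT = r"""
C:\app\11.2.0\grid>asmtool -list
NTFS                             \Device\Harddisk0\Partition3            81920M
NTFS                             \Device\Harddisk0\Partition4           200000M
NTFS                             \Device\Harddisk0\Partition5          4293849M
                                 \Device\Harddisk1\Partition2             4062M
                                 \Device\Harddisk2\Partition2          2097022M
ORCLDISKFRA0                     \Device\Harddisk3\Partition2           511870M
"""


def parse_asmtool_list(text):
    """Return (label, device, size) tuples; label is '' for unlabeled rows."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        # Keep only rows whose second-to-last column is a device path.
        if len(parts) < 2 or not parts[-2].startswith("\\Device\\"):
            continue
        label = parts[0] if len(parts) == 3 else ""
        entries.append((label, parts[-2], parts[-1]))
    return entries


def missing_asm_disks(text, expected_labels):
    """List expected ORCLDISK labels that asmtool no longer reports."""
    found = {label for label, _, _ in parse_asmtool_list(text)
             if label.startswith("ORCLDISK")}
    return sorted(set(expected_labels) - found)


entries = parse_asmtool_list(ASMTOOL_OUTPUT)
# Hypothetical expectation: only the labels mentioned in this post.
missing = missing_asm_disks(ASMTOOL_OUTPUT, ["ORCLDISKDATA1", "ORCLDISKFRA0"])
print(missing)
```

The two unlabeled partitions (Harddisk1 and Harddisk2) are the natural candidates for disks whose ASM stamp was lost.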
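ORCLDISKDATA1 is clearly missing from the list. Copying the disk to local storage with dd and analyzing the image showed that the ASM disk header was corrupted: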
C:\Users\Administrator>kfed read F:\temp\disk3\1\disk2.dd
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
006B38C00 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

C:\Users\Administrator>kfed read F:\temp\disk3\1\disk2.dd blkn=2
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       2 ; 0x004: blk=2
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2349305287 ; 0x00c: 0x8c078dc7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdatb.aunum:                         0 ; 0x000: 0x00000000
kfdatb.shrink:                      448 ; 0x004: 0x01c0
kfdatb.ub2pad:                        0 ; 0x006: 0x0000
kfdatb.auinfo[0].link.next:           8 ; 0x008: 0x0008
kfdatb.auinfo[0].link.prev:           8 ; 0x00a: 0x0008
kfdatb.auinfo[1].link.next:          12 ; 0x00c: 0x000c
kfdatb.auinfo[1].link.prev:          12 ; 0x00e: 0x000c
kfdatb.auinfo[2].link.next:         456 ; 0x010: 0x01c8
kfdatb.auinfo[2].link.prev:         456 ; 0x012: 0x01c8
kfdatb.auinfo[3].link.next:         488 ; 0x014: 0x01e8
kfdatb.auinfo[3].link.prev:         488 ; 0x016: 0x01e8
kfdatb.auinfo[4].link.next:          24 ; 0x018: 0x0018
kfdatb.auinfo[4].link.prev:          24 ; 0x01a: 0x0018
kfdatb.auinfo[5].link.next:          28 ; 0x01c: 0x001c
kfdatb.auinfo[5].link.prev:          28 ; 0x01e: 0x001c
kfdatb.auinfo[6].link.next:         552 ; 0x020: 0x0228
kfdatb.auinfo[6].link.prev:        3112 ; 0x022: 0x0c28
kfdatb.spare:                         0 ; 0x024: 0x00000000
kfdate[0].discriminator:              1 ; 0x028: 0x00000001
kfdate[0].allo.lo:                    0 ; 0x028: XNUM=0x0
kfdate[0].allo.hi:              8388608 ; 0x02c: V=1 I=0 H=0 FNUM=0x0
kfdate[1].discriminator:              1 ; 0x030: 0x00000001
kfdate[1].allo.lo:                    0 ; 0x030: XNUM=0x0
kfdate[1].allo.hi:              8388608 ; 0x034: V=1 I=0 H=0 FNUM=0x0
kfdate[2].discriminator:              1 ; 0x038: 0x00000001
kfdate[2].allo.lo:                    0 ; 0x038: XNUM=0x0
kfdate[2].allo.hi:              8388609 ; 0x03c: V=1 I=0 H=0 FNUM=0x1
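For readers without kfed at hand, the first few header fields can be decoded with a few lines of Python. This is a rough sketch, not a replacement for kfed: the field offsets are taken from the kfed output above, the 4096-byte metadata block size and the rule "endian byte 1 means little-endian" are assumptions, and the type table lists only the two values seen above plus KFBTYP_DISKHEAD (1), which is what block 0 of a healthy disk is expected to report.

```python
import struct

# Only the block types needed here; the real list is longer.
KFBTYP_NAMES = {0: "KFBTYP_INVALID", 1: "KFBTYP_DISKHEAD", 3: "KFBTYP_ALLOCTBL"}


def decode_kfbh(block):
    """Decode the first 16 bytes of an ASM metadata block (offsets as in kfed)."""
    if len(block) < 16:
        raise ValueError("need at least 16 bytes of block header")
    endian, hard, btype, datfmt = block[0], block[1], block[2], block[3]
    # Assumption: endian byte 1 = little-endian, anything else = big-endian.
    order = "<" if endian == 1 else ">"
    blk, obj, check = struct.unpack_from(order + "III", block, 4)
    return {
        "endian": endian,
        "hard": hard,
        "type": btype,
        "type_name": KFBTYP_NAMES.get(btype, "unknown"),
        "datfmt": datfmt,
        "blk": blk,
        "obj": obj,
        "check": check,
    }


# Synthetic blocks rebuilt from the two kfed dumps above (4096 bytes assumed).
zeroed_block = bytes(4096)
alloc_block = (bytes([1, 0x82, 3, 2])
               + struct.pack("<III", 2, 2147483648, 0x8C078DC7)
               + bytes(4096 - 16))

zeroed = decode_kfbh(zeroed_block)
alloc = decode_kfbh(alloc_block)
print(zeroed["type_name"], alloc["type_name"], hex(alloc["check"]))
```

On a real image one would read block 0 with `open(path, "rb").read(4096)` and expect KFBTYP_DISKHEAD; an all-zero result like the first dump means the header has been wiped while later blocks (such as the allocation table in block 2) may still be intact.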
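Although the FRA disk still carries its ASM label, the rest of its header is damaged as well. Even so, only the disk header is affected.
The on-site analysis makes it fairly certain that, for some reason, the headers of all the Windows ASM disks were damaged (two headers were zeroed out and the third was largely corrupted). The root cause is unknown.
The customer had an RMAN backup from the previous day and needed the failed environment preserved for further root-cause analysis. So instead of repairing anything in place, and without modifying the original environment at all, the redo logs and archived logs were extracted directly from the FRA disk and combined with the backup to restore the database at another location. This achieved zero data loss while leaving the original environment untouched.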
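Similar ASM disk header cases I have handled before on other operating system platforms:
ASM disk partition loss recovery
pvid=yes prevents ASM from mounting
Zero-data-loss recovery with all ASM disk headers corrupted
Unrecognized partition prevents ASM diskgroup from mounting
Recovery after a PVID mistakenly set on an ASM disk prevented the diskgroup from mounting
Troubleshooting ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh]
The customer was expanding ASM. Because of a misconfiguration, when adding ASM disks with asmca they selected disks that already belonged to a VG used for file systems: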
Tue Nov 19 09:48:48 2019
Non critical error ORA-48180 c
Fri Nov 22 12:47:48 2019
SQL> ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk29' SIZE 491520M , '/dev/rhdisk30' SIZE 491520M , '/dev/rhdisk31' SIZE 491520M /* ASMCA */
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (4,15) to disk (/dev/rhdisk29)
NOTE: Assigning number (4,16) to disk (/dev/rhdisk30)
NOTE: Assigning number (4,17) to disk (/dev/rhdisk31)
NOTE: requesting all-instance membership refresh for group=4
NOTE: initializing header on grp 4 disk XIFENFEI_0015
NOTE: initializing header on grp 4 disk XIFENFEI_0016
NOTE: initializing header on grp 4 disk XIFENFEI_0017
NOTE: requesting all-instance disk validation for group=4
Fri Nov 22 12:47:51 2019
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
NOTE: requesting all-instance disk validation for group=4
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
Fri Nov 22 12:47:59 2019
NOTE: initiating PST update: grp = 4
Fri Nov 22 12:47:59 2019
GMON updating group 4 at 12 for pid 27, osid 12649908
NOTE: PST update grp = 4 completed successfully
NOTE: membership refresh pending for group 4/0xb08c40b (XIFENFEI)
GMON querying group 4 at 13 for pid 18, osid 39912680
Fri Nov 22 12:48:01 2019
NOTE: cache opening disk 15 of grp 4: XIFENFEI_0015 path:/dev/rhdisk29
NOTE: cache opening disk 16 of grp 4: XIFENFEI_0016 path:/dev/rhdisk30
NOTE: cache opening disk 17 of grp 4: XIFENFEI_0017 path:/dev/rhdisk31
NOTE: Attempting voting file refresh on diskgroup XIFENFEI
NOTE: Refresh completed on diskgroup XIFENFEI. No voting file found.
GMON querying group 4 at 14 for pid 18, osid 39912680
SUCCESS: refreshed membership for 4/0xb08c40b (XIFENFEI)
SUCCESS: ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk29' SIZE 491520M , '/dev/rhdisk30' SIZE 491520M , '/dev/rhdisk31' SIZE 491520M /* ASMCA */
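After noticing that the wrong disks had been added, they forcibly removed the disks now used by ASM from the VG, then tried to drop those disks from ASM and add new ones: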
Fri Nov 22 12:52:03 2019
SQL> ALTER DISKGROUP XIFENFEI DROP DISK 'XIFENFEI_0015','XIFENFEI_0016','XIFENFEI_0017' /* ASMCA */
NOTE: GroupBlock outside rolling migration privileged region
Fri Nov 22 12:52:03 2019
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 4/0xb08c40b (XIFENFEI)
NOTE: requesting all-instance membership refresh for group=4
NOTE: membership refresh pending for group 4/0xb08c40b (XIFENFEI)
Fri Nov 22 12:52:12 2019
GMON querying group 4 at 15 for pid 18, osid 39912680
SUCCESS: refreshed membership for 4/0xb08c40b (XIFENFEI)
SUCCESS: ALTER DISKGROUP XIFENFEI DROP DISK 'XIFENFEI_0015','XIFENFEI_0016','XIFENFEI_0017' /* ASMCA */
NOTE: starting rebalance of group 4/0xb08c40b (XIFENFEI) at power 1
Starting background process ARB0
…………
Fri Nov 22 12:58:26 2019
SQL> ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk7' SIZE 491520M /* ASMCA */
NOTE: GroupBlock outside rolling migration privileged region
Fri Nov 22 12:58:26 2019
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 4/0xb08c40b (XIFENFEI)
NOTE: ASM did background COD recovery for group 4/0xb08c40b (XIFENFEI)
NOTE: Assigning number (4,18) to disk (/dev/rhdisk7)
NOTE: requesting all-instance membership refresh for group=4
NOTE: initializing header on grp 4 disk XIFENFEI_0018
NOTE: requesting all-instance disk validation for group=4
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
NOTE: requesting all-instance disk validation for group=4
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
Fri Nov 22 12:58:41 2019
NOTE: initiating PST update: grp = 4
Fri Nov 22 12:58:41 2019
GMON updating group 4 at 16 for pid 27, osid 12649908
NOTE: PST update grp = 4 completed successfully
Fri Nov 22 12:58:41 2019
NOTE: membership refresh pending for group 4/0xb08c40b (XIFENFEI)
GMON querying group 4 at 17 for pid 18, osid 39912680
NOTE: cache opening disk 18 of grp 4: XIFENFEI_0018 path:/dev/rhdisk7
NOTE: Attempting voting file refresh on diskgroup XIFENFEI
NOTE: Refresh completed on diskgroup XIFENFEI. No voting file found.
GMON querying group 4 at 18 for pid 18, osid 39912680
SUCCESS: refreshed membership for 4/0xb08c40b (XIFENFEI)
NOTE: starting rebalance of group 4/0xb08c40b (XIFENFEI) at power 1
SUCCESS: ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk7' SIZE 491520M /* ASMCA */
Starting background process ARB0
Fri Nov 22 12:58:46 2019
ARB0 started with pid=44, OS id=54460432
…………
Fri Nov 22 12:59:57 2019
SQL> ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk10' SIZE 491520M , '/dev/rhdisk11' SIZE 491520M , '/dev/rhdisk8' SIZE 491520M , '/dev/rhdisk9' SIZE 491520M /* ASMCA */
NOTE: GroupBlock outside rolling migration privileged region
Fri Nov 22 12:59:57 2019
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 4/0xb08c40b (XIFENFEI)
NOTE: ASM did background COD recovery for group 4/0xb08c40b (XIFENFEI)
NOTE: Assigning number (4,19) to disk (/dev/rhdisk10)
NOTE: Assigning number (4,20) to disk (/dev/rhdisk11)
NOTE: Assigning number (4,21) to disk (/dev/rhdisk8)
NOTE: Assigning number (4,22) to disk (/dev/rhdisk9)
NOTE: requesting all-instance membership refresh for group=4
NOTE: initializing header on grp 4 disk XIFENFEI_0019
NOTE: initializing header on grp 4 disk XIFENFEI_0020
NOTE: initializing header on grp 4 disk XIFENFEI_0021
NOTE: initializing header on grp 4 disk XIFENFEI_0022
NOTE: requesting all-instance disk validation for group=4
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
Fri Nov 22 13:00:08 2019
NOTE: requesting all-instance disk validation for group=4
Fri Nov 22 13:00:08 2019
NOTE: skipping rediscovery for group 4/0xb08c40b (XIFENFEI) on local instance.
NOTE: initiating PST update: grp = 4
Fri Nov 22 13:00:13 2019
GMON updating group 4 at 19 for pid 27, osid 12649908
NOTE: PST update grp = 4 completed successfully
NOTE: membership refresh pending for group 4/0xb08c40b (XIFENFEI)
GMON querying group 4 at 20 for pid 18, osid 39912680
NOTE: cache opening disk 19 of grp 4: XIFENFEI_0019 path:/dev/rhdisk10
NOTE: cache opening disk 20 of grp 4: XIFENFEI_0020 path:/dev/rhdisk11
NOTE: cache opening disk 21 of grp 4: XIFENFEI_0021 path:/dev/rhdisk8
NOTE: cache opening disk 22 of grp 4: XIFENFEI_0022 path:/dev/rhdisk9
NOTE: Attempting voting file refresh on diskgroup XIFENFEI
NOTE: Refresh completed on diskgroup XIFENFEI. No voting file found.
GMON querying group 4 at 21 for pid 18, osid 39912680
SUCCESS: refreshed membership for 4/0xb08c40b (XIFENFEI)
SUCCESS: ALTER DISKGROUP XIFENFEI ADD DISK '/dev/rhdisk10' SIZE 491520M , '/dev/rhdisk11' SIZE 491520M , '/dev/rhdisk8' SIZE 491520M , '/dev/rhdisk9' SIZE 491520M /* ASMCA */
NOTE: starting rebalance of group 4/0xb08c40b (XIFENFEI) at power 1
Starting background process ARB0
While the rebalance was running, ASM hit a corrupt block, which caused the disk group to dismount:
Sun Nov 24 04:42:27 2019
NOTE: group 4 PST updated.
WARNING: cache read a corrupt block: group=4(XIFENFEI) dsk=15 blk=258 disk=15 (XIFENFEI_0015) incarn=1717056824 au=113792 blk=2 count=254
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_x000_28639240.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483663] [258] [56 != 0]
NOTE: a corrupted block from group XIFENFEI was dumped to /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_x000_28639240.trc
WARNING: cache read (retry) a corrupt block: group=4(XIFENFEI) dsk=15 blk=258 disk=15 (XIFENFEI_0015) incarn=1717056824 au=113792 blk=2 count=1
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_x000_28639240.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483663] [258] [56 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483663] [258] [56 != 0]
ERROR: cache failed to read group=4(XIFENFEI) dsk=15 blk=258 from disk(s): 15(XIFENFEI_0015)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483663] [258] [56 != 0]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483663] [258] [56 != 0]
NOTE: cache initiating offline of disk 15 group XIFENFEI
NOTE: process _x000_+asm2 (28639240) initiating offline of disk 15.1717056824 (XIFENFEI_0015) with mask 0x7e in group 4
NOTE: initiating PST update: grp = 4, dsk = 15/0x66583538, mask = 0x6a, op = clear
GMON updating disk modes for group 4 at 23 for pid 28, osid 28639240
ERROR: Disk 15 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 4)
Sun Nov 24 04:42:27 2019
NOTE: cache dismounting (not clean) group 4/0x0B08C40B (XIFENFEI)
WARNING: Offline for disk XIFENFEI_0015 in mode 0x7f failed.
Sun Nov 24 04:42:27 2019
NOTE: halting all I/Os to diskgroup 4 (XIFENFEI)
NOTE: messaging CKPT to quiesce pins Unix process pid: 59441780, image: oracle@xifenfei2 (B000)
Sun Nov 24 04:42:27 2019
ERROR: ORA-15130 thrown in ARB0 for group number 4
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_arb0_50856926.trc:
ORA-15130: diskgroup "XIFENFEI" is being dismounted
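The bracketed values in the ORA-15196 message can be tied back to the disk named in the surrounding log lines with a small Python sketch. The interpretation is an inference, not documented behavior: the high bit of the object number is treated as "this is a disk, the low bits are the disk number" because kfed prints 2147483648 as disk=0 and 0 as file=0 in the Windows case earlier on this page, and the two numbers in "56 != 0" are labeled found and expected as an assumption.

```python
import re

DISK_FLAG = 0x80000000  # assumption: high bit marks a per-disk metadata object

ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[([^\]]+)\] \[([^\]]+)\] \[(\d+)\] \[(\d+)\] \[(\d+) != (\d+)\]"
)


def decode_block_obj(obj):
    """Return ('disk', n) when the high bit is set, otherwise ('file', n)."""
    if obj & DISK_FLAG:
        return "disk", obj & (DISK_FLAG - 1)
    return "file", obj


def parse_ora_15196(line):
    """Split an ORA-15196 message into its bracketed parts."""
    match = ORA_15196.search(line)
    if match is None:
        return None
    source, check, obj, blk, found, expected = match.groups()
    kind, number = decode_block_obj(int(obj))
    return {
        "source": source,           # e.g. kfc.c:26368
        "check": check,             # e.g. endian_kfbh
        "kind": kind,
        "number": number,
        "blk": int(blk),
        "found": int(found),        # assumed meaning of the left-hand value
        "expected": int(expected),  # assumed meaning of the right-hand value
    }


line = ("ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] "
        "[2147483663] [258] [56 != 0]")
info = parse_ora_15196(line)
print(info)
```

Under this reading the message points at disk 15, block 258, which matches the `dsk=15 blk=258` in the WARNING lines, and the endian byte holds 56 where 0 was expected.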
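From this point on, the disk group on both nodes was stuck in an endless cycle of mounting and then dismounting. A rough analysis: either data was written to the disks through the VG, or the forced removal from the VG damaged what ASM had written to them. The subsequent ASM rebalance then hit corrupt blocks and the disk group dismounted. The fix was to patch the disk group's ACD and COD so that it no longer attempts the rebalance, keep the disk group in its current, stable mounted state, then back up the data and rebuild the disk group. This customer was fairly lucky: little had been written through the VG to the ASM disks, and the database was running normally.
If the damage in a case like this is extreme, for example the ASM disk group cannot be mounted at all, see: Retrieving data files from ASM
If the ASM metadata is heavily damaged and recovery at the ASM dictionary level is not possible, see: Recovery from completely corrupted ASM disk headers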
Posted in Oracle ASM
Tagged asm vg anomaly, endian_kfbh, kfc.c:26368, ORA-15196, ORA-15196: invalid ASM block header