Category archive: Oracle ASM
ORA-600 kffmLoad_1 kffmVerify_4
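A friend reported that after ASM has been running for a while, the ASM instance raises errors that bring down the database instance: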
Wed Dec 23 08:31:55 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_6729.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [4365], [1], [], [], [], [], []
Wed Dec 23 08:31:55 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_6729.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [4365], [1], [], [], [], [], []
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_29743.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [670], [1], [], [], [], [], []
Wed Dec 23 09:10:22 2020
Errors in file /u01/app/oracle/admin/+ASM/bdump/+asm1_asmb_29743.trc:
ORA-00600: internal error code, arguments: [kffmLoad_1], [670], [1], [], [], [], [], []
Wed Dec 23 09:10:22 2020
Wed Dec 23 10:18:33 2020
Errors in file /u01/app/oracle/admin/+ASM/udump/+asm1_ora_25890.trc:
ORA-00600: internal error code, arguments: [kffmVerify_4], [0], [0], [887], [1005986561], [1352], [1], [0]
The corresponding trace file:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u01/app/oracle/product/10.2.0/db System name: Linux Node name: shb01 Release: 2.6.18-348.el5 Version: #1 SMP Wed Nov 28 21:22:00 EST 2012 Machine: x86_64 Instance name: +ASM1 Redo thread mounted by this instance: 0 <none> Oracle process number: 29 Unix process pid: 26337, image: oracle@xff01 (TNS V1-V3) *** ACTION NAME:() 2020-12-22 19:03:41.272 *** MODULE NAME:(sp_ocap@xff01 (TNS V1-V3)) 2020-12-22 19:03:41.272 *** SERVICE NAME:() 2020-12-22 19:03:41.272 *** SESSION ID:(143.1) 2020-12-22 19:03:41.272 *** 2020-12-22 19:03:41.272 ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [kffmVerify_4], [0], [0], [1657], [1005987045], [152], [1], [0] Current SQL statement for this session: DECLARE fileType varchar2(16); fileName varchar2(1024); blkSz number; fileSz number; hdl number; plksz number; BEGIN fileName := '+DATA4/xifenfei/onlinelog/group_6.1657.1005987045'; BEGIN dbms_diskgroup.getfileattr(fileName,fileType,fileSz, blkSz); dbms_diskgroup.open(fileName,'r',fileType,blkSz,hdl,plkSz,fileSz); EXCEPTION WHEN OTHERS then :rc := SQLCODE; :err_msg := SQLERRM; return; END; :handle := hdl; :bsz := blkSz; :bcnt := fileSz; :rc := 0; END; ----- PL/SQL Call Stack ----- object line object handle number name 0x15ce59360 96 package body SYS.X$DBMS_DISKGROUP 0x15cd88568 12 anonymous block ----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedst()+31 call ksedst1() 000000000 ? 000000001 ? 7FFFBDFC3450 ? 7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? ksedmp()+610 call ksedst() 000000000 ? 000000001 ? 7FFFBDFC3450 ? 7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? ksfdmp()+21 call ksedmp() 000000003 ? 000000001 ? 7FFFBDFC3450 ? 
7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? kgerinv()+161 call ksfdmp() 000000003 ? 000000001 ? 7FFFBDFC3450 ? 7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? kgeasnmierr()+163 call kgerinv() 0068996E0 ? 009AA2670 ? 7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? 000000000 ? kffmVerify()+379 call kgeasnmierr() 0068996E0 ? 009AA2670 ? 7FFFBDFC34B0 ? 7FFFBDFC33F0 ? 000000000 ? 000000000 ? kfioIdentify()+1276 call kffmVerify() 000000000 ? 00000000D ? 000000001 ? 927B814400000004 ? 3BF624E500000679 ? 000000000 ? ksfd_osmopn()+1138 call kfioIdentify() 7FFFBDFC4820 ? 15DB873F4 ? 15DB87556 ? 000000200 ? 7FFF00000003 ? 15DB873C8 ? ksfdopn()+1014 call ksfd_osmopn() 7FFFBDFC4820 ? 00000002D ? 000000200 ? 000000003 ? 2B3800020000 ? 15F3031F0 ? kfpkgDGOpenFile()+2 call ksfdopn() 7FFFBDFC4820 ? 00000002D ? 301 000000200 ? 000000003 ? 000020000 ? 15F3031F0 ? pevm_icd_call_commo call kfpkgDGOpenFile() 2B383F459FA8 ? 00000002D ? n()+1003 2B383F439070 ? 000000003 ? 000020000 ? 15F3031F0 ? pfrinstr_ICAL()+228 call pevm_icd_call_commo 7FFFBDFC5700 ? 000000000 ? n() 000000001 ? 000000001 ? 000000007 ? 7FFF00000000 ? pfrrun_no_tool()+65 call pfrinstr_ICAL() 2B383F459FA8 ? 005DBD8AA ? 2B383F45A010 ? 000000001 ? 000000007 ? 7FFF00000000 ? pfrrun()+906 call pfrrun_no_tool() 2B383F459FA8 ? 005DBD8AA ? 2B383F45A010 ? 000000001 ? 000000007 ? 7FFF00000000 ? plsql_run()+841 call pfrrun() 2B383F459FA8 ? 000000000 ? 2B383F45A010 ? 7FFFBDFC5700 ? 000000007 ? 15CD77BD6 ? peicnt()+298 call plsql_run() 2B383F459FA8 ? 000000001 ? 000000000 ? 7FFFBDFC5700 ? 000000007 ? 900000000 ? kkxexe()+503 call peicnt() 7FFFBDFC5700 ? 2B383F459FA8 ? 2B383F438830 ? 7FFFBDFC5700 ? 2B383F4367D8 ? 900000000 ? opiexe()+4691 call kkxexe() 2B383F4561D8 ? 2B383F459FA8 ? 2B383F438830 ? 15C160BD8 ? 0040D677F ? 900000000 ? kpoal8()+2273 call opiexe() 000000049 ? 000000003 ? 7FFFBDFC6950 ? 000000001 ? 0040D677F ? 900000000 ? opiodr()+984 call kpoal8() 00000005E ? 000000017 ? 7FFFBDFC9830 ? 000000001 ? 000000001 ? 900000000 ? 
ttcpip()+1012 call opiodr() 00000005E ? 000000017 ? 7FFFBDFC9830 ? 000000000 ? 0059C35D0 ? 900000000 ? opitsk()+1322 call ttcpip() 0068A13B0 ? 7FFFBDFC75A0 ? 7FFFBDFC9830 ? 000000000 ? 7FFFBDFC9328 ? 7FFFBDFC9998 ? opiino()+1026 call opitsk() 000000003 ? 000000000 ? 7FFFBDFC9830 ? 000000001 ? 000000000 ? 4E6111C00000001 ? opiodr()+984 call opiino() 00000003C ? 000000004 ? 7FFFBDFCA9F8 ? 000000001 ? 000000000 ? 4E6111C00000001 ? opidrv()+547 call opiodr() 00000003C ? 000000004 ? 7FFFBDFCA9F8 ? 000000000 ? 0059C3080 ? 4E6111C00000001 ? sou2o()+114 call opidrv() 00000003C ? 000000004 ? 7FFFBDFCA9F8 ? 000000000 ? 0059C3080 ? 4E6111C00000001 ? opimai_real()+163 call sou2o() 7FFFBDFCA9D0 ? 00000003C ? 000000004 ? 7FFFBDFCA9F8 ? 0059C3080 ? 4E6111C00000001 ? main()+116 call opimai_real() 000000002 ? 7FFFBDFCAA60 ? 000000004 ? 7FFFBDFCA9F8 ? 0059C3080 ? 4E6111C00000001 ? __libc_start_main() call main() 000000002 ? 7FFFBDFCAA60 ? +244 000000004 ? 7FFFBDFCA9F8 ? 0059C3080 ? 4E6111C00000001 ? _start()+41 call __libc_start_main() 0007230B8 ? 000000002 ? 7FFFBDFCABB8 ? 000000000 ? 0059C3080 ? 000000002 ? --------------------- Binary Stack Dump ---------------------
According to the MOS note ORA-600[KFFMVERIFY_4] OR ORA-600 [kffmLoad_1], [131635] REPORTED ON THE ASMINSTANCE (Doc ID 794103.1), the following bugs can be triggered when multiple processes/threads use dbms_diskgroup to access different disk groups:
BUG:6377738 – ASMB ORA-00600 [KFFMVERIFY_4]
BUG:8328467 – ASM CRASHED WITH ORA-600[KFFMVERIFY_4] OR [KFFMVERIFY_4] AND [KFFMLOAD_1]
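These bugs crash the ASM instance, which in turn brings down the database. In this customer's case we confirmed that they run several SharePlex processes to replicate data and that the redo logs are spread across multiple disk groups, which is what triggered the problem. The temporary workaround is to place all redo and archive logs in a single disk group, so that multiple SharePlex processes calling dbms_diskgroup to read redo/arch no longer trigger the bug.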
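As a rough illustration of the workaround check, the sketch below groups ASM file names by disk group and reports whether they span more than one. It is not part of the original case: the function names are made up for this example, the list of paths is assumed to have been collected by hand (for instance from v$logfile.member and v$archived_log.name), and only the first sample path comes from the trace above; the second is hypothetical.

```python
import re
from collections import defaultdict

# An ASM file name starts with '+<DISKGROUP>/'; capture the disk group part.
_ASM_PATH = re.compile(r"^\+([A-Za-z0-9_$#]+)/")


def group_by_diskgroup(paths):
    """Map each ASM disk group name (upper-cased) to the files stored in it."""
    groups = defaultdict(list)
    for raw in paths:
        path = raw.strip()
        match = _ASM_PATH.match(path)
        if match is None:
            raise ValueError(f"not an ASM path: {path!r}")
        groups[match.group(1).upper()].append(path)
    return dict(groups)


def spans_multiple_diskgroups(paths):
    """True when the given redo/archive files live in more than one disk group."""
    return len(group_by_diskgroup(paths)) > 1


if __name__ == "__main__":
    members = [
        "+DATA4/xifenfei/onlinelog/group_6.1657.1005987045",  # from the trace above
        "+DATA3/xifenfei/onlinelog/group_5.300.1005987001",   # hypothetical second member
    ]
    for diskgroup, files in group_by_diskgroup(members).items():
        print(diskgroup, len(files))
    print("multiple disk groups:", spans_multiple_diskgroups(members))
```

If the check reports more than one disk group, the files that SharePlex reads through dbms_diskgroup are candidates for consolidation as described above.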
ORA-00600 kfrHtAdd01
After the storage lost power, the disk group could not be mounted and ORA-15096: lost disk write detected was reported.
Sun Dec 20 16:56:51 2020 SQL> alter diskgroup data mount NOTE: cache registered group DATA number=1 incarn=0x0c1a7a4e NOTE: cache began mount (first) of group DATA number=1 incarn=0x0c1a7a4e NOTE: Assigning number (1,2) to disk (/dev/mapper/multipath12) NOTE: Assigning number (1,5) to disk (/dev/mapper/multipath15) NOTE: Assigning number (1,3) to disk (/dev/mapper/multipath13) NOTE: Assigning number (1,7) to disk (/dev/mapper/multipath17) NOTE: Assigning number (1,1) to disk (/dev/mapper/multipath11) NOTE: Assigning number (1,6) to disk (/dev/mapper/multipath16) NOTE: Assigning number (1,0) to disk (/dev/mapper/multipath10) NOTE: Assigning number (1,4) to disk (/dev/mapper/multipath14) Sun Dec 20 16:56:57 2020 NOTE: GMON heartbeating for grp 1 GMON querying group 1 at 19 for pid 32, osid 130347 NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/mapper/multipath10 NOTE: F1X0 found on disk 0 au 2 fcn 0.14159360 NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/mapper/multipath11 NOTE: F1X0 found on disk 1 au 2 fcn 0.14159360 NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/mapper/multipath12 NOTE: F1X0 found on disk 2 au 2 fcn 0.14159360 NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/mapper/multipath13 NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/mapper/multipath14 NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/mapper/multipath15 NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/mapper/multipath16 NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/mapper/multipath17 NOTE: cache mounting (first) normal redundancy group 1/0x0C1A7A4E (DATA) Sun Dec 20 16:56:57 2020 * allocate domain 1, invalid = TRUE Sun Dec 20 16:56:58 2020 NOTE: attached to recovery domain 1 NOTE: starting recovery of thread=1 ckpt=233.4189 group=1 (DATA) NOTE: starting recovery of thread=2 ckpt=542.6409 group=1 (DATA) lost disk write detected during recovery (apply) NOTE: recovery (pass 2) of diskgroup 1 (DATA) caught error ORA-15096 
Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_130347.trc: ORA-15096: lost disk write detected Abort recovery for domain 1 NOTE: crash recovery signalled OER-15096 ERROR: ORA-15096 signalled during mount of diskgroup DATA NOTE: cache dismounting (clean) group 1/0x0C1A7A4E (DATA) NOTE: messaging CKPT to quiesce pins Unix process pid: 130347, image: oracle@db1.rac.com (TNS V1-V3) NOTE: lgwr not being msg'd to dismount
After a series of repair steps, the mount failed with the following error:
Sun Dec 20 20:12:35 2020 NOTE: GMON heartbeating for grp 1 GMON querying group 1 at 23 for pid 26, osid 67538 Sun Dec 20 20:12:35 2020 NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/mapper/multipath10 NOTE: F1X0 found on disk 0 au 2 fcn 0.14159360 NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/mapper/multipath11 NOTE: F1X0 found on disk 1 au 2 fcn 0.14159360 NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/mapper/multipath12 NOTE: F1X0 found on disk 2 au 2 fcn 0.14159360 NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/mapper/multipath13 NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/mapper/multipath14 NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/mapper/multipath15 NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/mapper/multipath16 NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/mapper/multipath17 NOTE: cache mounting (first) normal redundancy group 1/0x64848829 (DATA) Sun Dec 20 20:12:36 2020 * allocate domain 1, invalid = TRUE Sun Dec 20 20:12:36 2020 NOTE: attached to recovery domain 1 NOTE: Fallback recovery: thread 2 read 10751 blocks oldest redo found in ABA 540.6429 NOTE: Fallback recovery: thread 1 read 10751 blocks oldest redo found in ABA 232.4218 Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_67538.trc (incident=1692689): ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], [] Incident details in: /grid/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1692689/+ASM1_ora_67538_i1692689.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. 
Sun Dec 20 20:12:39 2020 Sweep [inc][1692689]: completed Sweep [inc2][1692689]: completed Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_67538.trc: ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], [] NOTE: crash recovery signalled OER-600 ERROR: ORA-600 signalled during mount of diskgroup DATA NOTE: cache dismounting (clean) group 1/0x64848829 (DATA) NOTE: messaging CKPT to quiesce pins Unix process pid: 67538, image: oracle@db1.rac.com (TNS V1-V3) NOTE: lgwr not being msg'd to dismount freeing rdom 1 NOTE: detached from domain 1 NOTE: cache dismounted group 1/0x64848829 (DATA) NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x64848829 NOTE: cache deleting context for group DATA 1/0x64848829 GMON dismounting group 1 at 24 for pid 26, osid 67538 NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0004 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0005 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0006 in mode 0x7f marked for de-assignment NOTE: Disk DATA_0007 in mode 0x7f marked for de-assignment ERROR: diskgroup DATA was not mounted ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], [] ERROR: alter diskgroup data mount
Analyzing the trace file:
*** 2020-12-20 20:11:54.956 kfdp_query(DATA): 19 ----- Abridged Call Stack Trace ----- ksedsts()+465<-kfdp_query()+530<-kfdPstSyncPriv()+585<-kfgFinalizeMount()+1630<-kfgscFinalize()+1433< -kfgForEachKfgsc()+285<-kfgsoFinalize()+135<-kfgFinalize()+398<-kfxdrvMount()+5558<-kfxdrvEntry() +2207<-opiexe()+20624<-opiosq0()+3932<-kpooprx()+274<-kpoal8()+842<-opiodr()+917<-ttcpip() +2183<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o() +103<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253 ----- End of Abridged Call Stack Trace ----- 2020-12-20 20:11:55.393106 : Start recovery for domain=1, valid=0, flags=0x4 NOTE: starting recovery of thread=1 ckpt=233.4189 group=1 (DATA) NOTE: starting recovery of thread=2 ckpt=542.6409 group=1 (DATA) lost disk write detected during recovery (apply): last written kfcn: 0.38747593 aba=233.4208 thd=1 kfcn_kfrbcd=0.38747593 flags_kfrbcd=0x001c aba=542.6410 thd=2 CE: (0x0x66edc798) group=1 (DATA) fn=4 blk=1 hashFlags=0x0000 lid=0x0002 lruFlags=0x0000 bastCount=1 mirror=0 flags_kfcpba=0x38 copies=3 blockIndex=1 AUindex=0 AUcount=0 loctr fcn=0.0 copy #0: disk=6 au=35 flags=01 copy #1: disk=0 au=34 flags=01 copy #2: disk=4 au=52 flags=01 BH: (0x0x66e10d00) bnum=33 type=COD_RBO state=rcv chgSt=not modifying pageIn=rcvRead flags=0x00000000 pinmode=excl lockmode=null bf=0x66020000 kfbh_kfcbh.fcn_kfbh = 0.38747538 lowAba=0.0 highAba=0.0 modTime=0 last kfcbInitSlot return code=null chgCount=0 cpkt lnk is null ralFlags=0x00000000 PINS: (kfcbps) pin=91 get by kfr.c line 7879 mode=excl fn=4 blk=1 status=pinned flags=0x88000000 flags2=0x00000000 class=0 type=INVALID stateWanted=rcvRead bastCount=1 waitStatus=0x00000000 relocCount=0 scanBastCount=0 scanBxid=0 scanSkipCode=0 last released by kfc.c 21183 NOTE: recovery (pass 2) of diskgroup 1 (DATA) caught error ORA-15096 last new 0.0 kfrPass2: dump of current log buffer for error 15096 follows ======================= OSM metadata block dump: kfbh.endian: 1 ; 
0x000: 0x01 kfbh.hard: 130 ; 0x001: 0x82 kfbh.type: 8 ; 0x002: KFBTYP_CHNGDIR kfbh.datfmt: 1 ; 0x003: 0x01 kfbh.block.blk: 17162 ; 0x004: blk=17162 kfbh.block.obj: 3 ; 0x008: file=3 kfbh.check: 4226524538 ; 0x00c: 0xfbeba57a kfbh.fcn.base: 38747431 ; 0x010: 0x024f3d27 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 kfracdb.aba.seq: 542 ; 0x000: 0x0000021e kfracdb.aba.blk: 6409 ; 0x004: 0x00001909 kfracdb.ents: 1 ; 0x008: 0x0001 kfracdb.ub2spare: 0 ; 0x00a: 0x0000 kfracdb.lge[0].valid: 1 ; 0x00c: V=1 B=0 M=0 kfracdb.lge[0].chgCount: 1 ; 0x00d: 0x01 kfracdb.lge[0].len: 68 ; 0x00e: 0x0044 kfracdb.lge[0].kfcn.base: 38747432 ; 0x010: 0x024f3d28 kfracdb.lge[0].kfcn.wrap: 0 ; 0x014: 0x00000000 kfracdb.lge[0].bcd[0].kfbl.blk: 1292 ; 0x018: blk=1292 kfracdb.lge[0].bcd[0].kfbl.obj: 1 ; 0x01c: file=1 kfracdb.lge[0].bcd[0].kfcn.base:38743102 ; 0x020: 0x024f2c3e kfracdb.lge[0].bcd[0].kfcn.wrap: 0 ; 0x024: 0x00000000 kfracdb.lge[0].bcd[0].oplen: 8 ; 0x028: 0x0008 kfracdb.lge[0].bcd[0].blkIndex: 12 ; 0x02a: 0x000c kfracdb.lge[0].bcd[0].flags: 28 ; 0x02c: F=0 N=0 F=1 L=1 V=1 A=0 C=0 kfracdb.lge[0].bcd[0].opcode: 135 ; 0x02e: 0x0087 kfracdb.lge[0].bcd[0].kfbtyp: 4 ; 0x030: KFBTYP_FILEDIR kfracdb.lge[0].bcd[0].redund: 19 ; 0x031: SCHE=0x1 NUMB=0x3 kfracdb.lge[0].bcd[0].pad: 63903 ; 0x032: 0xf99f kfracdb.lge[0].bcd[0].KFFFD_COMMIT.modts.hi:33108586 ; 0x034: HOUR=0xa DAYS=0x13 MNTH=0xc YEAR=0x7e4 kfracdb.lge[0].bcd[0].KFFFD_COMMIT.modts.lo:0 ; 0x038: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0 kfracdb.lge[0].bcd[0].au[0]: 292415 ; 0x03c: 0x0004763f kfracdb.lge[0].bcd[0].au[1]: 292452 ; 0x040: 0x00047664 kfracdb.lge[0].bcd[0].au[2]: 292474 ; 0x044: 0x0004767a kfracdb.lge[0].bcd[0].disks[0]: 2 ; 0x048: 0x0002 kfracdb.lge[0].bcd[0].disks[1]: 1 ; 0x04a: 0x0001 kfracdb.lge[0].bcd[0].disks[2]: 0 ; 0x04c: 0x0000
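The fix was to bypass ASM instance recovery completely, mount the disk group, and then try to start the database and carry out database recovery. If you run into this kind of ASM mount failure and cannot resolve it yourself, please contact us.
Phone/WeChat: 17813235971 QQ: 107644445 E-Mail: dba@xifenfei.com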
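Handling ASM disks with names like _DROPPED_0001_DATA
We found that a customer's ASM disk groups contained disks that had gone offline (log analysis confirmed that they had been offline since 2016 and that no rebalance was running any longer).
Further checks: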
SQL> /

NAME                 PATH         GROUP_NUMBER DISK_NUMBER MOUNT_STATUS HEADER_STATUS MODE_STATUS STATE    FAILGROUP
-------------------- ------------ ------------ ----------- ------------ ------------- ----------- -------- ---------
                     ORCL:DATA2              0           0 CLOSED       MEMBER        ONLINE      NORMAL
                     ORCL:FLASH1             0           1 CLOSED       MEMBER        ONLINE      NORMAL
                     ORCL:GRID3              0           2 CLOSED       MEMBER        ONLINE      NORMAL
_DROPPED_0000_FLASH                          2           0 MISSING      UNKNOWN       OFFLINE     FORCING  FLASH1
_DROPPED_0001_DATA                           1           1 MISSING      UNKNOWN       OFFLINE     FORCING  DATA2
DATA1                ORCL:DATA1              1           0 CACHED       MEMBER        ONLINE      NORMAL   DATA1
FLASH2               ORCL:FLASH2             2           1 CACHED       MEMBER        ONLINE      NORMAL   FLASH2
GRID1                ORCL:GRID1              3           0 CACHED       MEMBER        ONLINE      NORMAL   GRID1
GRID2                ORCL:GRID2              3           1 CACHED       MEMBER        ONLINE      NORMAL   GRID2
GRID4                ORCL:GRID4              3           3 CACHED       MEMBER        ONLINE      NORMAL   GRID4
GRID5                ORCL:GRID5              3           4 CACHED       MEMBER        ONLINE      NORMAL   GRID5
GRID6                ORCL:GRID6              3           5 CACHED       MEMBER        ONLINE      NORMAL   GRID6

12 rows selected.

SQL> select NAME,STATE,TYPE,OFFLINE_DISKS from v$asm_diskgroup;

NAME   STATE    TYPE    OFFLINE_DISKS
------ -------- ------- -------------
DATA   MOUNTED  NORMAL              1
FLASH  MOUNTED  NORMAL              1
GRID   MOUNTED  NORMAL              0
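The main problem is that the disks ORCL:FLASH1 and ORCL:DATA2 went offline, which left them in the disk groups as _DROPPED_0000_FLASH and _DROPPED_0001_DATA. A check at the storage level confirmed that both disks are currently healthy, so the offline disks were forcibly added back to their disk groups with the force option: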
SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force;
Diskgroup altered.
SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force;
Diskgroup altered.
Watch the ASM alert log and wait for the rebalance to complete:
Sat Dec 05 16:48:10 2020 SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force NOTE: GroupBlock outside rolling migration privileged region NOTE: Assigning number (2,2) to disk (ORCL:FLASH1) NOTE: requesting all-instance membership refresh for group=2 NOTE: initializing header on grp 2 disk FLASH1 NOTE: requesting all-instance disk validation for group=2 Sat Dec 05 16:48:13 2020 NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance. NOTE: requesting all-instance disk validation for group=2 NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance. Sat Dec 05 16:48:19 2020 GMON updating for reconfiguration, group 2 at 14 for pid 34, osid 12203 NOTE: group 2 PST updated. NOTE: initiating PST update: grp = 2 GMON updating group 2 at 15 for pid 34, osid 12203 NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0) NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1) NOTE: PST update grp = 2 completed successfully NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH) GMON querying group 2 at 16 for pid 18, osid 41180 NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH NOTE: cache opening disk 2 of grp 2: FLASH1 label:FLASH1 NOTE: Attempting voting file refresh on diskgroup FLASH NOTE: Refresh completed on diskgroup FLASH. No voting file found. GMON querying group 2 at 17 for pid 18, osid 41180 NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH Sat Dec 05 16:48:25 2020 SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH) Sat Dec 05 16:48:25 2020 SUCCESS: alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force NOTE: starting rebalance of group 2/0x58e713e7 (FLASH) at power 1 Starting background process ARB0 Sat Dec 05 16:48:26 2020 ARB0 started with pid=36, OS id=12451 NOTE: assigning ARB0 to group 2/0x58e713e7 (FLASH) with 1 parallel I/O cellip.ora not found. 
NOTE: F1X0 copy 2 relocating from 0:2 to 2:2 for diskgroup 2 (FLASH) NOTE: Attempting voting file refresh on diskgroup FLASH NOTE: Refresh completed on diskgroup FLASH. No voting file found. Sat Dec 05 16:48:45 2020 NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group FLASH Sat Dec 05 16:49:06 2020 NOTE: stopping process ARB0 SUCCESS: rebalance completed for group 2/0x58e713e7 (FLASH) Sat Dec 05 16:49:08 2020 NOTE: GroupBlock outside rolling migration privileged region NOTE: requesting all-instance membership refresh for group=2 Sat Dec 05 16:49:11 2020 GMON updating for reconfiguration, group 2 at 18 for pid 36, osid 12681 NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0) NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1) NOTE: group 2 PST updated. SUCCESS: grp 2 disk _DROPPED_0000_FLASH going offline GMON updating for reconfiguration, group 2 at 19 for pid 36, osid 12681 NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0) NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1) NOTE: group 2 PST updated. 
NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH) GMON querying group 2 at 20 for pid 18, osid 41180 GMON querying group 2 at 21 for pid 18, osid 41180 NOTE: Disk _DROPPED_0000_FLASH in mode 0x0 marked for de-assignment SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH) Sat Dec 05 16:51:56 2020 SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force NOTE: GroupBlock outside rolling migration privileged region NOTE: Assigning number (1,2) to disk (ORCL:DATA2) NOTE: requesting all-instance membership refresh for group=1 NOTE: initializing header on grp 1 disk DATA2 NOTE: requesting all-instance disk validation for group=1 Sat Dec 05 16:51:57 2020 NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance. NOTE: requesting all-instance disk validation for group=1 NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance. Sat Dec 05 16:52:02 2020 GMON updating for reconfiguration, group 1 at 22 for pid 34, osid 12203 NOTE: group 1 PST updated. NOTE: initiating PST update: grp = 1 GMON updating group 1 at 23 for pid 34, osid 12203 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA NOTE: group DATA: updated PST location: disk 0000 (PST copy 0) NOTE: group DATA: updated PST location: disk 0002 (PST copy 1) NOTE: PST update grp = 1 completed successfully NOTE: membership refresh pending for group 1/0x58d713e6 (DATA) GMON querying group 1 at 24 for pid 18, osid 41180 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA NOTE: cache opening disk 2 of grp 1: DATA2 label:DATA2 Sat Dec 05 16:52:08 2020 NOTE: Attempting voting file refresh on diskgroup DATA NOTE: Refresh completed on diskgroup DATA. No voting file found. 
GMON querying group 1 at 25 for pid 18, osid 41180 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA SUCCESS: refreshed membership for 1/0x58d713e6 (DATA) Sat Dec 05 16:52:08 2020 SUCCESS: alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 1 Starting background process ARB0 Sat Dec 05 16:52:08 2020 ARB0 started with pid=37, OS id=13463 NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 1 parallel I/O NOTE: Attempting voting file refresh on diskgroup DATA NOTE: Refresh completed on diskgroup DATA. No voting file found. Sat Dec 05 16:52:44 2020 cellip.ora not found. NOTE: F1X0 copy 2 relocating from 1:2 to 2:2 for diskgroup 1 (DATA) Sat Dec 05 16:53:22 2020 NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA NOTE: membership refresh pending for group 1/0x58d713e6 (DATA) GMON querying group 1 at 27 for pid 18, osid 41180 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA SUCCESS: refreshed membership for 1/0x58d713e6 (DATA) SUCCESS: alter diskgroup data rebalance power 11 NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 11 Starting background process ARB0 Sat Dec 05 17:27:52 2020 ARB0 started with pid=35, OS id=23318 NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 11 parallel I/Os NOTE: Attempting voting file refresh on diskgroup DATA NOTE: Refresh completed on diskgroup DATA. No voting file found. Sat Dec 05 17:28:29 2020 cellip.ora not found. 
Sat Dec 05 17:28:45 2020 NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA Sat Dec 05 18:48:10 2020 NOTE: GroupBlock outside rolling migration privileged region NOTE: requesting all-instance membership refresh for group=1 Sat Dec 05 18:48:32 2020 GMON updating for reconfiguration, group 1 at 28 for pid 36, osid 47454 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA NOTE: group DATA: updated PST location: disk 0000 (PST copy 0) NOTE: group DATA: updated PST location: disk 0002 (PST copy 1) Sat Dec 05 18:48:32 2020 NOTE: group 1 PST updated. SUCCESS: grp 1 disk _DROPPED_0001_DATA going offline GMON updating for reconfiguration, group 1 at 29 for pid 36, osid 47454 NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA NOTE: group DATA: updated PST location: disk 0000 (PST copy 0) NOTE: group DATA: updated PST location: disk 0002 (PST copy 1) NOTE: group 1 PST updated. Sat Dec 05 18:48:32 2020 NOTE: membership refresh pending for group 1/0x58d713e6 (DATA) GMON querying group 1 at 30 for pid 18, osid 41180 GMON querying group 1 at 31 for pid 18, osid 41180 NOTE: Disk _DROPPED_0001_DATA in mode 0x0 marked for de-assignment SUCCESS: refreshed membership for 1/0x58d713e6 (DATA) NOTE: Attempting voting file refresh on diskgroup DATA NOTE: Refresh completed on diskgroup DATA. No voting file found. Sat Dec 05 18:52:24 2020 NOTE: stopping process ARB0 SUCCESS: rebalance completed for group 1/0x58d713e6 (DATA)
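Summary: when a disk drops out of a normal-redundancy disk group for some reason, v$asm_disk.name shows a name like _DROPPED_0001_DATA and v$asm_disk.state is FORCING. The offline disk can be forced back into the disk group with a statement such as alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force; once the rebalance completes, the problem is fixed.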
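To spot this condition in a saved v$asm_disk listing, a small sketch like the one below can help. It assumes, based only on the listing in this post, that such names have the form _DROPPED_<disk number>_<disk group>; the function names and the dict-based row format are hypothetical choices for this example.

```python
import re

# Name pattern inferred from the listing in this post, e.g. _DROPPED_0001_DATA.
_DROPPED = re.compile(r"^_DROPPED_(\d{4})_(.+)$")


def parse_dropped_name(name):
    """Split a name such as _DROPPED_0001_DATA into (disk_number, diskgroup)."""
    match = _DROPPED.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def find_dropped_disks(rows):
    """Return names of rows that look dropped and are in state FORCING.

    rows: iterable of dicts with 'NAME' and 'STATE' keys, copied by hand
    from v$asm_disk output (NAME may be None for disks outside any group).
    """
    dropped = []
    for row in rows:
        name = row.get("NAME") or ""
        if parse_dropped_name(name) is not None and row.get("STATE") == "FORCING":
            dropped.append(name)
    return dropped


if __name__ == "__main__":
    rows = [
        {"NAME": None, "STATE": "NORMAL"},
        {"NAME": "_DROPPED_0000_FLASH", "STATE": "FORCING"},
        {"NAME": "_DROPPED_0001_DATA", "STATE": "FORCING"},
        {"NAME": "DATA1", "STATE": "NORMAL"},
    ]
    for name in find_dropped_disks(rows):
        print(name, parse_dropped_name(name))
```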