Category archive: Oracle ASM
ORA-00600 kfrHtAdd01
After a storage power failure, the disk group could not be mounted and reported ORA-15096: lost disk write detected.
Sun Dec 20 16:56:51 2020
SQL> alter diskgroup data mount
NOTE: cache registered group DATA number=1 incarn=0x0c1a7a4e
NOTE: cache began mount (first) of group DATA number=1 incarn=0x0c1a7a4e
NOTE: Assigning number (1,2) to disk (/dev/mapper/multipath12)
NOTE: Assigning number (1,5) to disk (/dev/mapper/multipath15)
NOTE: Assigning number (1,3) to disk (/dev/mapper/multipath13)
NOTE: Assigning number (1,7) to disk (/dev/mapper/multipath17)
NOTE: Assigning number (1,1) to disk (/dev/mapper/multipath11)
NOTE: Assigning number (1,6) to disk (/dev/mapper/multipath16)
NOTE: Assigning number (1,0) to disk (/dev/mapper/multipath10)
NOTE: Assigning number (1,4) to disk (/dev/mapper/multipath14)
Sun Dec 20 16:56:57 2020
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 19 for pid 32, osid 130347
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/mapper/multipath10
NOTE: F1X0 found on disk 0 au 2 fcn 0.14159360
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/mapper/multipath11
NOTE: F1X0 found on disk 1 au 2 fcn 0.14159360
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/mapper/multipath12
NOTE: F1X0 found on disk 2 au 2 fcn 0.14159360
NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/mapper/multipath13
NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/mapper/multipath14
NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/mapper/multipath15
NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/mapper/multipath16
NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/mapper/multipath17
NOTE: cache mounting (first) normal redundancy group 1/0x0C1A7A4E (DATA)
Sun Dec 20 16:56:57 2020
* allocate domain 1, invalid = TRUE
Sun Dec 20 16:56:58 2020
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=233.4189 group=1 (DATA)
NOTE: starting recovery of thread=2 ckpt=542.6409 group=1 (DATA)
lost disk write detected during recovery (apply)
NOTE: recovery (pass 2) of diskgroup 1 (DATA) caught error ORA-15096
Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_130347.trc:
ORA-15096: lost disk write detected
Abort recovery for domain 1
NOTE: crash recovery signalled OER-15096
ERROR: ORA-15096 signalled during mount of diskgroup DATA
NOTE: cache dismounting (clean) group 1/0x0C1A7A4E (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 130347, image: oracle@db1.rac.com (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
After a series of repair steps, the mount failed with the following error:
Sun Dec 20 20:12:35 2020
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 23 for pid 26, osid 67538
Sun Dec 20 20:12:35 2020
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/mapper/multipath10
NOTE: F1X0 found on disk 0 au 2 fcn 0.14159360
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/mapper/multipath11
NOTE: F1X0 found on disk 1 au 2 fcn 0.14159360
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/mapper/multipath12
NOTE: F1X0 found on disk 2 au 2 fcn 0.14159360
NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/mapper/multipath13
NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/mapper/multipath14
NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/mapper/multipath15
NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/mapper/multipath16
NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/mapper/multipath17
NOTE: cache mounting (first) normal redundancy group 1/0x64848829 (DATA)
Sun Dec 20 20:12:36 2020
* allocate domain 1, invalid = TRUE
Sun Dec 20 20:12:36 2020
NOTE: attached to recovery domain 1
NOTE: Fallback recovery: thread 2 read 10751 blocks oldest redo found in ABA 540.6429
NOTE: Fallback recovery: thread 1 read 10751 blocks oldest redo found in ABA 232.4218
Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_67538.trc (incident=1692689):
ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], []
Incident details in: /grid/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1692689/+ASM1_ora_67538_i1692689.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Dec 20 20:12:39 2020
Sweep [inc][1692689]: completed
Sweep [inc2][1692689]: completed
Errors in file /grid/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_67538.trc:
ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], []
NOTE: crash recovery signalled OER-600
ERROR: ORA-600 signalled during mount of diskgroup DATA
NOTE: cache dismounting (clean) group 1/0x64848829 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 67538, image: oracle@db1.rac.com (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
freeing rdom 1
NOTE: detached from domain 1
NOTE: cache dismounted group 1/0x64848829 (DATA)
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x64848829
NOTE: cache deleting context for group DATA 1/0x64848829
GMON dismounting group 1 at 24 for pid 26, osid 67538
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0005 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0006 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0007 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-00600: internal error code, arguments: [kfrHtAdd01], [2147483651], [1025], [0], [38660545], [0], [38687990], [1], [2], [6429], [], []
ERROR: alter diskgroup data mount
Analysing the trace file:
*** 2020-12-20 20:11:54.956
kfdp_query(DATA): 19
----- Abridged Call Stack Trace -----
ksedsts()+465<-kfdp_query()+530<-kfdPstSyncPriv()+585<-kfgFinalizeMount()+1630<-kfgscFinalize()+1433
<-kfgForEachKfgsc()+285<-kfgsoFinalize()+135<-kfgFinalize()+398<-kfxdrvMount()+5558<-kfxdrvEntry()+2207
<-opiexe()+20624<-opiosq0()+3932<-kpooprx()+274<-kpoal8()+842<-opiodr()+917<-ttcpip()+2183
<-opitsk()+1710<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103
<-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
2020-12-20 20:11:55.393106 : Start recovery for domain=1, valid=0, flags=0x4
NOTE: starting recovery of thread=1 ckpt=233.4189 group=1 (DATA)
NOTE: starting recovery of thread=2 ckpt=542.6409 group=1 (DATA)
lost disk write detected during recovery (apply):
last written kfcn: 0.38747593 aba=233.4208 thd=1
kfcn_kfrbcd=0.38747593 flags_kfrbcd=0x001c aba=542.6410 thd=2
CE: (0x0x66edc798) group=1 (DATA) fn=4 blk=1
hashFlags=0x0000 lid=0x0002 lruFlags=0x0000 bastCount=1
mirror=0
flags_kfcpba=0x38 copies=3 blockIndex=1 AUindex=0 AUcount=0 loctr fcn=0.0
copy #0: disk=6 au=35 flags=01
copy #1: disk=0 au=34 flags=01
copy #2: disk=4 au=52 flags=01
BH: (0x0x66e10d00) bnum=33 type=COD_RBO state=rcv chgSt=not modifying pageIn=rcvRead
flags=0x00000000 pinmode=excl lockmode=null bf=0x66020000
kfbh_kfcbh.fcn_kfbh = 0.38747538 lowAba=0.0 highAba=0.0
modTime=0
last kfcbInitSlot return code=null chgCount=0 cpkt lnk is null ralFlags=0x00000000
PINS: (kfcbps) pin=91 get by kfr.c line 7879 mode=excl
fn=4 blk=1 status=pinned
flags=0x88000000 flags2=0x00000000
class=0 type=INVALID stateWanted=rcvRead
bastCount=1 waitStatus=0x00000000 relocCount=0
scanBastCount=0 scanBxid=0 scanSkipCode=0
last released by kfc.c 21183
NOTE: recovery (pass 2) of diskgroup 1 (DATA) caught error ORA-15096
last new 0.0
kfrPass2: dump of current log buffer for error 15096 follows
=======================
OSM metadata block dump:
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 8 ; 0x002: KFBTYP_CHNGDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 17162 ; 0x004: blk=17162
kfbh.block.obj: 3 ; 0x008: file=3
kfbh.check: 4226524538 ; 0x00c: 0xfbeba57a
kfbh.fcn.base: 38747431 ; 0x010: 0x024f3d27
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfracdb.aba.seq: 542 ; 0x000: 0x0000021e
kfracdb.aba.blk: 6409 ; 0x004: 0x00001909
kfracdb.ents: 1 ; 0x008: 0x0001
kfracdb.ub2spare: 0 ; 0x00a: 0x0000
kfracdb.lge[0].valid: 1 ; 0x00c: V=1 B=0 M=0
kfracdb.lge[0].chgCount: 1 ; 0x00d: 0x01
kfracdb.lge[0].len: 68 ; 0x00e: 0x0044
kfracdb.lge[0].kfcn.base: 38747432 ; 0x010: 0x024f3d28
kfracdb.lge[0].kfcn.wrap: 0 ; 0x014: 0x00000000
kfracdb.lge[0].bcd[0].kfbl.blk: 1292 ; 0x018: blk=1292
kfracdb.lge[0].bcd[0].kfbl.obj: 1 ; 0x01c: file=1
kfracdb.lge[0].bcd[0].kfcn.base:38743102 ; 0x020: 0x024f2c3e
kfracdb.lge[0].bcd[0].kfcn.wrap: 0 ; 0x024: 0x00000000
kfracdb.lge[0].bcd[0].oplen: 8 ; 0x028: 0x0008
kfracdb.lge[0].bcd[0].blkIndex: 12 ; 0x02a: 0x000c
kfracdb.lge[0].bcd[0].flags: 28 ; 0x02c: F=0 N=0 F=1 L=1 V=1 A=0 C=0
kfracdb.lge[0].bcd[0].opcode: 135 ; 0x02e: 0x0087
kfracdb.lge[0].bcd[0].kfbtyp: 4 ; 0x030: KFBTYP_FILEDIR
kfracdb.lge[0].bcd[0].redund: 19 ; 0x031: SCHE=0x1 NUMB=0x3
kfracdb.lge[0].bcd[0].pad: 63903 ; 0x032: 0xf99f
kfracdb.lge[0].bcd[0].KFFFD_COMMIT.modts.hi:33108586 ; 0x034: HOUR=0xa DAYS=0x13 MNTH=0xc YEAR=0x7e4
kfracdb.lge[0].bcd[0].KFFFD_COMMIT.modts.lo:0 ; 0x038: USEC=0x0 MSEC=0x0 SECS=0x0 MINS=0x0
kfracdb.lge[0].bcd[0].au[0]: 292415 ; 0x03c: 0x0004763f
kfracdb.lge[0].bcd[0].au[1]: 292452 ; 0x040: 0x00047664
kfracdb.lge[0].bcd[0].au[2]: 292474 ; 0x044: 0x0004767a
kfracdb.lge[0].bcd[0].disks[0]: 2 ; 0x048: 0x0002
kfracdb.lge[0].bcd[0].disks[1]: 1 ; 0x04a: 0x0001
kfracdb.lge[0].bcd[0].disks[2]: 0 ; 0x04c: 0x0000
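The `modts.hi` value in the dump above packs a date into one integer, and the dump prints the unpacked fields next to it (HOUR=0xa DAYS=0x13 MNTH=0xc YEAR=0x7e4). The sketch below shows one bit layout that reproduces those fields from 33108586. The layout (5 bits hour, 5 bits day, 4 bits month, remaining bits year) is an assumption inferred from this dump, not taken from Oracle documentation, and `decode_modts_hi` is a hypothetical helper name.

```python
def decode_modts_hi(value: int) -> dict:
    """Unpack the packed date word that the dump labels modts.hi.

    Assumed layout (inferred from the dump's own annotations):
    bits 0-4 hour, bits 5-9 day, bits 10-13 month, bits 14+ year.
    """
    return {
        "hour": value & 0x1F,           # low 5 bits
        "day": (value >> 5) & 0x1F,     # next 5 bits
        "month": (value >> 10) & 0x0F,  # next 4 bits
        "year": value >> 14,            # remaining high bits
    }


fields = decode_modts_hi(33108586)
print(fields)  # {'hour': 10, 'day': 19, 'month': 12, 'year': 2020}
```

Under that assumption, the file directory change carried by this redo entry is stamped 2020-12-19, hour 10, which is the day before the mount attempts in the alert log.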
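The fix was to bypass ASM instance recovery completely, mount the disk group, and then try to start the database and carry out database-level recovery. If you run into this kind of ASM mount failure and cannot resolve it yourself, please contact us.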
Phone/WeChat: 17813235971 QQ: 107644445 E-Mail: dba@xifenfei.com
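Handling ASM disks with names like _DROPPED_0001_DATA
A customer database was found to have offline disks in its ASM disk groups (log analysis confirmed that the disks had gone offline back in 2016 and that no rebalance was running any more).
Further checks: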
SQL> /

NAME                           PATH                  GROUP_NUMBER DISK_NUMBER MOUNT_STATUS   HEADER_STATUS
------------------------------ --------------------- ------------ ----------- -------------- ------------------------
MODE_STATUS    STATE            FAILGROUP
-------------- ---------------- --------------------
                               ORCL:DATA2                       0           0 CLOSED         MEMBER
ONLINE         NORMAL

                               ORCL:FLASH1                      0           1 CLOSED         MEMBER
ONLINE         NORMAL

                               ORCL:GRID3                       0           2 CLOSED         MEMBER
ONLINE         NORMAL

_DROPPED_0000_FLASH                                             2           0 MISSING        UNKNOWN
OFFLINE        FORCING          FLASH1

_DROPPED_0001_DATA                                              1           1 MISSING        UNKNOWN
OFFLINE        FORCING          DATA2

DATA1                          ORCL:DATA1                       1           0 CACHED         MEMBER
ONLINE         NORMAL           DATA1

FLASH2                         ORCL:FLASH2                      2           1 CACHED         MEMBER
ONLINE         NORMAL           FLASH2

GRID1                          ORCL:GRID1                       3           0 CACHED         MEMBER
ONLINE         NORMAL           GRID1

GRID2                          ORCL:GRID2                       3           1 CACHED         MEMBER
ONLINE         NORMAL           GRID2

GRID4                          ORCL:GRID4                       3           3 CACHED         MEMBER
ONLINE         NORMAL           GRID4

GRID5                          ORCL:GRID5                       3           4 CACHED         MEMBER
ONLINE         NORMAL           GRID5

GRID6                          ORCL:GRID6                       3           5 CACHED         MEMBER
ONLINE         NORMAL           GRID6

12 rows selected.

SQL> select NAME,STATE,TYPE,OFFLINE_DISKS from v$asm_diskgroup;

NAME
------------------------------------------------------------
STATE                  TYPE         OFFLINE_DISKS
---------------------- ------------ -------------
DATA
MOUNTED                NORMAL                   1

FLASH
MOUNTED                NORMAL                   1

GRID
MOUNTED                NORMAL                   0
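The root of the problem is that the ORCL:FLASH1 and ORCL:DATA2 disks went offline, leaving their slots in the disk groups as _DROPPED_0000_FLASH and _DROPPED_0001_DATA. Checks at the OS and storage level confirmed that both disks are healthy now, so the offline disks were forcibly added back to their disk groups with the force option: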
SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force;

Diskgroup altered.

SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force;

Diskgroup altered.
Watch the ASM alert log and wait for the rebalance to complete:
Sat Dec 05 16:48:10 2020
SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,2) to disk (ORCL:FLASH1)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk FLASH1
NOTE: requesting all-instance disk validation for group=2
Sat Dec 05 16:48:13 2020
NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance.
Sat Dec 05 16:48:19 2020
GMON updating for reconfiguration, group 2 at 14 for pid 34, osid 12203
NOTE: group 2 PST updated.
NOTE: initiating PST update: grp = 2
GMON updating group 2 at 15 for pid 34, osid 12203
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 2 completed successfully
NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH)
GMON querying group 2 at 16 for pid 18, osid 41180
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: cache opening disk 2 of grp 2: FLASH1 label:FLASH1
NOTE: Attempting voting file refresh on diskgroup FLASH
NOTE: Refresh completed on diskgroup FLASH. No voting file found.
GMON querying group 2 at 17 for pid 18, osid 41180
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
Sat Dec 05 16:48:25 2020
SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH)
Sat Dec 05 16:48:25 2020
SUCCESS: alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1' force
NOTE: starting rebalance of group 2/0x58e713e7 (FLASH) at power 1
Starting background process ARB0
Sat Dec 05 16:48:26 2020
ARB0 started with pid=36, OS id=12451
NOTE: assigning ARB0 to group 2/0x58e713e7 (FLASH) with 1 parallel I/O
cellip.ora not found.
NOTE: F1X0 copy 2 relocating from 0:2 to 2:2 for diskgroup 2 (FLASH)
NOTE: Attempting voting file refresh on diskgroup FLASH
NOTE: Refresh completed on diskgroup FLASH. No voting file found.
Sat Dec 05 16:48:45 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group FLASH
Sat Dec 05 16:49:06 2020
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 2/0x58e713e7 (FLASH)
Sat Dec 05 16:49:08 2020
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=2
Sat Dec 05 16:49:11 2020
GMON updating for reconfiguration, group 2 at 18 for pid 36, osid 12681
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: group 2 PST updated.
SUCCESS: grp 2 disk _DROPPED_0000_FLASH going offline
GMON updating for reconfiguration, group 2 at 19 for pid 36, osid 12681
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: group 2 PST updated.
NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH)
GMON querying group 2 at 20 for pid 18, osid 41180
GMON querying group 2 at 21 for pid 18, osid 41180
NOTE: Disk _DROPPED_0000_FLASH in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH)
Sat Dec 05 16:51:56 2020
SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (ORCL:DATA2)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DATA2
NOTE: requesting all-instance disk validation for group=1
Sat Dec 05 16:51:57 2020
NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance.
Sat Dec 05 16:52:02 2020
GMON updating for reconfiguration, group 1 at 22 for pid 34, osid 12203
NOTE: group 1 PST updated.
NOTE: initiating PST update: grp = 1
GMON updating group 1 at 23 for pid 34, osid 12203
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 1 completed successfully
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 24 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: cache opening disk 2 of grp 1: DATA2 label:DATA2
Sat Dec 05 16:52:08 2020
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 1 at 25 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
Sat Dec 05 16:52:08 2020
SUCCESS: alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force
NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 1
Starting background process ARB0
Sat Dec 05 16:52:08 2020
ARB0 started with pid=37, OS id=13463
NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 1 parallel I/O
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 16:52:44 2020
cellip.ora not found.
NOTE: F1X0 copy 2 relocating from 1:2 to 2:2 for diskgroup 1 (DATA)
Sat Dec 05 16:53:22 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 27 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
SUCCESS: alter diskgroup data rebalance power 11
NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 11
Starting background process ARB0
Sat Dec 05 17:27:52 2020
ARB0 started with pid=35, OS id=23318
NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 11 parallel I/Os
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 17:28:29 2020
cellip.ora not found.
Sat Dec 05 17:28:45 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA
Sat Dec 05 18:48:10 2020
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Sat Dec 05 18:48:32 2020
GMON updating for reconfiguration, group 1 at 28 for pid 36, osid 47454
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
Sat Dec 05 18:48:32 2020
NOTE: group 1 PST updated.
SUCCESS: grp 1 disk _DROPPED_0001_DATA going offline
GMON updating for reconfiguration, group 1 at 29 for pid 36, osid 47454
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
NOTE: group 1 PST updated.
Sat Dec 05 18:48:32 2020
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 30 for pid 18, osid 41180
GMON querying group 1 at 31 for pid 18, osid 41180
NOTE: Disk _DROPPED_0001_DATA in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 18:52:24 2020
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x58d713e6 (DATA)
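Spotting the "rebalance completed" line in a long alert log by eye is easy to get wrong, so here is a minimal sketch that tracks the last rebalance event per disk group. It assumes one message per line, in the format shown in the log above; `rebalance_status` and the two pattern names are hypothetical. Querying v$asm_operation on a live ASM instance is the more direct check; this only covers reading the log text.

```python
import re

# Message formats copied from the alert log above; the group name is in parentheses.
START = re.compile(r"NOTE: starting rebalance of group \d+/0x[0-9a-fA-F]+ \((\w+)\)")
DONE = re.compile(r"SUCCESS: rebalance completed for group \d+/0x[0-9a-fA-F]+ \((\w+)\)")


def rebalance_status(lines):
    """Return {diskgroup: 'running' | 'completed'} from the last event seen per group."""
    status = {}
    for line in lines:
        started = START.search(line)
        if started:
            status[started.group(1)] = "running"
            continue
        finished = DONE.search(line)
        if finished:
            status[finished.group(1)] = "completed"
    return status


sample = [
    "NOTE: starting rebalance of group 2/0x58e713e7 (FLASH) at power 1",
    "SUCCESS: rebalance completed for group 2/0x58e713e7 (FLASH)",
    "NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 1",
    "NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 11",
    "SUCCESS: rebalance completed for group 1/0x58d713e6 (DATA)",
]
print(rebalance_status(sample))  # {'FLASH': 'completed', 'DATA': 'completed'}
```

In this case the log also shows that the DATA rebalance was restarted at power 11 after it had begun at power 1, and completed at 18:52:24.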
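Summary: when a disk drops out of a normal-redundancy disk group for whatever reason, v$asm_disk.name shows a name like _DROPPED_0001_DATA and v$asm_disk.state is FORCING. Once the underlying disk is confirmed to be healthy, it can be forced back into the disk group with a command such as alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2' force; and the problem is fixed once the rebalance completes.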
ORA-15096: lost disk write detected
Another recovery request in which a storage power failure left an ASM disk group unable to mount because of ORA-15096: lost disk write detected.
SQL> ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:45277:148} */
NOTE: cache registered group DATA number=2 incarn=0x73886b6a
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73886b6a
NOTE: Assigning number (2,2) to disk (/dev/asm-data3)
NOTE: Assigning number (2,1) to disk (/dev/asm-data2)
NOTE: Assigning number (2,0) to disk (/dev/asm-data1)
Fri Nov 06 19:06:56 2020
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 94 for pid 30, osid 11596
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/asm-data1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/asm-data2
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/asm-data3
NOTE: cache mounting (first) external redundancy group 2/0x73886B6A (DATA)
Fri Nov 06 19:06:57 2020
* allocate domain 2, invalid = TRUE
kjbdomatt send to inst 2
Fri Nov 06 19:06:57 2020
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=25.7986 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=33.364 group=2 (DATA)
NOTE: BWR validation signaled ORA-15096
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_11596.trc:
ORA-15096: lost disk write detected
NOTE: crash recovery signalled OER-15096
ERROR: ORA-15096 signalled during mount of diskgroup DATA
NOTE: cache dismounting (clean) group 2/0x73886B6A (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 11596, image: oracle@db1 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
freeing rdom 2
NOTE: detached from domain 2
NOTE: cache dismounted group 2/0x73886B6A (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x73886b6a
NOTE: cache deleting context for group DATA 2/0x73886b6a
GMON dismounting group 2 at 95 for pid 30, osid 11596
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15096: lost disk write detected
ERROR: ALTER DISKGROUP DATA MOUNT /* asm agent *//* {1:45277:148} */
After analysis and a series of repair steps, the database was mounted, and the recovery attempt then failed with ORA-600 [2130]:
Fri Nov 06 17:03:27 2020
ALTER DATABASE RECOVER database
Media Recovery Start
started logmerger process
Parallel Media Recovery started with 40 slaves
Fri Nov 06 17:03:29 2020
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_pr00_7393.trc (incident=195869):
ORA-00600: internal error code, arguments: [2130], [2], [1], [2], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_195869/ynhis1_pr00_7393_i195869.trc
Fri Nov 06 17:03:30 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-10877 signalled during: ALTER DATABASE RECOVER database ...
The redo was judged to be damaged, so the database was opened with resetlogs, which failed with ORA-00600 [2662]:
Fri Nov 06 18:21:32 2020
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 8670753264
Resetting resetlogs activation ID 306909514 (0x124b114a)
Redo thread 2 enabled by open resetlogs or standby activation
Fri Nov 06 18:21:39 2020
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 8670753267, threshold SCN value is 0
Fri Nov 06 18:21:39 2020
Assigning activation ID 408224320 (0x18550240)
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: /orabak/data/group_1.289.954514319
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 06 18:21:40 2020
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc (incident=231847):
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365], [4194545], [], [], [], [], [],[]
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_231847/ynhis1_ora_24310_i231847.trc
Fri Nov 06 18:21:42 2020
Dumping diagnostic data in directory=[cdmp_20201106182142],requested by(instance=1,osid=24310),summary=[incident=231847]
Fri Nov 06 18:21:43 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Error 704 happened during db open, shutting down database
USER (ospid: 24310): terminating the instance due to error 704
Instance terminated by USER, pid = 24310
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (24310) as a result of ORA-1092
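The ORA-00600 [2662] arguments above can be turned into readable numbers. The sketch below uses the commonly described reading of this error (arguments 2 and 3 are the wrap and base of the current SCN, 4 and 5 the wrap and base of the dependent SCN found in a block, 6 the block's RDBA); that reading, the smallfile RDBA split (10-bit relative file number, 22-bit block number) and the helper name `decode_ora600_2662` are assumptions and are not stated in the log itself.

```python
def decode_ora600_2662(args):
    """Decode [wrap, base, wrap, base, rdba] from an ORA-600 [2662] argument list.

    Assumes SCN = wrap * 2**32 + base and a smallfile RDBA
    (upper 10 bits relative file number, lower 22 bits block number).
    """
    cur_wrap, cur_base, dep_wrap, dep_base, rdba = args
    current_scn = (cur_wrap << 32) | cur_base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        "scn_gap": dependent_scn - current_scn,  # how far the block is ahead
        "file_no": rdba >> 22,
        "block_no": rdba & 0x3FFFFF,
    }


info = decode_ora600_2662([2, 80818679, 2, 93545365, 4194545])
print(info)
# {'current_scn': 8670753271, 'dependent_scn': 8683479957,
#  'scn_gap': 12726686, 'file_no': 1, 'block_no': 241}
```

The decoded current SCN (8670753271) sits just above the "Current SCN value is 8670753267" line in the alert log, which supports this reading: the block at file 1, block 241 carries an SCN about 12.7 million ahead of the database.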
After that error was dealt with and resetlogs was run again, the database opened successfully but reported ORA-00600 [4137]:
Database Characterset is ZHS16GBK
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc (incident=255799):
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_255799/ynhis1_smon_26195_i255799.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
No Resource Manager plan active
Fri Nov 06 18:30:46 2020
replication_dependency_tracking turned off (no async multimaster replication found)
ORACLE Instance ynhis1 (pid = 23) - Error 600 encountered while recovering transaction (25, 33).
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc:
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []
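To see which undo segment needs attention, the second ORA-00600 [4137] argument can be split into its parts. This is a small sketch that pulls the argument list out of an alert-log line and reads `25.33.122556` as undo segment number, slot and sequence. That reading is an assumption, although the log's own "recovering transaction (25, 33)" message agrees with the first two parts; `ora600_args` and `decode_4137_xid` are hypothetical helper names.

```python
import re


def ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 alert-log line."""
    match = re.search(r"ORA-00600: internal error code, arguments: (.*)", line)
    if not match:
        return []
    return [arg for arg in re.findall(r"\[([^\]]*)\]", match.group(1)) if arg]


def decode_4137_xid(args):
    """Split the transaction id argument of ORA-600 [4137] (assumed usn.slot.sequence)."""
    usn, slot, sequence = (int(part) for part in args[1].split("."))
    return {"undo_segment": usn, "slot": slot, "sequence": sequence}


line = ("ORA-00600: internal error code, arguments: [4137], [25.33.122556], "
        "[0], [0], [], [], [], [], [], [], [], []")
args = ora600_args(line)
print(args)                   # ['4137', '25.33.122556', '0', '0']
print(decode_4137_xid(args))  # {'undo_segment': 25, 'slot': 33, 'sequence': 122556}
```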
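After the damaged undo was dealt with, the database could be started and shut down normally. The data was then exported and imported into a new database, which completed the recovery.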