Category archive: Oracle ASM
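Recovering datafiles lost through a mishandled ASM disk group operation
I recently handled a database recovery case. The customer was replacing their storage: with the database in mount state, they used RMAN copy to move datafiles from an ASM disk group that stored them as OMF files to a new ASM disk group that stored them under aliases. Through operator carelessness, some of the target aliases in the copy statements were duplicated, and once rename file had been executed, some datafiles were lost entirely. Using a low-level fragment scan (see: Recovering a completely corrupted ASM disk header), we fully recovered this customer's data.
Because reproducing the whole process is laborious, the test here uses a tablespace in the data disk group with two OMF-managed datafiles. Both files are copied to the datanew disk group with RMAN copy, but the two files are carelessly given the same alias, and as a result one of the tablespace's datafiles is lost for good.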
Create the test tablespace
Create the OMF-managed xifenfei tablespace in the data disk group, with two datafiles whose file# values are 14 and 15
SQL> create tablespace xifenfei datafile '+DATA' SIZE 128m;

Tablespace created.

SQL> ALTER TABLESPACE XIFENFEI ADD DATAFILE '+DATA' SIZE 128m AUTOEXTEND ON;

Tablespace altered.

SQL> SELECT FILE_NAME,FILE_ID FROM DBA_DATA_FILES WHERE TABLESPACE_NAME='XIFENFEI';

FILE_NAME
--------------------------------------------------------------------------------
   FILE_ID
----------
+DATA/XFF/DATAFILE/xifenfei.276.961143809
        14

+DATA/XFF/DATAFILE/xifenfei.277.961143825
        15
rman copy datafile 14
Use RMAN copy to copy datafile 14 to the datanew disk group, where the target is stored under an alias
RMAN> copy datafile 14 to '+datanew/xifenfei.dbf';

Starting backup at 27-NOV-17
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=24 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00014 name=+DATA/XFF/DATAFILE/xifenfei.276.961143809
output file name=+DATANEW/xifenfei.dbf tag=TAG20171127T082643 RECID=4 STAMP=961144006
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
Finished backup at 27-NOV-17

[grid@localhost ~]$ asmcmd
ASMCMD> cd datanew
ASMCMD> ls
XFF/
xifenfei.dbf
ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
                                            Y    XFF/
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  N    xifenfei.dbf => +DATANEW/XFF/DATAFILE/XIFENFEI.256.961144003
ASMCMD>
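The asmcmd ls output shows that although what we stored in the datanew disk group is an alias, it is really a link to an OMF-style file in ASM. (Essentially every file in ASM is stored OMF-style; the only difference is how the ASM client refers to it: directly by its OMF name, or through an alias.)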
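To make the alias-to-file relationship easier to inspect, the alias rows of an `asmcmd ls -l` listing can be reduced to pairs of alias and underlying file. The following is a minimal sketch, not part of the original test; the function name `list_asm_aliases` is hypothetical, and it assumes the listing has the column layout shown above, with alias rows ending in `<alias> => <system file name>`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "asmcmd ls -l" output on stdin and print one
# "alias system_file" pair per alias row.
list_asm_aliases() {
  # Alias rows end with: <alias> => <fully qualified system-generated name>.
  # The NF guard keeps awk from addressing a negative field on short lines.
  awk 'NF >= 3 && $(NF-1) == "=>" { print $(NF-2), $NF }'
}
```

Assuming asmcmd is run non-interactively, a call might look like `asmcmd ls -l +DATANEW | list_asm_aliases`. Two aliases cannot share one name in a directory, so a second copy to the same alias replaces what the alias pointed to.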
rman copy datafile 15
Use RMAN copy to copy datafile 15 to the same alias that was used for datafile 14
RMAN> copy datafile 15 to '+datanew/xifenfei.dbf';

Starting backup at 27-NOV-17
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00015 name=+DATA/XFF/DATAFILE/xifenfei.277.961143825
output file name=+DATANEW/xifenfei.dbf tag=TAG20171127T082731 RECID=5 STAMP=961144053
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
Finished backup at 27-NOV-17

ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
                                            Y    XFF/
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  N    xifenfei.dbf => +DATANEW/XFF/DATAFILE/XIFENFEI.256.961144003
ASMCMD> cd xff
ASMCMD> ls
DATAFILE/
ASMCMD> cd datafile
ASMCMD> ls
XIFENFEI.256.961144003
ASMCMD>
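This shows that in the datanew disk group, file 14 has been overwritten by file 15: after two copies, +DATANEW/XFF/DATAFILE still holds only one file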
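A check of the planned copy targets before running any RMAN command would have caught this overwrite. The sketch below is an illustration and not something the original case used. The function name `find_duplicate_targets` and the input format (one `file# target` pair per line) are assumptions. Targets are compared case-insensitively because ASM file names are not case sensitive.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: read "file# target" pairs on stdin and
# print every target name that is used more than once (lowercased).
find_duplicate_targets() {
  awk '{ t = tolower($2); n[t]++ }
       END { for (t in n) if (n[t] > 1) print t }' | sort
}

# Example with the two copies from this test:
printf '%s\n' '14 +datanew/xifenfei.dbf' '15 +datanew/xifenfei.dbf' |
  find_duplicate_targets
```

Any line printed by the check means at least two source files would land on the same alias, so the copy plan should be corrected before it is executed.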
rename file
Rename the datafiles in the data disk group to the datanew disk group
SQL> alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf';

Database altered.

SQL> alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+datanew/xifenfei.dbf';
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+datanew/xifenfei.dbf'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01523: cannot rename data file to '+data/xifenfei.dbf' - file already part of database
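Here file 14 is renamed successfully, but the rename of file 15 fails because the database already has a datafile registered under that alias (that datafile path)
OMF automatic file deletion
Looking at the original disk group, data, we find that datafile 14 has been deleted automatically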
ASMCMD> pwd
+DATA/XFF/DATAFILE
ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    SYSAUX.257.942061433
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    SYSTEM.256.942061393
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    UNDOTBS1.258.942061449
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    USERS.259.942061449
DATAFILE  UNPROT  COARSE   NOV 27 08:00:00  Y    XIFENFEI.277.961143825
ASMCMD>
The alert log confirms that the datafile was deleted
2017-11-27T09:05:03.054741-05:00
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf'
2017-11-27T09:05:03.114947-05:00
NOTE: Under CF enqueue, no dependency request for disk group DATANEW
Deleted Oracle managed file +DATA/XFF/DATAFILE/xifenfei.276.961143809
Completed: alter database rename file '+DATA/XFF/DATAFILE/xifenfei.276.961143809' to '+datanew/xifenfei.dbf'
2017-11-27T09:05:21.471474-05:00
alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to '+data/xifenfei.dbf'
ORA-1511 signalled during:alter database rename file '+DATA/XFF/DATAFILE/xifenfei.277.961143825' to'+datanew/xifenfei.dbf'
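When reviewing an incident like this, the "Deleted Oracle managed file" entries are the lines worth pulling out of the alert log first. Here is a minimal sketch, assuming the message text matches what the log above shows; the function name `deleted_omf_files` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every file that the alert log
# reports as "Deleted Oracle managed file ...". Reads the files given as
# arguments, or stdin when no file is given.
deleted_omf_files() {
  sed -n 's/^.*Deleted Oracle managed file //p' "$@"
}
```

Run against the excerpt above, it prints the single deleted file, +DATA/XFF/DATAFILE/xifenfei.276.961143809.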
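This confirms that for OMF-managed datafiles, executing rename file automatically deletes the old datafile. At this point the damage is done: the RMAN copy overwrote datafile 14 in the datanew disk group, and rename file then caused datafile 14 in the data disk group to be deleted automatically, so datafile 14 is gone from both disk groups. By conventional means, the file cannot be recovered without a suitable backup.
If you run into lost or partially overwritten datafiles in Oracle ASM, preserve the scene and contact us (Oracle database recovery technical support) for professional database technical support: Phone: 17813235971 QQ: 107644445 E-Mail: dba@xifenfei.com. We will do our best to rescue as much of your data as possible.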
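Recovering lost ASM disk partitions
A friend reported that after setting up active-active on xx storage, GI failed to start following a host reboot. Analysis showed that the partition information on every disk from that storage was gone, so asmlib could not discover the disks (partitions were used as the ASM disks).
The errors looked like the following (missing disk partitions)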
--fdisk -l (partial output)
Disk /dev/mapper/datahds1: 1099.5 GB, 1099511627776 bytes
255 heads, 63 sectors/track, 133674 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

--ls -l /dev/mapper/ shows no partition information
lrwxrwxrwx 1 root root 7 May 6 03:44 datahds1 -> ../dm-1
lrwxrwxrwx 1 root root 7 May 6 03:26 datahds2 -> ../dm-3
lrwxrwxrwx 1 root root 7 May 6 03:26 datahds3 -> ../dm-8
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds1 -> ../dm-0
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds2 -> ../dm-2
lrwxrwxrwx 1 root root 7 May 6 03:26 ocrhds3 -> ../dm-4
The ASM log shows
SUCCESS: diskgroup DATADG was mounted
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
SUCCESS: diskgroup OCRHDS was mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
Analyzing the system log
May 6 02:23:27 db2 kernel: sdb: unknown partition table
May 6 02:23:27 db2 kernel: sde: unknown partition table
May 6 02:23:27 db2 kernel: sdc: unknown partition table
May 6 02:23:27 db2 kernel: sdf: unknown partition table
May 6 02:23:27 db2 kernel: sdd: unknown partition table
May 6 02:23:27 db2 kernel: sdj:Dev sdj: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdi: sdi1
May 6 02:23:27 db2 kernel: sdk: sdk1
May 6 02:23:27 db2 kernel: sdg: unknown partition table
May 6 02:23:27 db2 kernel: sdl: sdl1
May 6 02:23:27 db2 kernel: sdm:Dev sdm: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdo:Dev sdo: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdn:Dev sdn: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdp:Dev sdp: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sds:Dev sds: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdh:
May 6 02:23:27 db2 kernel: sdt: sdt1
May 6 02:23:27 db2 kernel: sdv:Dev sdv: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdq:Dev sdq: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sd 1:0:1:9: [sdr] Very big device. Trying to use READ CAPACITY(16).
May 6 02:23:27 db2 kernel: sdr:Dev sdr: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sd 2:0:0:9: [sdab] Very big device. Trying to use READ CAPACITY(16).
May 6 02:23:27 db2 kernel: sdab: unknown partition table
May 6 02:23:27 db2 kernel: sdac: unknown partition table
May 6 02:23:27 db2 kernel: sdw: sdw1
May 6 02:23:27 db2 kernel: sdu:Dev sdu: unable to read RDB block 0
May 6 02:23:27 db2 kernel: unable to read partition table
May 6 02:23:27 db2 kernel: sdx: sdx1
May 6 02:23:27 db2 kernel: sdy: sdy1
May 6 02:23:27 db2 kernel: sdaa: sdaa1
May 6 02:23:27 db2 kernel: sdz: sdz1
May 6 02:23:27 db2 kernel: sdae: unknown partition table
May 6 02:23:27 db2 kernel: sdaf: unknown partition table
May 6 02:23:27 db2 kernel: sdag: unknown partition table
May 6 02:23:27 db2 kernel: sdai:
May 6 02:23:27 db2 kernel: sdah: unknown partition table
May 6 02:23:27 db2 kernel: sdad: unknown partition table
May 6 02:23:28 db2 mcelog: failed to prefill DIMM database from DMI data
The error is fairly clear: "unknown partition table" means the disks' partition information is damaged, and fdisk cannot find any partitions
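With this many devices, it helps to reduce the kernel messages to a plain list of affected disks. The sketch below assumes the syslog format shown above and only matches the "unknown partition table" message. Devices reported with "unable to read RDB block 0" are not included. The function name `disks_without_partition_table` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the sd* devices for which the kernel logged
# "unknown partition table". Reads the files given as arguments, or stdin.
disks_without_partition_table() {
  sed -n -E 's/.*kernel: (sd[a-z]+): unknown partition table.*/\1/p' "$@" |
    sort -u
}
```

A possible call is `disks_without_partition_table /var/log/messages`, where the log path is an assumption that depends on the distribution.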
partprobe does not help either
[root@db2 oracle]# partprobe /dev/mapper/ocrhds3
[root@db2 oracle]#
[root@db2 oracle]# ls -l /dev/mapper/ocrhds3*
lrwxrwxrwx 1 root root 7 May 6 07:30 /dev/mapper/ocrhds3 -> ../dm-4
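From the information above, the disks' partition tables are most likely damaged. All that can be done now is hope that the actual data inside the partitions is intact
Analyzing the actual partition data on disk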
[root@db2 ~]$ dd if=/dev/mapper/datahds1 of=/tmp/datahds1.dd bs=1024k count=50
[root@db2 ~]$ dd if=/tmp/datahds1.dd of=/tmp/xff01.dd bs=3225 skip=1
[grid@db2 ~]$ kfed read /tmp/xff01.dd |more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]:   1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]:    826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DATADG_0000 ; 0x028: length=11
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:              DATADG_0000 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33050696 ; 0x0a8: HOUR=0x8 DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.crestmp.lo:           3813740544 ; 0x0ac: USEC=0x0 MSEC=0x44 SECS=0x35 MINS=0x38
kfdhdb.mntstmp.hi:             33050701 ; 0x0b0: HOUR=0xd DAYS=0x2 MNTH=0x4 YEAR=0x7e1
kfdhdb.mntstmp.lo:            411385856 ; 0x0b4: USEC=0x0 MSEC=0x150 SECS=0x8 MINS=0x6
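From this analysis we can tentatively conclude that the data in the partition is very likely intact: the ASM disk header is good, and since overwrites generally proceed from the front of a disk toward the back, a good header makes it very unlikely that later blocks were overwritten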
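The byte offset at which the old partition began decides whether kfed sees a valid header, so it is worth locating it from the data itself rather than assuming it. In the kfed output above, the 32-byte kfbh block header is followed directly by kfdhdb.driver.provstr, which begins with the string ORCLDISK. The following sketch, which is not the author's method, searches a dd image for that marker and subtracts 32. The function name `find_asm_header_offset` is hypothetical, and it assumes grep supports the `-a`, `-b` and `-o` options (GNU grep does).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the byte offset of the first candidate ASM disk
# header inside a dd image, or return non-zero if no marker is found.
find_asm_header_offset() {
  local image=$1 hit
  # -a: treat binary data as text, -b: print byte offset, -o: of the match
  hit=$(LC_ALL=C grep -abo 'ORCLDISK' "$image" | head -n 1 | cut -d: -f1)
  # The marker sits 32 bytes into the header block (after kfbh).
  [ -n "$hit" ] && [ "$hit" -ge 32 ] || return 1
  echo $((hit - 32))
}
```

With the geometry fdisk reports above (63 sectors per track, 512-byte sectors), a first partition created by older fdisk versions would normally start at 63 × 512 = 32256 bytes. The offset found this way, and the value passed to dd as `bs` together with `skip=1`, should be checked against the real disk. The first hit is only a candidate: the marker string can appear elsewhere, so confirm it by checking that kfed reports KFBTYP_DISKHEAD at that offset.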
Prepare new disks and dd the partition contents directly onto the new devices
dd if=/dev/mapper/ocrhds1 of=/dev/mapper/ocrhdsnew1 skip=1 bs=3225
dd if=/dev/mapper/ocrhds2 of=/dev/mapper/ocrhdsnew2 skip=1 bs=3225
dd if=/dev/mapper/ocrhds3 of=/dev/mapper/ocrhdsnew3 skip=1 bs=3225
dd if=/dev/mapper/datahds1 of=/dev/mapper/datahdsnew1 skip=1 bs=3225
dd if=/dev/mapper/datahds2 of=/dev/mapper/datahdsnew2 skip=1 bs=3225
dd if=/dev/mapper/datahds3 of=/dev/mapper/datahdsnew3 skip=1 bs=3225
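The post verifies the copies with kfed. As an additional, optional byte-level check, the start of each new device can be compared with the corresponding region of the source. This is a sketch under stated assumptions: the function name `copy_matches` and its parameters are hypothetical, and `offset` must be the same value that was used as `bs` with `skip=1` in the copy.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed if the first <nbytes> bytes of <dst> equal the
# <nbytes> bytes of <src> that start at byte <offset>.
copy_matches() {
  local src=$1 offset=$2 dst=$3 nbytes=$4 a b
  # Skip one block of <offset> bytes, exactly as the copy commands do.
  a=$(dd if="$src" bs="$offset" skip=1 2>/dev/null | head -c "$nbytes" | cksum)
  b=$(head -c "$nbytes" "$dst" | cksum)
  [ "$a" = "$b" ]
}
```

Comparing only a prefix, for example the first few megabytes, keeps the check fast on large devices. It confirms that the copy started at the intended offset, not that every block was transferred.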
Rescan the disks with asmlib
[root@db1 disks]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "HDSOCR3"
Instantiating disk "HDSDATA2"
Instantiating disk "HDSDATA1"
Instantiating disk "HDSDATA3"
Instantiating disk "HDSOCR1"
Instantiating disk "HDSOCR2"
[root@db1 disks]# ls -ltr
total 0
brw-rw---- 1 grid asmadmin 8, 160 May 6 13:49 HDSOCR3
brw-rw---- 1 grid asmadmin 8, 192 May 6 13:49 HDSDATA2
brw-rw---- 1 grid asmadmin 8, 176 May 6 13:49 HDSDATA1
brw-rw---- 1 grid asmadmin 8, 208 May 6 13:49 HDSDATA3
brw-rw---- 1 grid asmadmin 8, 128 May 6 13:49 HDSOCR1
brw-rw---- 1 grid asmadmin 8, 144 May 6 13:49 HDSOCR2
Verify the copied partition with kfed
[root@db2 tmp]# /oracle/app/11.2.0/grid_1/bin/kfed read /dev/oracleasm/disks/HDSDATA1
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3110278718 ; 0x00c: 0xb963163e
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKHDSDATA1 ; 0x000: length=16
kfdhdb.driver.reserved[0]:   1146307656 ; 0x008: 0x44534448
kfdhdb.driver.reserved[1]:    826364993 ; 0x00c: 0x31415441
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DATADG_0000 ; 0x028: length=11
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:              DATADG_0000 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
ASM and the database start normally
[grid@db2 ~]$ asmcmd
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576   3145710  2378034                0         2378034              0             N  DATADG/
MOUNTED  NORMAL  N         512   4096  1048576     15342    14416             5114            4651              0             Y  OCRHDS/
ASMCMD>

[oracle@db2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat May 6 13:54:21 2017

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 3.6077E+10 bytes
Fixed Size                  2260648 bytes
Variable Size            7247757656 bytes
Database Buffers         2.8723E+10 bytes
Redo Buffers              104382464 bytes
Database mounted.
Database opened.
SQL>
With the recovery above, the lost ASM disk partitions were restored with zero data loss
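If you run into this kind of situation and cannot resolve it, please contact us for professional Oracle database recovery technical support
Phone: 17813235971 QQ: 107644445 E-Mail: dba@xifenfei.com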