分类目录归档:Oracle

aix磁盘损坏oracle数据库恢复

客户aix环境硬盘异常导致系统无法启动,初步判断是数据文件存放在本地磁盘的空间中(本地两个盘都异常,系统无法启动),通过硬件恢复厂商镜像出来,但是通过aix文件系统直接挂载提示需要fsck,但是做fsck之后,提示大量文件丢失(最关键的数据文件和备份文件都被自动删除)
dmp-remove
fsck-remove


基于这种情况,采用镜像主机挂载的方式肯定不行,考虑直接采用软件直接解析,能够看到软件,可惜由于大量的文件系统元数据损坏,解析出来的数据文件和dmp也不可用(大量损坏和空块)
QQ20250307-122939

基于上述情况,只能采用碎片级别恢复出来数据文件
QQ20250307-123123

然后使用dul工具把数据恢复到表中,实现最大限度抢救客户数据
QQ20250307-123653

对于数据库级别恢复,这个是理论上的终极恢复方法

发表在 Oracle, 非常规恢复 | 标签为 , | 留下评论

linux rm -rf 删除数据文件恢复

有客户由于误操作删除了oracle的部分数据文件(rm -rf 方式删除),然后自己尝试进行恢复操作,对部分文件执行了offline,导致比较麻烦的后果
jb
offline-file


接手故障之后,第一时间对其进行了镜像(因为有部分文件句柄已经释放,为了方式覆盖进一步破坏),对于没有释放的句柄可以通过类似方法进行恢复,参考以前类似恢复:
Oracle误删除数据文件恢复
Solaris rm datafile recovery—利用句柄误删除数据文件恢复

!cp  269  /u01/app/oracle/oradata/orcl/XXXXXX_DATA01.dbf
alter database datafile 12 offline;
recover datafile 12;
alter database datafile 12 online;

对于删除文件,而且句柄已经释放的文件,通过文件系统层面反删除进行恢复,参考以前类似恢复:
rm -rf误删Oracle数据库恢复
记录一次rm -rf 删除数据文件异常恢复
rm -rf 删除数据文件恢复方法—文件系统反删除+oracle碎片重组
rm


在这个恢复过程中,由于客户linux是物理机,而且本地空间不足,无法对其进行镜像,采用dd命令直接写镜像到其他的linux机器上(通过nfs方式),然后在win机器上直接挂载该nfs,记录下win上挂载nfs操作
nfs
mount-nfs

发表在 Oracle备份恢复 | 标签为 , , , | 评论关闭

近1万个数据文件的恢复case

朋友介绍一个恢复case,数据库发生过硬件故障,做过硬件恢复之后,数据库无法正常启动.我恢复的已经不是第一现场,客户和我反馈说找过三批人进行恢复,都没有正常打开数据库.数据库整体不大(1T左右),但是数据文件近1万个(9895个数据文件),我看了下alert日志,主要报错有:
ORA-600 kcratr_scan_lastbwr,该错误比较常见,一般是由于坏块或者redo和数据文件不匹配导致,在某些情况下recover下就可以解决,有些时候不行,看人品

Mon Feb 17 15:51:15 2025
Started redo scan
Hex dump of (file 3, block 240) in trace file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc
Reading datafile 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF' for corruption at rdba: 0x00c000f0 (file 3, block 240)
Reread (file 3, block 240) found same corrupt data (logically corrupt)
Write verification failed for File 3 Block 240 (rdba 0xc000f0)
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc  (incident=293029):
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details in: F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_293029\orcl_ora_10508_i293029.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Aborting crash recovery due to error 600
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Mon Feb 17 15:51:22 2025
Sweep [inc2][293029]: completed
Mon Feb 17 15:51:25 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10508.trc:
ORA-00600: 内部错误代码, 参数: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []

ORA-600 krr_parse_3错误,官方没有查询到资料,但是从报错的位置分析,应该和redo的应用有直接关系

Thu Feb 20 11:45:03 2025
ALTER DATABASE RECOVER  datafile 116  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321616):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 116  ...
ALTER DATABASE RECOVER  datafile 1168  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2084282 Reading mem 0
  Mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_10840.trc  (incident=321617):
ORA-00600: 内部错误代码, 参数: [krr_parse_3], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1168  ...

上述两个错误,由于数据库部分文件被offline,而且屏蔽一致性打开等操作,绕过了上述的两个ORA-600错误,现在停留在ORA-00604 ORA-00376 ORA-01110故障导致数据库无法打开的情况,该错误是由于数据库启动过程中有事务,需要使用被offline的undo文件.

Fri Feb 21 07:42:37 2025
minact-scn: got error during useg scan e:376 usn:1
minact-scn: useg scan erroring out with error e:376
Fri Feb 21 07:44:02 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m007_11464.trc:
ORA-51106: 由于出错, 检查无法完成。请查看下面的错误
ORA-48223: 已请求中断 - 提取已中止 - 返回代码 [12751] [HM_FINDING]
Fri Feb 21 07:45:12 2025
Errors in file F:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_14108.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00376: 此时无法读取文件 3
ORA-01110: 数据文件 3: 'F:\ORACLEDATA\ORCL\UNDOTBS01.DBF'

分析数据库文件状态,有25个数据文件被offline,而且这些文件的resetlogs信息均不对(截取了部分文件)

SQL> set lines 150
SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
OFFLINE        2025-02-11 15:27:00                      YES            1909526545               22
OFFLINE        2025-02-17 17:24:14                      YES            1909551234                2
OFFLINE        2025-02-17 17:27:35                      NO             1909551234                1
ONLINE         2025-02-22 17:29:25                      YES            2095190672             9869

wrong-resetlogs
offline
对于这种情况,最简单的解决方法就是使用开发的小工具Oracle Recovery Tools(Oracle Recovery Tools工具一键解决ORA-00376 ORA-01110故障(文件offline)),对这些offline的文件头信息进行修改
ora-tool
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复
尝试直接open数据库

SQL> STARTUP MOUNT PFILE='D:/PFILE.TXT'
ORACLE 例程已经启动。

Total System Global Area      82309009408 bytes
Fixed Size                        2290160 bytes
Variable Size                 12884905488 bytes
Database Buffers              69256347648 bytes
Redo Buffers                    165466112 bytes
数据库装载完毕。
SQL> RECOVER DATAFILE 3;
完成介质恢复。
SQL> RECOVER datafile 6601,7043,7044,7045,7050,
   7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
  ,7188,7190,7191,7192,7244,9501 ;
完成介质恢复。
SQL> alter database datafile 3,6601,7043,7044,7045,7050,
  7053,7054,7055,7056,7059,7060,7061,7062,7063,7064,7071,7072,7187
 ,7188,7190,7191,7192,7244,9501 online;
SQL> ALTER DATABASE OPEN;

数据库已更改。

QQ20250222-184800

Sat Feb 22 18:38:26 2025
alter database mount exclusive
Sat Feb 22 18:38:26 2025
MMNL started with pid=25, OS id=7524 
Successful mount of redo thread 1, with mount id 3367723362
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Sat Feb 22 18:42:34 2025
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: F:\ORACLEDATA\ORCL\REDO02.LOG
Successful open of redo thread 1
Sat Feb 22 18:42:34 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 22 18:42:34 2025
SMON: enabling cache recovery
[7960] Successfully onlined Undo Tablespace 12273.
Undo initialization finished serial:0 start:98760972 end:98761612 diff:640 (6 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Feb 22 18:42:41 2025
QMNC started with pid=29, OS id=8116 
Sat Feb 22 18:42:45 2025
Completed: alter database open
Sat Feb 22 18:42:47 2025
Starting background process CJQ0
Sat Feb 22 18:42:47 2025
CJQ0 started with pid=31, OS id=3264 
Sat Feb 22 18:42:47 2025
db_recovery_file_dest_size of 4977 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.

数据库已经open,后续收尾工作比较简单,不再累赘.
对于这类缺少归档数据文件offline的故障Oracle Recovery Tools可以快速傻瓜式恢复,还是比较方便的
软件下载:OraRecovery下载
使用说明:使用说明

发表在 非常规恢复 | 标签为 , , , , | 评论关闭