联系:手机/微信(+86 17813235971) QQ(107644445)
标题:aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
接到朋友恢复请求,aix 5.3,Oracle 10.2.0.1平台,数据库启动报ORA-01115 ORA-01110 ORA-27067错误,数据库无法正常打开,通过分析,是由于10201在aix上面的bug导致,通过技巧规避,完美解决给问题,数据0丢失
数据库报错alert日志
Mon Aug 10 13:25:22 2015 ALTER DATABASE MOUNT Mon Aug 10 13:25:29 2015 Setting recovery target incarnation to 1 Mon Aug 10 13:25:29 2015 Successful mount of redo thread 1, with mount id 432339141 Mon Aug 10 13:25:29 2015 Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Mon Aug 10 13:25:36 2015 alter database open Mon Aug 10 13:25:36 2015 Beginning crash recovery of 1 threads parallel recovery started with 15 processes Mon Aug 10 13:25:37 2015 Started redo scan Mon Aug 10 13:25:52 2015 Completed redo scan 7889582 redo blocks read, 75305 data blocks need recovery Mon Aug 10 13:25:53 2015 Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc: ORA-01115: IO error reading block from file 2 (block # 40704) ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf' ORA-27067: size of I/O buffer is invalid Additional information: 2 Additional information: 1572864 Mon Aug 10 13:25:53 2015 Aborting crash recovery due to slave death, attempting serial crash recovery Mon Aug 10 13:25:53 2015 Beginning crash recovery of 1 threads Mon Aug 10 13:25:53 2015 Started redo scan Mon Aug 10 13:26:09 2015 Completed redo scan 7889582 redo blocks read, 75305 data blocks need recovery Mon Aug 10 13:26:12 2015 Aborting crash recovery due to error 1115 Mon Aug 10 13:26:12 2015 Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc: ORA-01115: IO error reading block from file 2 (block # 39077) ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf' ORA-27067: size of I/O buffer is invalid Additional information: 2 Additional information: 1310720 ORA-1115 signalled during: alter database open...
这里报的前面两个错误ORA-01115 ORA-01110我们都非常熟悉,类似数据库启动遇到坏块或者io错误之时可能就会报如此错误。但是ORA-27067确实不多见,从mos上看,很多是由于rman备份之时的bug可能导致该错误。
dbv检测undo坏块文件
DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015 Copyright (c) 1982, 2003, Oracle and/or its affiliates. All rights reserved. DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf DBVERIFY - Verification complete Total Pages Examined : 329600 Total Pages Processed (Data) : 0 Total Pages Failing (Data) : 0 Total Pages Processed (Index): 0 Total Pages Failing (Index): 0 Total Pages Processed (Other): 327504 Total Pages Processed (Seg) : 17 Total Pages Failing (Seg) : 0 Total Pages Empty : 2096 Total Pages Marked Corrupt : 0 Total Pages Influx : 0 Total Pages Encrypted : 0 Highest block SCN : 1887888 (0.1887888)
这里可以看到,undo文件本身并没有逻辑和物理的坏块,证明因为数据库异常的原因,可能是由于ORA-27067: size of I/O buffer is invalid导致。根据官方文档ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH中的解释,我们基本上可以确定很可能是由于10.2.0.1在aix平台的jfs2系统中,由于大量事务操作,突然abort掉数据库(也可能断电),从而数据库在启动的时候进行实例恢复,而由于内部的bug,导致实例恢复无法成功。通过我们处理后的,数据库完美启动,数据0丢失
数据库启动日志
Mon Aug 10 16:34:14 2015 alter database open Mon Aug 10 16:34:14 2015 Beginning crash recovery of 1 threads parallel recovery started with 15 processes Mon Aug 10 16:34:14 2015 Started redo scan Mon Aug 10 16:34:27 2015 Completed redo scan 7889582 redo blocks read, 0 data blocks need recovery Mon Aug 10 16:34:27 2015 Started redo application at Thread 1: logseq 664704, block 1286922 Mon Aug 10 16:34:27 2015 Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0 Mem# 0 errs 0: /dev/rredo04 Mon Aug 10 16:34:32 2015 Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0 Mem# 0 errs 0: /dev/rredo05 Mon Aug 10 16:34:38 2015 Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0 Mem# 0 errs 0: /dev/rredo06 Mon Aug 10 16:34:40 2015 Completed redo application Mon Aug 10 16:34:40 2015 Completed crash recovery at Thread 1: logseq 664706, block 1017805, scn 8554793334 0 data blocks read, 0 data blocks written, 7889582 redo blocks read Mon Aug 10 16:34:40 2015 Thread 1 advanced to log sequence 664707 Thread 1 opened at log sequence 664707 Current log# 1 seq# 664707 mem# 0: /dev/rredo01 Successful open of redo thread 1 Mon Aug 10 16:34:40 2015 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Mon Aug 10 16:34:40 2015 SMON: enabling cache recovery Mon Aug 10 16:34:40 2015 Successfully onlined Undo Tablespace 1. Mon Aug 10 16:34:40 2015 SMON: enabling tx recovery Mon Aug 10 16:34:41 2015 Database Characterset is ZHS32GB18030 replication_dependency_tracking turned off (no async multimaster replication found) WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected. Mon Aug 10 16:34:41 2015 SMON: Parallel transaction recovery tried Mon Aug 10 16:34:42 2015 db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup. Mon Aug 10 16:34:42 2015 Completed: alter database open
小机的文件拷贝到linux上面不能ok,如果只是做文件修复可以
内部bug啊。。。这个问题只有10.2.1 才会有么
小机跑一个单机的10.2.1 感觉好奢侈,前辈这个问题如果把文件全部拷贝到linux上是不是更好处理,然后再拷贝回来。