Data Guard出现gap sequence修复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Data Guard出现gap sequence修复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一、出现gap sequence现象
备库

Fetching gap sequence in thread 1, gap sequence 710-716
Tue May 31 15:02:38 2011
FAL[client]: Failed to request gap sequence 
 GAP - thread 1 sequence 710-716
 DBID 3240478808 branch 746916894
FAL[client]: All defined FAL servers have been attempted.
-------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that is sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
-------------------------------------------------------------

主库

Tue May 31 13:50:47 2011
FAL[server]: Fail to queue the whole FAL gap
 GAP - thread 1 sequence 710-716
 DBID 3240478808 branch 746916894

二、修复操作
1、查询备库的scn

SQL> SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
-----------
    1154337
--在出现意外datafile header scn不一致的时候,需要根据提示归档日志,找出最小scn

2、确定主库是否添加数据文件

SQL> select FILE#,name from v$datafile where CREATION_CHANGE#> =1154337;
no rows selected

确定主库在这个scn之后是否有添加数据文件,如果添加文件,需要手工在备库添加

3、备库停止日志应用

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

4、主库增量备份并传输到备库上
主库进行增量备份

RMAN> BACKUP INCREMENTAL FROM SCN  1154337 DATABASE 
FORMAT '/home/oracle/xff_%U' tag 'XIFENFEI';
[oracle@localhost ~]$ scp xff* 192.168.1.30:/home/oracle/rman

说明:主库之前必须要做过rman的全备(没有全备的库,基于scn的增量备份也能够成功)

5、备库上进行恢复

RMAN> CATALOG START WITH '/home/oracle/rman';
RMAN> RECOVER DATABASE NOREDO;

说明:CATALOG START WITH是10g及其以后版本中才存在功能,没有该功能可以采用catalog或者复制主库的控制文件,rman备份放置和主库备份时相同目录实现。

6、主库上创建standby controlfile文件并传输到备库

RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY 
FORMAT '/home/oracle/xff_ctl.bck';
[oracle@localhost ~]$ scp xff_ctl.bck 192.168.1.30:/home/oracle/rman

创建standby controlfile两步可以需要根据实际情况考虑,大多数情况下不需要

7、备库恢复控制文件

RMAN> shutdown;
RMAN> STARTUP NOMOUNT;
RMAN> RESTORE STANDBY CONTROLFILE FROM '/home/oracle/rman/xff_ctl.bck';
RMAN> alter database mount;

8、清空备库日志组

SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1; 
注:如果采用了standby log模式,不需要清空,如果清空会出现
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/u01/oradata/xienfei/redo01.log'

说明:如果没有采用standby log模式,有几组需要清空几组

9、备库重设flashback

SQL> ALTER DATABASE FLASHBACK OFF; 
SQL> ALTER DATABASE FLASHBACK ON;

10、备库重新接收并应用日志

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

三、修复成功标志
1、sql中操作
在主库中执行alter system switch logfile;
分别主备库中执行select max(sequence#) from v$archived_log;如果一致标示修复成功

2、通过alert文件
主库

PING[ARC0]: Error 3113 when pinging standby xff.
Tue May 31 14:11:51 2011
Thread 1 advanced to log sequence 719
  Current log# 3 seq# 719 mem# 0: /u01/oradata/xienfei/redo03.log
Tue May 31 14:20:05 2011
Thread 1 advanced to log sequence 720
  Current log# 1 seq# 720 mem# 0: /u01/oradata/xienfei/redo01.log
Tue May 31 14:20:16 2011
ARC0: Standby redo logfile selected for thread 1 sequence 719 for destination LOG_ARCHIVE_DEST_2

备库

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
Tue May 31 15:30:37 2011
Attempt to start background Managed Standby Recovery process (xff)
MRP0 started with pid=18, OS id=14704
Tue May 31 15:30:37 2011
MRP0: Background Managed Standby Recovery process started (xff)
Managed Standby Recovery not using Real Time Apply
 parallel recovery started with 2 processes
Media Recovery Log /u01/archive/1_718_746916894.arc
Tue May 31 15:30:43 2011
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
Tue May 31 15:30:52 2011
RFS[1]: Successfully opened standby log 4: '/u01/oradata/xienfei/s_redo1.log'
Media Recovery Log /u01/archive/1_719_746916894.arc
Media Recovery Waiting for thread 1 sequence 720
此条目发表在 Data Guard 分类目录。将固定链接加入收藏夹。

Data Guard出现gap sequence修复》有 2 条评论

  1. 惜分飞 说:

    今天在帮朋友处理类此问题的时候,基于scn恢复好数据库后,因为备库没有standby redo log file 且没有清理备库的redo log,出现如下错误

    Thu Apr 19 22:01:36 2012
    Thread 1 cannot allocate new log, sequence 2031
    Private strand flush not complete
      Current log# 1 seq# 2030 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
    Thread 1 advanced to log sequence 2031
      Current log# 2 seq# 2031 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log
    Thu Apr 19 22:01:37 2012
    Errors in file /u01/app/oracle/admin/orcl/bdump/orcl_lns1_7012.trc:
    ORA-00270: error creating archive log 
    Thu Apr 19 22:01:37 2012
    LGWR: Error 270 closing archivelog file 'standby'
    Thu Apr 19 22:01:44 2012
    Thread 1 cannot allocate new log, sequence 2032
    Private strand flush not complete
      Current log# 2 seq# 2031 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log
    Thread 1 advanced to log sequence 2032
      Current log# 3 seq# 2032 mem# 0: /u01/app/oracle/oradata/orcl/redo03.log
    

    trace文件内容如

    *** 2012-04-19 20:51:47.096 57126 kcrr.c
    LNS1: initialized successfully ASYNC=1
    Destination is specified with ASYNC=61440
    *** 2012-04-19 20:51:48.125 70905 kcrr.c
    Sending online log thread 1 seq 2025 [logfile 2] to standby
    Opening logfile [logno 2]
    LGWR: Archivelog for thread 1 sequence 2025 will NOT be compressed
    Redo shipping client performing standby login
    *** 2012-04-19 20:51:48.174 64561 kcrr.c
    Logged on to standby successfully
    Client logon and security negotiation successful!
    Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode
    Archiving to destination standby ASYNC blocks=20480
    Allocate ASYNC blocks: Previous blocks=0 New blocks=20480
    Log file opened [logno 2]
    *** 2012-04-19 20:52:03.210
    *** 2012-04-19 20:52:03.210 71706 kcrr.c
    Detected proper completion of online log
    *** 2012-04-19 20:52:03.210 71549 kcrr.c
    Detected physical end of file
    Online entry still exists for log 2 thr 1 seq 2025. Using it
    Closing thread 1 sequence 2025 logno 2 nab 2 los 65825433 nxs 65825440
    *** 2012-04-19 20:52:03.219 70905 kcrr.c
    Sending online log thread 1 seq 2026 [logfile 3] to standby
    Opening logfile [logno 3]
    LGWR: Archivelog for thread 1 sequence 2026 will NOT be compressed
    Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode
    Archiving to destination standby ASYNC blocks=20480
    Log file opened [logno 3]
    Attempting to send buffer 0x0x2aad8decb000  start block 2  block count 1
        total size to send 512  blocksize 512
    *** 2012-04-19 20:52:11.234 71706 kcrr.c
    Detected proper completion of online log
    *** 2012-04-19 20:52:11.234 71549 kcrr.c
    Detected physical end of file
    Attempting to send buffer 0x0x2aad8decb000  start block 3  block count 1
        total size to send 512  blocksize 512
    Online entry still exists for log 3 thr 1 seq 2026. Using it
    Closing thread 1 sequence 2026 logno 3 nab 4 los 65825440 nxs 65825443
    

    执行如下命令问题解决

    SQL> select group# from v$log;
    
        GROUP#
    ----------
             1
             3
             2
    
    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    
    Database altered.
    
    SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
    
    Database altered.
    
    SQL> ALTER DATABASE CLEAR LOGFILE GROUP 2;    
    
    Database altered.
    
    SQL> ALTER DATABASE CLEAR LOGFILE GROUP 3;
    
    Database altered.
    
    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
    
    Database altered.
    
  2. 惜分飞 说:
    MRP0: Background Managed Standby Recovery process started (orcl)
    Managed Standby Recovery not using Real Time Apply
    Media Recovery Waiting for thread 1 sequence 2017
    Fetching gap sequence in thread 1, gap sequence 2017-2018
    Thu Apr 19 20:37:25 2012
    FAL[server]: Fail to queue the whole FAL gap
     GAP - thread 1 sequence 2017-2018
     DBID 1298106628 branch 770325894
    

    当查询v$datafile scn,database scn,datafile_header.scn发现均不一致,这个时候不能使用CURRENT_SCN来作为scn增量备份的起点,这个时候需要根据缺少日志最下sequence#找出scn,然后作为增量备份的起点.

    找出scn起点

    SQL> select FIRST_CHANGE#  from v$archived_log where SEQUENCE#  =2017;
    
     FIRST_CHANGE# 
     ------------- 
          65727543  
    

    所以使用gap中提示的sequence#找出起始scn比CURRENT_SCN 更加靠谱(出现scn不一致情况,一般人工不正常干预dg导致

    BACKUP INCREMENTAL FROM SCN 65727543 DATABASE
    FORMAT '/home/oracle/xff_%U' tag 'XIFENFEI';