分类目录归档:数据库

删除asmlib磁盘导致磁盘组故障恢复

有客户执行drop disk磁盘组操作之后,然后立刻从oracle asmlib层面执行了oracleasm deletedisk,并且在操作系统层面delete partition(删除磁盘分区),导致磁盘组直接dismount

Tue Nov 26 16:44:04 2024
SQL> alter diskgroup data drop disk DATA_0008 
NOTE: GroupBlock outside rolling migration privileged region
Tue Nov 26 08:44:05 2024
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
Tue Nov 26 08:44:14 2024
GMON querying group 2 at 48 for pid 18, osid 27385
SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
SUCCESS: alter diskgroup data drop disk DATA_0008
NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
Starting background process ARB0
Tue Nov 26 08:44:14 2024
ARB0 started with pid=38, OS id=56987 
NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
Tue Nov 26 08:44:17 2024
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Nov 26 08:44:57 2024
cellip.ora not found.
Tue Nov 26 17:08:46 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 17:10:30 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 09:34:38 2024
WARNING: cache read  a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
Tue Nov 26 09:34:38 2024
GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Nov 26 09:34:38 2024
NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA) 
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
Tue Nov 26 09:34:38 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc  (incident=413105):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
Tue Nov 26 09:34:39 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ERROR: ORA-15335 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: stopping process ARB0
Tue Nov 26 09:34:40 2024
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684

通过重新分区,并且kfed repair修复磁盘头操作之后,重新mount磁盘组报错

SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=2 incarn=0x73bec220
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
Tue Nov 26 11:48:22 2024
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 83 for pid 27, osid 15781
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
Tue Nov 26 11:48:22 2024
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Tue Nov 26 11:48:22 2024
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
NOTE: recovery initiating offline of disk 8 group 2 (*)
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Tue Nov 26 11:48:23 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]

由于客户执行了oracleasm deletedisk,根据经验确认该操作是对asm磁盘头的前1M数据进行了清空,而客户这个asm刚好是drop disk触发了rebalance操作的时候干掉磁盘的,基于这样的情况,直接通过修复磁盘1M数据并且mount磁盘组继续使用该磁盘组的概率不大.因此处理建议:
1. 直接恢复出来该磁盘组数据然后打开该库
2. 直接提取客户需要的核心表数据
有过客户有类似操作是asmlib重新创建了磁盘信息恢复:分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例
删除分区信息之后数据库恢复案例:删除分区 oracle asm disk 恢复

发表在 Oracle ASM, Oracle备份恢复 | 标签为 , , | 评论关闭

Kylin Linux 安装19c

一直的观点,对于oracle没有认证的操作系统,不建议客户在生产环境中实施,但是随着信创的推进,有一些客户需要在麒麟(Kylin Linux)系统v10安装oracle 19c,通过安装过程发现基本上还redhat 8 安装19c一致
操作系统版本

[root@xifenfei opt]# uname -a
Linux xifenfei 4.19.90-52.15.v2207.ky10.x86_64 #1 SMP Thu Nov 24 21:50:03 CST 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@xifenfei opt]# cat /etc/os-release 
NAME="Kylin Linux Advanced Server"
VERSION="V10 (Lance)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Lance)"
ANSI_COLOR="0;31"

需要安装的包

yum install -y bc  \
binutils \
elfutils-libelf \
elfutils-libelf-devel \
fontconfig-devel \
glibc \
glibc-devel \
ksh \
libaio \
libaio-devel \
libXrender \
libX11 \
libXau \
libXi \
libXtst \
libgcc \
libnsl \
librdmacm \
libstdc++ \
libstdc++-devel \
libxcb \
libibverbs \
make \
policycoreutils \
policycoreutils-python-utils \
smartmontools \
sysstat 

其他操作系统配置基本上和redhat 8 安装19c一致,这里忽略
在打patch的过程会遭遇以下错误(缺少fuser命令)

[oracle@xifenfei 36912597]$ opatch apply
Oracle Interim Patch Installer version 12.2.0.1.44
Copyright (c) 2024, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/19c/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/19c/db_1/oraInst.loc
OPatch version    : 12.2.0.1.44
OUI version       : 12.2.0.7.0
Log file location : /u01/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-11-20_13-06-19PM_1.log

Verifying environment and performing prerequisite checks...
Prerequisite check "CheckSystemCommandAvailable" failed.
The details are:
Missing command :fuser
UtilSession failed: 
Prerequisite check "CheckSystemCommandAvailable" failed.
Log file location: /u01/app/oracle/product/19c/db_1/cfgtoollogs/opatch/opatch2024-11-20_13-06-19PM_1.log

OPatch failed with error code 73

需要安装下psmisc包(在一般默认的redhat,oracle linux系统中已经安装)

yum install -y psmisc
发表在 Oracle安装升级 | 标签为 | 评论关闭

ORA-600 krse_arc_complete.4

11.2.0.4版本数据库对于虚拟化环境中的备库进行克隆,然后尝试failover方式激活备库,结果遭遇ORA-600 krse_arc_complete.4错误

SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
*
ERROR at line 1:
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/u01/app/oracle/oradata/hisdb/system.256.975233377'

alert日志显示

Wed Oct 23 21:45:46 2024
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (hisdb)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
Wed Oct 23 21:45:46 2024
ARC0: Detected ARCH process failure
ARC0: STARTING ARCH PROCESSES
Wed Oct 23 21:45:46 2024
ARC3 started with pid=22, OS id=28848 
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Beginning Standby Crash Recovery.
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Errors in file /u01/app/oracle/diag/rdbms/hisdbdg/hisdb/trace/hisdb_arc0_28695.trc  (incident=528155):
ORA-00600: internal error code, arguments: [krse_arc_complete.4], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/hisdbdg/hisdb/incident/incdir_528155/hisdb_arc0_28695_i528155.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Master archival failure: 600
Wed Oct 23 21:45:47 2024
Deleted Oracle managed file /fra/fast_recovery_area/HISDBDG/archivelog/2024_10_23/o1_mf_2_158418_mkkzjbqt_.arc
Warning: Datafile 1 (/u01/app/oracle/oradata/hisdb/system.256.975233377) is infinitely media recovery fuzzy
Standby database will not open with this datafile online!
Standby Crash Recovery aborted due to error 10554.
Errors in file /u01/app/oracle/diag/rdbms/hisdbdg/hisdb/trace/hisdb_ora_28842.trc:
ORA-10554: Media recovery failed to bring datafile 1 to a consistent point
ORA-01110: data file 1: '/u01/app/oracle/oradata/hisdb/system.256.975233377'
Completed Standby Crash Recovery.
Wed Oct 23 21:45:47 2024
Errors in file /u01/app/oracle/diag/rdbms/hisdbdg/hisdb/trace/hisdb_arc2_28840.trc  (incident=528172):
ORA-00600: internal error code, arguments: [krse_arc_complete.4], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/hisdbdg/hisdb/incident/incdir_528172/hisdb_arc2_28840_i528172.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Master archival failure: 600
ORA-1196 signalled during: ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE...
ARC3: Detected ARCH process failure

查询mos 发现是尝试应用standby log,并且arc进程尝试对其进行归档,发现归档失败从而报该错误,可以尝试对standby log进行clear,然后再激活备库

SQL> select group#,status from v$standby_log;

          GROUP# STATUS
---------------- ----------
              11 UNASSIGNED
              12 UNASSIGNED
              13 UNASSIGNED
              14 ACTIVE
              15 UNASSIGNED
              16 UNASSIGNED
              17 UNASSIGNED
              18 UNASSIGNED

8 rows selected.

SQL> alter database clear unarchived logfile group 14;

Database altered.

SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;

Database altered.

Wed Oct 23 21:55:12 2024
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE
ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (hisdb)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Standby terminal recovery start SCN: 35738445353
RESETLOGS after incomplete recovery UNTIL CHANGE 35738178180
Using STANDBY_ARCHIVE_DEST parameter default value as USE_DB_RECOVERY_FILE_DEST
Resetting resetlogs activation ID 1887849281 (0x70864b41)
Online log /u01/app/oracle/oradata/hisdb/group_5.272.976991793: Thread 1 Group 5 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_5.2338.976991793: Thread 1 Group 5 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_6.273.976991805: Thread 1 Group 6 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_6.2339.976991805: Thread 1 Group 6 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_7.274.976991825: Thread 2 Group 7 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_7.2336.976991825: Thread 2 Group 7 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_8.275.976991863: Thread 2 Group 8 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_8.2337.976991863: Thread 2 Group 8 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_9.276.976991877: Thread 1 Group 9 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_9.2334.976991877: Thread 1 Group 9 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_10.277.976991891: Thread 2 Group 10 was previously cleared
Online log /u01/app/oracle/oradata/hisdb/group_10.2333.976991893: Thread 2 Group 10 was previously cleared
Standby became primary SCN: 35738445352
Wed Oct 23 21:55:21 2024
Setting recovery target incarnation to 3
ACTIVATE STANDBY: Complete - Database mounted as primary
Completed: ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE

参考:Activate Standby Database failed with ORA-00600: [krse_arc_complete.4] (Doc ID 2409336.1)

发表在 Data Guard | 标签为 , | 评论关闭