标签归档:ORA-600 4137

虚拟机故障引起ORA-00310 ORA-00334故障处理

有客户由于硬件底层问题,导致运行在虚拟机环境中的oracle数据库突然爆大量错误

Reread (file 5, block 2371528) found same corrupt data (no logical check)
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_j000_10927.trc  (incident=397049):
ORA-01578: ORACLE data block corrupted (file # 5, block # 2371528)
ORA-01110: data file 5: '/home/oracle/app/oradata/users01.dbf'

Wed Apr 02 23:10:24 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_j000_10927.trc  (incident=397050):
ORA-00600: internal error code, arguments: [5400], [], [], [], [], [], [], [], [], [], [], []

Wed Apr 02 23:15:29 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_11605.trc  (incident=397075):
ORA-00600: internal error code, arguments: [ktbdchk1: bad dscn], [], [], [], [], [], [], [], [], [], [], []

Wed Apr 02 23:20:32 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_11530.trc  (incident=397034):
ORA-00600: internal error code, arguments: [25027], [6], [196610], [], [], [], [], [], [], [], [], []

Wed Apr 02 23:20:52 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_11528.trc  (incident=397027):
ORA-00600: internal error code, arguments: [ktspfpblk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

Wed Apr 02 23:22:53 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_11609.trc  (incident=397082):
ORA-00600: internal error code, arguments: [6002], [6], [189], [1], [0], [], [], [], [], [], [], []

Wed Apr 02 23:26:41 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_m000_11966.trc  (incident=397035):
ORA-00600: internal error code, arguments: [dbgrmblur_update_range_1], [11], [6], [], [], [], [], [], [], [], [], []

Wed Apr 02 23:31:47 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_j000_10927.trc:
ORA-12012: error on auto execute of job "SYS"."ORA$AT_SA_SPC_SY_49685"
ORA-08102: index key not found, obj# 39, file 1, block 55190 (2)

Thu Apr 03 00:15:18 2025
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x8] [PC:0xB9EC41, ksuloget()+421] [flags: 0x0, count: 1]
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_m000_12633.trc  (incident=400879):
ORA-07445: exception encountered:core dump [ksuloget()+421][SIGSEGV][ADDR:0x8][PC:0xB9EC41][Address not mapped to object]

Thu Apr 03 00:15:23 2025
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_pmon_4097.trc  (incident=396817):
ORA-00600: internal error code, arguments: [1100], [0x2E3947E78], [0x2E3947E78], [], [], [], [], [], [], [], [], []

数据库crash掉之后,处理好硬件环境和虚拟机启动之后,数据库直接启动失败,报ORA-01172 ORA-01151

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 29239 KB redo, 4020 data blocks need recovery
Started redo application at
 Thread 1: logseq 211603, block 9107
Recovery of Online Redo Log: Thread 1 Group 4 Seq 211603 Reading mem 0
  Mem# 0: /home/oracle/app/oradata/orcl/redo04.log
  Mem# 1: /home/oracle/app/oradata/orcl/redo041.log
Hex dump of (file 2, block 4835) in trace file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_19174.trc
Reading datafile '/home/oracle/app/oradata/orcl/sysaux01.dbf' for corruption at rdba: 0x008012e3 (file 2, block 4835)
Reread (file 2, block 4835) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 4835 OF FILE 2
Aborting crash recovery due to error 1172
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_19174.trc:
ORA-01172: recovery of thread 1 stuck at block 4835 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_19174.trc:
ORA-01172: recovery of thread 1 stuck at block 4835 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

然后再次尝试重启提示ORA-01113 ORA-01110

Fri Apr 04 09:34:36 2025
ALTER DATABASE OPEN
Errors in file /home/oracle/app/diag/rdbms/orcl/orcl/trace/orcl_ora_4076.trc:
ORA-01113: file 5 needs media recovery
ORA-01110: data file 5: '/home/oracle/app/oradata/users01.dbf'
ORA-1113 signalled during: ALTER DATABASE OPEN...

可以自行尝试了各种恢复,比如using backup controlfile,until cancel,rectl等操作,数据库均为open成功,基本上都是卡在类似如下报ORA-00310 ORA-00334错

Sat Apr 05 10:17:34 2025
ALTER DATABASE RECOVER  database using backup controlfile   
Media Recovery Start
 started logmerger process
Sat Apr 05 10:17:34 2025
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 6 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 7 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 8 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 9 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 10 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 11 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 12 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 13 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 14 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 15 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
Parallel Media Recovery started with 28 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile   ...
Sat Apr 05 10:17:59 2025
ALTER DATABASE RECOVER    LOGFILE '/home/oradata/redo02.log'  
Media Recovery Log /home/oradata/redo02.log
Sat Apr 05 10:17:59 2025
Errors with log /home/oradata/redo02.log
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_pr00_12141.trc:
ORA-00310: archived log contains sequence 211550; sequence 211603 required
ORA-00334: archived log: '/home/oradata/redo02.log'
ORA-310 signalled during: ALTER DATABASE RECOVER    LOGFILE '/home/oradata/redo02.log'  ...
ALTER DATABASE RECOVER CANCEL 
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 

基于上述情况,数据库由于底层异常,导致所需要的redo和实际存在的redo文件内容不匹配,只能屏蔽一致性强制打开库

SQL> alter database open resetlogs ;
alter database open resetlogs 
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [1685409503], [0], [1685415469], [12583040], []
ORA-00600: internal error code, arguments: [2662], [0], [1685409502], [0], [1685415469], [12583040], []
ORA-01092: ORACLE instance terminated. Disconnection force
ORA-00600: internal error code, arguments: [2662], [0], [1685409498], [0], [1685415469], [12583040], []
Process ID: 10637
Session ID: 645 Serial number: 7

ORA-600 2662这个错误比较常见,通过修改数据库scn,进行规避然后尝试打开库

Sat Apr 05 10:31:45 2025
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 1685409495
Resetting resetlogs activation ID 1725417463 (0x66d7c7f7)
Sat Apr 05 10:31:46 2025
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 1685409498, threshold SCN value is 0
Sat Apr 05 10:31:46 2025
Assigning activation ID 1725412798 (0x66d7b5be)
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: /home/oradata/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Apr 05 10:31:46 2025
SMON: enabling cache recovery
Undo initialization finished serial:0 start:61632504 end:61632514 diff:10 (0 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
SMON: enabling tx recovery
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is AL32UTF8
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8145):
ORA-00600: internal error code, arguments: [4137], [9.1.436887], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8145/oorcl_smon_22927_i8145.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Stopping background process MMNL
ORACLE Instance oorcl (pid = 17) - Error 600 encountered while recovering transaction (9, 1).
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc:
ORA-00600: internal error code, arguments: [4137], [9.1.436887], [0], [0], [], [], [], [], [], [], [], []
Sat Apr 05 10:31:46 2025
Sweep [inc][8145]: completed
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8146):
ORA-00600: internal error code, arguments: [4137], [9.1.436887], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8146/oorcl_smon_22927_i8146.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sat Apr 05 10:31:46 2025
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_p054_2643.trc  (incident=8625):
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8625/oorcl_p054_2643_i8625.trc
Sat Apr 05 10:31:46 2025
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_p034_2603.trc  (incident=8465):
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8465/oorcl_p034_2603_i8465.trc
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open resetlogs
Sat Apr 05 10:31:48 2025
Starting background process CJQ0
Sat Apr 05 10:31:48 2025
CJQ0 started with pid=80, OS id=2852 
SMON: Restarting fast_start parallel rollback
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8147):
ORA-00600: internal error code, arguments: [4137], [9.1.436887], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8147/oorcl_smon_22927_i8147.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sat Apr 05 10:31:50 2025
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_p000_2535.trc  (incident=8169):
ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8169/oorcl_p000_2535_i8169.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance oorcl (pid = 17) - Error 600 encountered while recovering transaction (9, 1).
Block recovery from logseq 1, block 19 to scn 2147483682
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1 Reading mem 0
  Mem# 0: /home/oradata/redo02.log
Block recovery completed at rba 1.734.16, scn 0.2147483683
Block recovery from logseq 1, block 404 to scn 2147483682
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1 Reading mem 0
  Mem# 0: /home/oradata/redo02.log
Block recovery completed at rba 1.734.16, scn 0.2147483683
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8148):
ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8148/oorcl_smon_22927_i8148.trc
Sat Apr 05 10:31:50 2025
Sweep [inc][8147]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8149):
ORA-00600: internal error code, arguments: [4137], [10.28.1201778], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8149/oorcl_smon_22927_i8149.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance oorcl (pid = 17) - Error 600 encountered while recovering transaction (10, 28).
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc:
ORA-00600: internal error code, arguments: [4137], [10.28.1201778], [0], [0], [], [], [], [], [], [], [], []
Sat Apr 05 10:31:50 2025
Sweep [inc][8149]: completed
Checker run found 1 new persistent data failures
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8150):
ORA-00600: internal error code, arguments: [4137], [10.28.1201778], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oorcl/oorcl/incident/incdir_8150/oorcl_smon_22927_i8150.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance oorcl (pid = 17) - Error 600 encountered while recovering transaction (10, 28).
Errors in file /u01/app/oracle/diag/rdbms/oorcl/oorcl/trace/oorcl_smon_22927.trc  (incident=8151):
ORA-00600: internal error code, arguments: [4137], [10.28.1201778], [0], [0], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sat Apr 05 10:31:51 2025
Sweep [inc][8150]: completed
ORACLE Instance oorcl (pid = 17) - Error 600 encountered while recovering transaction (10, 28).

虽然数据库open成功,但是有ORA-600 4137/ORA-600 kturbleurec1/ORA-600 4198等错误,但是这里比较明显的undo有问题,对于异常undo进行处理,然后逻辑导出数据,导入新库完成本次恢复任务

发表在 Oracle备份恢复 | 标签为 , , , , , , , , , , | 留下评论

存储故障后oracle报—ORA-01122/ORA-01207故障处理

客户存储异常,通过硬件恢复解决存储故障之后,oracle数据库无法正常启动(存储cache丢失),尝试recover数据库报ORA-00283 ORA-01122 ORA-01110 ORA-01207错误
以前处理过比较类似的存储故障case:
又一起存储故障导致ORA-00333 ORA-00312恢复
存储故障,强制拉库报ORA-600 kcbzib_kcrsds_1处理

SQL> recover database until cancel;
ORA-00283: 恢复会话因错误而取消
ORA-01122: 数据库文件 536 验证失败
ORA-01110: 数据文件 536: '+DATA/orcl/dt_img_dat511.ora'
ORA-01207: 文件比控制文件更新 - 旧的控制文件
Sun May 05 00:09:03 2024
ALTER DATABASE RECOVER  database until cancel  
Media Recovery Start
 started logmerger process
Sun May 05 00:09:10 2024
SUCCESS: diskgroup FRA was mounted
Sun May 05 00:09:10 2024
NOTE: dependency between database orcl and diskgroup resource ora.FRA.dg is established
Sun May 05 00:09:14 2024
WARNING! Recovering data file 1 from a fuzzy backup. It might be an online
backup taken without entering the begin backup command.
Media Recovery failed with error 1122
Slave exiting with ORA-283 exception
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_pr00_8048.trc:
ORA-00283: 恢复会话因错误而取消
ORA-01122: 数据库文件 536 验证失败
ORA-01110: 数据文件 536: '+DATA/orcl/dt_img_dat511.ora'
ORA-01207: 文件比控制文件更新 - 旧的控制文件
Sun May 05 00:09:16 2024
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...

using backup controlfile进行恢复

SQL> recover database using backup controlfile until cancel;
ORA-00279: 更改 18646239951 (在 04/25/2024 17:14:50 生成) 对于线程 1 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003886.199435.1167240505
ORA-00280: 更改 18646239951 (用于线程 1) 在序列 #1003886 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: 更改 18646239951 (在 04/25/2024 17:11:40 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677876.199531.1167239807
ORA-00280: 更改 18646239951 (用于线程 2) 在序列 #677876 中


ORA-00279: 更改 18646255791 (在 04/25/2024 17:16:46 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677877.199472.1167240099
ORA-00280: 更改 18646255791 (用于线程 2) 在序列 #677877 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677876.199531.1167239807'


ORA-00279: 更改 18646295647 (在 04/25/2024 17:21:38 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677878.199379.1167240623
ORA-00280: 更改 18646295647 (用于线程 2) 在序列 #677878 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_2_seq_677877.199472.1167240099'


ORA-00279: 更改 18646331784 (在 04/25/2024 17:28:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507
ORA-00280: 更改 18646331784 (用于线程 1) 在序列 #1003887 中
ORA-00278: 此恢复不再需要日志文件
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003886.199435.1167240505'


ORA-00308: 无法打开归档日志
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507'
ORA-17503: ksfdopn: 2 未能打开文件
+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507
ORA-15012: ASM file
'+FRA/orcl/archivelog/2024_04_25/thread_1_seq_1003887.199320.1167241507' does not exist


ORA-10879: error signaled in parallel recovery slave
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: '+DATA/orcl/system01.dbf'

通过分析,确认由于cache丢失导致thread_1_seq_1003887这个日志丢失(而且redo已经被覆盖)
20240506-2


20240506-1

数据库无法通过正常recover的思路解决.通过屏蔽一致性,强制打开数据库,alert日志报ORA-600 2662错误

Sat May 04 17:23:00 2024
Redo thread 2 internally disabled at seq 1 (CKPT)
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 2 added for thread 2 sequence 1 ID 0x0 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc  (incident=47066):
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc:
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_ora_3684.trc:
ORA-00600: ??????, ??: [2662], [4], [1466533588], [4], [1466584862], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3684): terminating the instance due to error 600
Instance terminated by USER, pid = 3684
ORA-1092 signalled during: alter database open resetlogs...

通过修改数据库scn,open数据库报ORA-600 4137

Sun May 05 00:12:41 2024
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open resetlogs
Sun May 05 00:12:56 2024
Trace dumping is performing id=[cdmp_20240505001256]
Sun May 05 00:12:56 2024
ORACLE Instance orcl1 (pid = 22) - Error 600 encountered while recovering transaction (28, 21).
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl1\trace\orcl1_smon_5896.trc:
ORA-00600: ??????, ??: [4137], [28.21.42965783], [0], [0], [], [], [], [], [], [], [], []

这个错误比较明显,由于28号回滚段异常导致,对异常回滚段进行处理,重建undo,数据库恢复主要工作完成

发表在 Oracle备份恢复 | 标签为 , , , | 评论关闭

云主机快照之后Oracle无法正常启动处理

某客户数据库放在x云上面,需要对数据库盘进行扩容,在扩容之前对该盘做了快照,结果没有想到悲剧发生了

[root@xifenfei ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        99G   64G   31G  68% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  720K   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/vdb        2.0T  1.2T  910G  56% /www/xifenfei
tmpfs           3.2G     0  3.2G   0% /run/user/1004
tmpfs           3.2G     0  3.2G   0% /run/user/0

如上显示,客户的数据文件都放在/dev/vdb中了,但是很不幸,redo文件放在/data中(也就是vda磁盘组中),没有被做快照,结果客户还原vdb快照之后,发现现象如下

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_CHANGE# CHECKPOINT_T     LAST_CHANGE#          ROW_NUM
-------------- ------------------ ------------ ---------------- ----------------
ONLINE                69632585947 04-JUL-22                                   38
SYSTEM                69632585947 04-JUL-22                                    2

SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS         CHECKPOINT_TIME                          FUZZY  CHECKPOINT_CHANGE#          ROW_NUM
-------------- ---------------------------------------- ------ ------------------ ----------------
ONLINE         2022-07-04 09:03:24                      YES           69631105424               40

20220704230638


通过上述分析,该库相当数据文件和redo文件之间相差了一段时间数据,而且该库为非归档,基于这种情况,该库只能强制打开,在打开过程中遇到ORA-600 ktpridestroy2错误

SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
SMON: Restarting fast_start parallel rollback
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc  (incident=41257):
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_41257/orcl_smon_7332_i41257.trc
Starting background process QMNC
Mon Jul 04 16:31:44 2022
QMNC started with pid=36, OS id=7454 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc:
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7332): terminating the instance due to error 474
Instance terminated by SMON, pid = 7332

对应trace文件

Dump continued from file: /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7332.trc
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 41257 (ORA 600 [ktpridestroy2]) ========

*** 2022-07-04 16:31:44.261
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   7FFCD123FE98 ? 000000000 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedst()+34          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+2736  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksfdmp()+64          call     ksedmp()             000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgexPhaseII()+1764  call     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFCD123B998 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgexProcessError()  call     dbgexPhaseII()       7F3C5D15C6F0 ? 7F3C5A851598 ?
+2279                                              7FFCD1247C88 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbgeExecuteForError  call     dbgexProcessError()  7F3C5D15C6F0 ? 7F3C5A851598 ?
()+83                                              000000001 ? 000000000 ?
                                                   7FFC00000000 ? 000000000 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  7F3C5D15C6F0 ? 7F3C5A851598 ?
1615                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? 7F3C5A6C1228 ?
63                                                 000000258 ? 7F3C5A851598 ?
                                                   000000000 ? 000000000 ?
kgeadse()+383        call     dbkePostKGE_kgsf()   00A984C60 ? 7F3C5A6C1228 ?
                                                   000000258 ? 7F3C5A851598 ?
                                                   000000000 ? 000000000 ?
kgerinv_internal()+  call     kgeadse()            00A984C60 ? 7F3C5A6C1228 ?
45                                                 000000258 ? 000000000 ?
                                                   000000000 ? 000000000 ?
kgerinv()+33         call     kgerinv_internal()   00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000258 ? 000000000 ?
                                                   000000000 ?
kgeasnmierr()+143    call     kgerinv()            00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ktpridestroy()+912   call     kgeasnmierr()        00A984C60 ? 7F3C5A6C1228 ?
                                                   D124022000000000 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
ktprw1s()+527        call     ktpridestroy()       D124022000000000 ?
                                                   000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
ktprsched()+197      call     ktprw1s()            D124022000000000 ?
                                                   000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
kturRecoverUndoSegm  call     ktprsched()          D124022000000000 ?
ent()+1057                                         000000000 ? 1E7A1C2B0 ?
                                                   000000000 ? 1E0F02D40 ?
                                                   1EC6DA410 ?
kturRecoverActiveTx  call     kturRecoverUndoSegm  000000000 ? 000000000 ?
ns()+710                      ent()                000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktprbeg()+2506       call     kturRecoverActiveTx  000000004 ? 000000000 ?
                              ns()                 000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktmmon()+13588       call     ktprbeg()            000000000 ? 000000000 ?
                                                   000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ktmSmonMain()+201    call     ktmmon()             06002DEC0 ? 000000000 ?
                                                   000000027 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
ksbrdp()+923         call     ktmSmonMain()        06002DEC0 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opirip()+618         call     ksbrdp()             06002DEC0 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFCD124B658 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFCD124B658 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
opimai_real()+261    call     sou2o()              7FFCD124B630 ? 000000032 ?
                                                   000000004 ? 7FFCD124B658 ?
                                                   0D124FFFF ? 6200000005 ?
ssthrdmain()+209     call     opimai_real()        000000000 ? 7FFCD124B820 ?
                                                   000000004 ? 7FFCD124B658 ?
                                                   0D124FFFF ? 6200000005 ?
main()+196           call     ssthrdmain()         000000003 ? 7FFCD124B820 ?
                                                   000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
__libc_start_main()  call     main()               000000003 ? 7FFCD124B9C0 ?
+245                                               000000001 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
_start()+36          call     __libc_start_main()  0009C12F0 ? 000000001 ?
                                                   7FFCD124B9B8 ? 000000000 ?
                                                   0D124FFFF ? 6200000005 ?
--------------------- Binary Stack Dump ---------------------

通过分析确认该错误和并行恢复有关系,绕过该错误之后,再次尝试启动库报错为ORA-600 4137

Mon Jul 04 16:33:41 2022
SMON: enabling cache recovery
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc  (incident=42457):
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_42457/orcl_smon_7554_i42457.trc
Stopping background process MMNL
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (6, 11).
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_7554.trc:
ORA-00600: internal error code, arguments: [4137], [6.11.21484016], [0], [0], [], [], [], [], [], [], [], []

该错误比较常见,一般是由于undo中有异常事务,对异常事务进行处理,数据库open成功,并顺利导入数据到新库中,完成本次数据恢复

发表在 Oracle备份恢复 | 标签为 , , , , | 评论关闭