标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-00742 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (103)
- 数据库 (1,702)
- DB2 (22)
- MySQL (74)
- Oracle (1,563)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (8)
- Oracle ASM (68)
- Oracle Bug (8)
- Oracle RAC (53)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (571)
- Oracle安装升级 (94)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (81)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- Oracle各种类型坏块说明和处理
- fio测试io,导致磁盘文件系统损坏故障恢复
- ORA-742 写丢失常见bug记录
- Oracle 19c 202501补丁(RUs+OJVM)
- 避免 19c 数据库性能问题需要考虑的事项 (Doc ID 3050476.1)
- Bug 21915719 Database hang or may fail to OPEN in 12c IBM AIX or HPUX Itanium – ORA-742, DEADLOCK or ORA-600 [kcrfrgv_nextlwn_scn] ORA-600 [krr_process_read_error_2]
- ORA-600 ktuPopDictI_1恢复
- impdp导入数据丢失sys授权问题分析
- impdp 创建index提示ORA-00942: table or view does not exist
- 数据泵导出 (expdp) 和导入 (impdp)工具性能降低分析参考
- 19c非归档数据库断电导致ORA-00742故障恢复
- Oracle 19c – 手动升级到 Non-CDB Oracle Database 19c 的完整核对清单
- sqlite数据库简单操作
- Oracle 暂定和恢复功能
- .pzpq扩展名勒索恢复
- Oracle read only用户—23ai新特性:只读用户
- 迁移awr快照数据到自定义表空间
- .hmallox加密mariadb/mysql数据库恢复
- 2025年首个故障恢复—ORA-600 kcbzib_kcrsds_1
- 第一例Oracle 21c恢复咨询
分类目录归档:ORA-xxxxx
ORA-00800: soft external error, arguments: [Set Priority Failed]
在一套19.14的linux 2节点rac库中,使用sqlplus启动数据库成功,但是alert日志中报ORA-00800: soft external error, arguments: [Set Priority Failed]错误.
2022-09-21T22:20:35.924251+08:00 Starting background process VKTM 2022-09-21T22:20:35.977936+08:00 Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_vktm_22653.trc (incident=880052): ORA-00800: soft external error, arguments: [Set Priority Failed], [VKTM], [Check traces and OS configuration], [Check Oracle document and MOS notes], [] Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_880052/orcl1_vktm_22653_i880052.trc 2022-09-21T22:20:35.980555+08:00 Error attempting to elevate VKTM's priority: no further priority changes will be attempted for this process VKTM started with pid=6, OS id=22653 Starting background process LMHB 2022-09-21T22:20:36.467831+08:00 Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lms0_22703_22708.trc (incident=920005): ORA-00800: soft external error, arguments: [Set Priority Failed], [LMS0], [Check traces and OS configuration], [Check Oracle document and MOS notes], [] Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_920005/orcl1_lms0_22703_22708_i920005.trc 2022-09-21T22:20:36.470535+08:00 Error attempting to elevate LMS0's priority: no further priority changes will be attempted for this process
错误提示比较明显,提升进程的优先级失败,通过操作系统命令观察发现确实进程优先级没有提升
[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 5656 TS 19 0 00:00:00 ora_vktm_orcl1 30838 RR 41 - 13:08:36 ora_vktm_+ASM1
重新使用srvctl启动数据库,优先级提升正常,alert日志中也无类似警告
[root@oradb01 ~]# ps -eo pid,class,pri,nice,time,args|grep vktm|grep -v grep 5716 RR 41 0 00:00:00 ora_vktm_orcl1 30838 RR 41 - 13:18:46 ora_vktm_+ASM1
这个问题一直困惑了很久,今天无意中在mos上发现了相关mos文档,具体参考:(DB50) Clusterware Fails to Start Because CSSD Cannot Get Real-Time Priority (Doc ID 2903663.1),由于 bug 34286265 and bug 34318125(Bug 34649727 Linux: ORA-800 / Set Priority / DB Performance Merge Patch for 19.17 – 34286265 34318125)
尽量不要使用sqlplus去启动数据库,而是选择使用srvctl,避免在rac环境中导致数据库后台关键进程优先级无法提升问题.
ORA-00257: archiver error的另外一种原因
今天遇到一个相对特殊的案例,拿出来和大家分享,数据库报错为不能归档ORA-00257
SQL> conn system/xxxxxx ERROR: ORA-00257: archiver error. Connect internal only, until freed. SQL> archive log list; Database log mode Archive Mode Automatic archival Enabled Archive destination /oracle/PRD/oraarch Oldest online log sequence 11479 Current log sequence 11481
alert日志报错
sapprddb1:oraprd 73> tail -f alert_prd.log Master archival failure: 19502 Master archival failure: 19502 Master archival failure: 19502 Master archival failure: 19502 Wed Oct 12 09:27:41 2022 Master archival failure: 19502 Master archival failure: 19502 Master archival failure: 19502 Wed Oct 12 09:27:41 2022 Master archival failure: 19502
该错误的含义
sapprddb1:oraprd 74>oerr ora 19502 19502, 00000, "write error on file \"%s\", block number %s (block size=%s)" // *Cause: write error on output file // *Action: check the file
从报错初步看是由于归档目录空间满了导致,查看发现归档空间剩余很多
sapprddb1:/oracle/PRD/sapdata1 # df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 99G 25G 70G 26% / udev 253G 240K 253G 1% /dev tmpfs 426G 72K 426G 1% /dev/shm /dev/sda3 388G 12G 357G 4% /backup /dev/mapper/SAPVG-oraclelv 197G 22G 165G 12% /oracle /dev/mapper/SAPVG-mirrlogA 20G 773M 18G 5% /oracle/PRD/mirrlogA /dev/mapper/SAPVG-mirrlogBlv 20G 773M 18G 5% /oracle/PRD/mirrlogB /dev/mapper/SAPVG-oraarchlv 180G 2.2G 169G 2% /oracle/PRD/oraarch <----剩余很多 /dev/mapper/SAPVG-origlogAlv 20G 894M 18G 5% /oracle/PRD/origlogA /dev/mapper/SAPVG-origlogBlv 20G 894M 18G 5% /oracle/PRD/origlogB /dev/mapper/SAPVG-sapdata1lv 591G 561G 0 100% /oracle/PRD/sapdata1 /dev/mapper/SAPVG-sapdata2lv 1.4T 1.4T 0 100% /oracle/PRD/sapdata2 /dev/mapper/SAPVG-sapdata3lv 788G 748G 0 100% /oracle/PRD/sapdata3 /dev/mapper/SAPVG-sapdata4lv 788G 748G 0 100% /oracle/PRD/sapdata4 sapprd:/sapmnt/PRD 30G 12G 17G 42% /sapmnt/PRD sapprd:/usr/sap/trans 105G 1005M 99G 1% /usr/sap/trans /dev/mapper/VGSAP2-lvsapdata5 1.5T 340G 1.1T 25% /oracle/PRD/sapdata5 /dev/mapper/VGSAP2-lvsapdata6 1.5T 706G 696G 51% /oracle/PRD/sapdata6
尝试人工归档
SQL> alter system archive log current 2 / alter system archive log current * ERROR at line 1: ORA-19502: write error on file "/oracle/PRD/sapdata1/cntrl/cntrlPRD.dbf", block number 4837 (block size=16384) ORA-27061: waiting for async I/Os failed Linux-x86_64 Error: 28: No space left on device Additional information: -1 Additional information: 442368
到这里基本上明确了,由于控制文件所在的分区磁盘使用100%,导致归档的时候无法写如记录到控制文件从而导致数据库报ORA-00257错误,对/oracle/PRD/sapdata1中的某个文件进行稍微收缩,数据库恢复正常
ORA-600 3417故障处理
数据库突然报ORA-600 3417错误
Mon Sep 26 06:42:51 2022 Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc (incident=176185): ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\incident\incdir_176185\orcl2_lgwr_7984_i176185.trc Mon Sep 26 06:42:54 2022 Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc: ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] LGWR (ospid: 7984): terminating the instance due to error 470
节点2异常之后,节点1由于跨节点实例恢复导致异常
Mon Sep 26 06:44:26 2022 Instance recovery: looking for dead threads Beginning instance recovery of 1 threads Submitted all GCS remote-cache requests Post SMON to start 1st pass IR Fix write in gcs resources Reconfiguration complete parallel recovery started with 31 processes Started redo scan Completed redo scan read 887 KB redo, 348 data blocks need recovery Started redo application at Thread 2: logseq 9907, block 1980 Recovery of Online Redo Log: Thread 2 Group 3 Seq 9907 Reading mem 0 Mem# 0: +DATA/orcl/onlinelog/group_3.265.1078882689 Mem# 1: +OCR/orcl/onlinelog/group_3.259.1078882689 Completed redo application of 0.32MB Completed instance recovery at Thread 2: logseq 9907, block 3755, scn 231951271 338 data blocks read, 348 data blocks written, 887 redo k-bytes read Mon Sep 26 06:44:36 2022 Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc (incident=208205): ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] Mon Sep 26 06:44:38 2022 Reconfiguration started (old inc 14, new inc 16) List of instances: 1 2 (myinst: 1) Global Resource Directory frozen Communication channels reestablished Master broadcasted resource hash value bitmaps Non-local Process blocks cleaned out Mon Sep 26 06:44:38 2022 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Mon Sep 26 06:44:38 2022 Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Mon Sep 26 06:44:38 2022 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Mon Sep 26 06:44:38 2022 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived Set master node info Submitted all remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious All grantable enqueues granted ORA-600 occurred during recovery, instance will be terminated Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc: ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] System state dump requested by (instance=1,osid=7972 (SMON)),summary=[abnormal instance termination]. SMON (ospid: 7972): terminating the instance due to error 600 Mon Sep 26 06:44:43 2022 ORA-1092 : opitsk aborting process Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_diag_6956_20220926064442.trc: ORA-00601: ?????? Mon Sep 26 06:44:46 2022 opiodr aborting process unknown ospid (6688) as a result of ORA-1092
再次重启实例无法正常启动报ORA-600 3417错误
Completed: ALTER DATABASE MOUNT Mon Sep 26 08:08:34 2022 alter database open This instance was first to open Picked broadcast on commit scheme to generate SCNs Mon Sep 26 08:08:36 2022 LGWR: STARTING ARCH PROCESSES Mon Sep 26 08:08:36 2022 ARC0 started with pid=39, OS id=5004 ARC0: Archival started LGWR: STARTING ARCH PROCESSES COMPLETE ARC0: STARTING ARCH PROCESSES Mon Sep 26 08:08:37 2022 ARC1 started with pid=38, OS id=3568 Mon Sep 26 08:08:37 2022 ARC2 started with pid=41, OS id=3308 Mon Sep 26 08:08:37 2022 ARC3 started with pid=42, OS id=8180 Mon Sep 26 08:08:37 2022 ARC4 started with pid=43, OS id=7768 Mon Sep 26 08:08:37 2022 ARC5 started with pid=44, OS id=4628 Mon Sep 26 08:08:37 2022 ARC6 started with pid=45, OS id=6920 Mon Sep 26 08:08:37 2022 ARC7 started with pid=46, OS id=7960 ARC1: Archival started ARC2: Archival started ARC3: Archival started ARC4: Archival started ARC5: Archival started ARC6: Archival started ARC1: Becoming the 'no FAL' ARCH ARC1: Becoming the 'no SRL' ARCH ARC2: Becoming the heartbeat ARCH Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc(incident=400186): ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] ARC7: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Mon Sep 26 08:08:39 2022 Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc: ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], [] LGWR (ospid: 7924): terminating the instance due to error 470 Mon Sep 26 08:08:44 2022 ORA-1092 : opitsk aborting process
故障比较明显,数据库两个节点故障之后,启动其中一个节点,已经完成了实例恢复,但是无法open,通过10046进行跟踪发下你open过程没有执行任何语句直接导致win服务异常,然后终止
通过进一步分析确认是redo组异常
SQL> select group#,sequence#,status,thread# from v$log; GROUP# SEQUENCE# STATUS THREAD# ---------- ---------- ---------------- ---------- 1 10837 CURRENT 1 2 0 UNUSED 1 3 9907 CURRENT 2 4 0 UNUSED 2
进一步查询数据文件是否正常
SQL> set pages 10000 SQL> set numw 16 SQL> SELECT status, 2 checkpoint_change#, 3 checkpoint_time,last_change#, 4 count(*) ROW_NUM 5 FROM v$datafile 6 GROUP BY status, checkpoint_change#, checkpoint_time,last_change# 7 ORDER BY status, checkpoint_change#, checkpoint_time; STATUS CHECKPOINT_CHANGE# CHECKPOINT_TIME LAST_CHANGE# ROW_NUM ------- ------------------ ------------------- ---------------- ---------------- ONLINE 231971314 2022-09-26 06:44:37 231971314 44 SYSTEM 231971314 2022-09-26 06:44:37 231971314 1 SQL> set numw 16 SQL> col CHECKPOINT_TIME for a40 SQL> set lines 150 SQL> set pages 1000 SQL> SELECT status, 2 to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,chec kpoint_change#, 3 count(*) ROW_NUM 4 FROM v$datafile_header 5 GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh 24:mi:ss'),fuzzy 6 ORDER BY status, checkpoint_change#, checkpoint_time; STATUS CHECKPOINT_TIME FUZ CHECKPOINT_CHANGE# ROW_NUM ------- ---------------------------------------- --- ------------------ -------- -------- ONLINE 2022-09-26 06:44:37 NO 231971314 45
基于上述情况,对于这个库,可以直接open库,实现数据0丢失