Tag archive: O/S-Error: (OS 23)

ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR]

While running, the database reported O/S-Error: (OS 23) Data error (cyclic redundancy check):

Thu Jan 30 22:00:02 2025
Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Thu Jan 30 22:00:04 2025
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_12576.trc:
ORA-12012: error on auto execute of job 155962
ORA-01115: IO error reading block from file  (block # )
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-27070: async read/write failed
OSD-04006: ReadFile() failure, unable to read from file
O/S-Error: (OS 23) Data error (cyclic redundancy check).
ORA-06512: at "SYS.DBMS_STATS", line 25836
ORA-06512: at "SYS.DBMS_STATS", line 26171
End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"
Fri Jan 31 02:00:00 2025
Clearing Resource Manager plan via parameter
Fri Jan 31 08:15:46 2025
Thread 1 advanced to log sequence 4420 (LGWR switch)
  Current log# 1 seq# 4420 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Fri Jan 31 10:53:57 2025
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_1140.trc:
Fri Jan 31 10:53:57 2025
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_7916.trc:
ORA-27102: out of memory
OSD-00043: additional error information
O/S-Error: (OS 1455) The paging file is too small for this operation to complete.
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_cjq0_1140.trc:
Fri Jan 31 10:54:03 2025
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x18] [PC:0xB778B02, clsdcxini()+90]
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726<-dbkePostKGE_kgsf()+75
<-kgeade()+560<-kgerev()+125<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869<-sss_xcpt_EvalFilter()+174
<-.1.4_5+59<-00007FFD0245F306<-00007FFD024735AF<-00007FFD023D4AAF<-00007FFD0247231E<-clsdcxini()+90
<-clsdinit()+124<-ksdnfy()+225<-kscnfy()+778<-opirip()+86<-opidrv()+909<-sou2o()+98<-opimai_real()+299
<-opimai()+191<-BackgroundThreadStart()+693<-00007FFD020E7E94<-00007FFD02437AD1
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_j000_9920.trc  (incident=39621):
ORA-07445: exception encountered: core dump [clsdcxini()+90][ACCESS_VIOLATION][ADDR:0x18][PC:0xB778B02][UNABLE_TO_READ]
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39621\orcl_j000_9920_i39621.trc

After that the database could no longer start normally and failed with Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]:

Wed Feb 05 09:43:51 2025
Sweep [inc][39621]: completed
Successful mount of redo thread 1, with mount id 1720066005
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Completed redo scan
 read 140 KB redo, 62 data blocks need recovery
Started redo application at
 Thread 1: logseq 4420, block 42375
Wed Feb 05 09:44:00 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4420 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Completed redo application of 0.09MB
Completed crash recovery at
 Thread 1: logseq 4420, block 42656, scn 94456019
 62 data blocks read, 62 data blocks written, 140 redo k-bytes read
Wed Feb 05 09:44:01 2025
Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726
<-dbkePostKGE_kgsf()+75<-kgeade()+560<-kgerev()+125<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869
<-sss_xcpt_EvalFilter()+174<-.1.4_5+59<-00007FFD0245F306<-00007FFD024735AF<-00007FFD023D4AAF
<-00007FFD0247231E<-expgod()+43<-xtyopncb()+241<-qctcopn()+613<-qctcopn()+392<-qctcpqb()+290
<-qctcpqbl()+52<-xtydrv()+148<-opitca()+1091<-kksLoadChild()+9008<-kxsGetRuntimeLock()+2320
<-kksfbc()+15225<-kkspsc0()+2117<-kksParseCursor()+181<-opiosq0()+2538<-opiosq()+23<-opiodr()+1662
<-rpidrus()+862<-rpidru()+154<-rpiswu2()+2757<-rpidrv()+6105<-rpisplu()+1607<-kqldFixedTableLoadCols()+345
<-kqldcor()+2534<-kglslod()+352<-kqlslod()+52<-PGOSF455_kqlsublod()+125<-kqllod()+7284<-kglobld()+1354
<-kglobpn()+1900<-kglpim()+336<-qcdlgtd()+260<-qcsfplob()+166<-qcsprfro()+903<-qcsprfro_tree()
+292<-qcsprfro_tree()+373<-qcspafq()+96
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_mmon_15468.trc  (incident=39749):
ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR] [] [PC:0x2C9C015] [] []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39749\orcl_mmon_15468_i39749.trc
Wed Feb 05 09:44:02 2025
Thread 1 advanced to log sequence 4421 (thread open)
Thread 1 opened at log sequence 4421
  Current log# 2 seq# 4421 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Feb 05 09:44:02 2025
SMON: enabling cache recovery
Wed Feb 05 09:44:11 2025
Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_9308.trc  (incident=39781):
ORA-07445: exception encountered: core dump [expgod()+43] [IN_PAGE_ERROR] [] [PC:0x2C9C015] [] []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_39781\orcl_ora_9308_i39781.trc
Wed Feb 05 09:44:19 2025
PMON (ospid: 12376): terminating the instance due to error 397
Instance terminated by PMON, pid = 12376

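Given the Exception [type: IN_PAGE_ERROR, ] [] [PC:0x2C9C015, expgod()+43] error above, the first suspicion is that damage in the underlying storage has corrupted data blocks, so the next step is to run dbv against the file and see whether it reports errors.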
[Screenshot: dbv-system]


Checking the system log confirms the fault.

Attempting to copy the file also fails.

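At this point it is fairly clear that the cause lies in the underlying storage. Before this problem can be resolved, the file system has to be dealt with first; data recovery is then carried out on the datafiles salvaged from it.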


O/S-Error: (OS 23) Data error (cyclic redundancy check): troubleshooting

A customer's disk developed bad sectors, and access to a datafile began failing with ORA-27070, OSD-04016, O/S-Error and related errors.
[Screenshot: OSD-04016]


Try reading datafile 88 with RMAN:

RMAN> backup datafile 88 format 'e:\rman\%d_%T_%I.%s%p';

Starting backup at 05-JUN-24
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=2246 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00088 name=E:\APP\ADMINISTRATOR\ORADATA\XFF\XIFENFEI.ORA
channel ORA_DISK_1: starting piece 1 at 05-JUN-24
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 06/05/2024 18:33:43
ORA-19501: read error on file "E:\APP\ADMINISTRATOR\ORADATA\XFF\XIFENFEI.ORA", block number 322944 (block size=8192)
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 23) Data error (cyclic redundancy check).
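
To relate the block number in ORA-19501 to a position inside the datafile, the block number can be converted to a byte offset. This is only a minimal sketch: the helper name `block_byte_range` is made up for illustration, and it assumes that block N starts at byte N × block size with no extra platform-specific offset in front of the file.

```python
def block_byte_range(block_number, block_size=8192):
    """Return the (first, last) byte offsets of an Oracle block in its datafile.

    Assumption (not stated in the post): block N starts at byte
    N * block_size, block 0 being the OS header block.
    """
    start = block_number * block_size
    return start, start + block_size - 1


# Values taken from the ORA-19501 message above.
first, last = block_byte_range(322944, 8192)
print(first, last)  # 2645557248 2645565439
```

An offset of this kind is what a sector-level copy tool or a disk scan would report, so it gives a way to cross-check the Oracle error against the OS-level findings.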

The system log contains the entry: The device, \Device\Harddisk0\DR0, has a bad block.
[Screenshot: disk-error]


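At this point it is reasonably safe to conclude that the problem comes from damage at the file system or disk level, with bad sectors on the disk the more likely cause. A tool was used to force-copy the damaged datafile; it reported that the data in a small number of sectors could not be copied.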
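
The post does not name the copy tool. A forced copy of this kind can be sketched roughly as follows: read the source in fixed-size chunks, and when the OS refuses a read, write filler bytes instead so that every later block keeps its original offset. The function name `force_copy`, the 8192-byte chunk size and the `0xE5` filler are all assumptions for illustration; the filler merely mirrors the `0xe5` pattern visible in the dbv output for the bad block, and whether the real tool pads this way is an inference.

```python
import os


def force_copy(src, dst, chunk_size=8192, fill_byte=0xE5):
    """Copy src to dst chunk by chunk, padding chunks that cannot be read.

    Returns the list of byte offsets whose chunk was unreadable or short.
    Hypothetical helper, not the tool used in the post.
    """
    bad_offsets = []
    size = os.path.getsize(src)
    # buffering=0: each read() is one OS read, so a failure stays
    # confined to the chunk being read.
    with open(src, "rb", buffering=0) as fin, open(dst, "wb") as fout:
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                fin.seek(offset)
                data = fin.read(want) or b""
            except OSError:
                data = b""
            if len(data) < want:
                # Record the damage and pad to keep later blocks aligned.
                bad_offsets.append(offset)
                data += bytes([fill_byte]) * (want - len(data))
            fout.write(data)
            offset += want
    return bad_offsets


# Example call (paths are placeholders):
# bad = force_copy(r"E:\damaged\FILE.ORA", r"D:\FILE.ORA")
```

Using a chunk no larger than the database block size keeps the damage reported per block; a real tool would more likely work at sector granularity and retry reads before giving up.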
[Screenshot: force-copy]

Check the recovered file with dbv:

E:\check_db>DBV FILE=D:/XIFENFEI.ORA

DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jun 9 22:17:28 2024

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = D:\XIFENFEI.ORA
Page 323055 is marked corrupt
Corrupt block relative dba: 0x1604edef (file 88, block 323055)
Bad header found during dbv:
Data in bad block:
 type: 229 format: 5 rdba: 0xe5e5e5e5
 last change scn: 0xe5e5.e5e5e5e5 seq: 0xe5 flg: 0xe5
 spare1: 0xe5 spare2: 0xe5 spare3: 0xe5e5
 consistency value in tail: 0x4c390601
 check value in block header: 0xe5e5
 computed block checksum: 0x5003



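DBVERIFY - Verification complete

Total Pages Examined         : 524288
Total Pages Processed (Data) : 204510
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 127485
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 3030
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 189262
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 184063522 (3470.184063522)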
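
The "relative dba" that dbv prints packs the file number and block number into one 32-bit value. A small sketch of the decoding, assuming the usual smallfile layout (upper 10 bits for the relative file number, lower 22 bits for the block number); the function name `decode_rdba` is illustrative only:

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (file#, block#).

    Assumes a smallfile tablespace: 10 bits of relative file number
    followed by 22 bits of block number.
    """
    return rdba >> 22, rdba & 0x3FFFFF


# Value from the "Corrupt block relative dba" line above.
print(decode_rdba(0x1604EDEF))  # (88, 323055)
```

The result matches the file and block that dbv prints in parentheses, and these are the two values needed to look up the owning segment.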

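Fortunately only one database block is corrupt. Querying dba_extents for the object that owns the bad block shows, less fortunately, that it holds table data: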

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 88
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 88
Enter value for block_id: 323055
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 323055 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1

OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME                PARTITION_NAME
------------------ ------------------------------ ------------------------------
XFFUSER
XFFTABLE
TABLE              XFFTBS_TAB2

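With skipping of corrupt blocks enabled, the table data was exported, the original table was renamed, and the data was imported again, which completed this recovery. Similar recoveries have been done before: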

A lucky recovery from an OSD-04016 O/S-Error failure
O/S-Error: (OS 23) Data error (cyclic redundancy check): database recovery


O/S-Error: (OS 23) Data error (cyclic redundancy check): database recovery

A customer's database crashed suddenly while running; inspection turned up ORA-27070, OSD-04016, O/S-Error: (OS 23) and related errors:

Thu May 12 11:25:53 2022
KCF: write/open error block=0x19e95f online=1
     file=57 H:\ORADATA\xifenfei\XFF51.DBF
     error=27070 txt: 'OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 23) Data error (cyclic redundancy check).'
Thu May 12 11:25:53 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw0_3532.trc:
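ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
ORA-01114: IO error writing block to file 57 (block # 1698143)
ORA-01110: data file 57: 'H:\ORADATA\xifenfei\XFF51.DBF'
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 23) Data error (cyclic redundancy check).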

DBW0: terminating instance due to error 1242
Thu May 12 11:25:54 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_mman_3528.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:54 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_3544.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw1_3536.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_psp0_3524.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_ckpt_3548.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_3520.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:06 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_q002_37468.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:08 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_reco_3556.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:08 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_3552.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:10 2022
Instance terminated by DBW0, pid = 3532
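
The KCF line reports the failing block in hexadecimal, while ORA-01114 reports it in decimal. A short sketch that pulls the value out of the KCF line and converts it shows that both refer to the same block; `parse_kcf_block` is a hypothetical helper, and the pattern covers only the `block=0x...` form seen in this log:

```python
import re


def parse_kcf_block(line):
    """Extract the hex block number from a 'KCF: write/open error' line.

    Returns the block number as an int, or None if the line has no
    block=0x... field.
    """
    match = re.search(r"block=0x([0-9a-fA-F]+)", line)
    if match is None:
        return None
    return int(match.group(1), 16)


line = "KCF: write/open error block=0x19e95f online=1"
print(parse_kcf_block(line))  # 1698143
```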

Restarting the database fails again with similar errors, such as ORA-27070: async read/write failed and OSD-04016: Error queuing an asynchronous I/O request.
[Screenshot: osd-04006]


Checking the datafile with dbv also reports errors.
[Screenshot: dbv-io-error]

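From the information above it is fairly certain that a fault in the underlying layer (file system or hardware) is what makes access to the database files fail. The system log shows errors as well.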

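The damaged file was recovered with specialized recovery software, after which the database could be opened normally (skipping the corrupt blocks).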
