作者归档:惜分飞

不当使用_allow_resetlogs_corruption参数引起ORA-600 2662错误

有一个数据库由于机房掉电,导致数据库无法启动,由于不当恢复导致open过程报ORA-600 2662等错误客户无法自行解决,让我们协助.幸运的是客户对现场做了备份,我接手恢复之后,让客户还原现场,然后进行恢复,比较顺利的直接open数据库,实现数据0丢失原库直接可用,避免了一次因为不当操作而引起少量数据丢失的风险,和业务的快速恢复(避免的因为强制拉库引起的不一致性问题而要做数据库逻辑迁移).通过日志回顾第一次现场恢复经历
最初故障数据库mount报错

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 1071333376 bytes
Fixed Size                  1375792 bytes
Variable Size             754975184 bytes
Database Buffers          310378496 bytes
Redo Buffers                4603904 bytes
ORA-03113: 通信通道的文件结尾
进程 ID: 964
会话 ID: 1145 序列号: 1

无法mount,大部分情况是由于控制文件损坏,然后客户选择了重建controlfile

Thu Feb 20 10:20:45 2025
Successful mount of redo thread 1, with mount id 1721384698
Completed: CREATE CONTROLFILE REUSE DATABASE "ORCL" RESETLOGS ARCHIVELOG
MAXLOGFILES 32
MAXLOGMEMBERS 2
MAXDATAFILES 254
MAXINSTANCES 1
MAXLOGHISTORY 226
LOGFILE
GROUP 1 'D:\app\Administrator\oradata\orcl\redo01.log' size 50M,
GROUP 2 'D:\app\Administrator\oradata\orcl\redo02.log' size 50M,
GROUP 3 'D:\app\Administrator\oradata\orcl\redo03.log' size 50M
DATAFILE
'D:\app\Administrator\oradata\orcl\XFF.DBF',
'D:\app\Administrator\oradata\orcl\XIFENFEI.DBF',
'D:\app\Administrator\oradata\orcl\SYSAUX01.DBF',
'D:\app\Administrator\oradata\orcl\SYSTEM01.DBF',
'D:\app\Administrator\oradata\orcl\UNDOTBS01.DBF',
'D:\app\Administrator\oradata\orcl\USERS01.DBF'
CHARACTER SET ZHS16GBK

重建ctl使用了resetlogs模式,然后下一步客户进行恢复使用命令为:ALTER DATABASE RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE

Thu Feb 20 10:22:05 2025
ALTER DATABASE RECOVER  DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE  
Media Recovery Start
 started logmerger process
Thu Feb 20 10:22:05 2025
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 6 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
Parallel Media Recovery started with 4 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC
Errors with log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_pr00_3044.trc:
ORA-00308: cannot open archived log 
        'D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC'
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC
Errors with log D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_pr00_3044.trc:
ORA-00308: cannot open archived log 
        'D:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\ARCHIVELOG\2025_02_20\O1_MF_1_9087_%U_.ARC'
ORA-27041: unable to open file
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_pr00_3044.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-10879 signalled during: ALTER DATABASE RECOVER CANCEL ...

直接提示需要找归档日志seq为9087的,但是由于该库为非归档模式,客户直接输入了auto,无法找到对应的日志.然后尝试直接resetlogs打开库

Thu Feb 20 10:22:58 2025
ALTER DATABASE OPEN RESETLOGS
ORA-1194 signalled during: ALTER DATABASE OPEN RESETLOGS...
ALTER DATABASE OPEN
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_4836.trc:
ORA-01589: ??????????? RESETLOGS ? NORESETLOGS ??
ORA-1589 signalled during: ALTER DATABASE OPEN...
ALTER DATABASE OPEN RESETLOGS
ORA-1194 signalled during: ALTER DATABASE OPEN RESETLOGS...

由于数据文件不一致(前面recover没有成功),导致直接reseltogs方式打开库失败,然后设置一些参数

Thu Feb 20 10:27:24 2025
ALTER SYSTEM SET _allow_error_simulation=TRUE SCOPE=SPFILE;
ALTER SYSTEM SET _allow_terminal_recovery_corruption=TRUE SCOPE=SPFILE;
Thu Feb 20 10:27:38 2025
ALTER SYSTEM SET _allow_resetlogs_corruption=TRUE SCOPE=SPFILE;

重启库之后强制拉库

Completed: alter database mount
Thu Feb 20 10:29:08 2025
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Thu Feb 20 10:29:19 2025
Archived Log entry 1 added for thread 1 sequence 9088 ID 0x5d904d0a dest 1:
Archived Log entry 2 added for thread 1 sequence 9086 ID 0x5d904d0a dest 1:
Thu Feb 20 10:29:31 2025
Archived Log entry 3 added for thread 1 sequence 9087 ID 0x5d904d0a dest 1:
RESETLOGS after incomplete recovery UNTIL CHANGE 223770120
Thu Feb 20 10:29:38 2025
Setting recovery target incarnation to 2
Thu Feb 20 10:29:39 2025
Assigning activation ID 1721340650 (0x669992ea)
LGWR: STARTING ARCH PROCESSES
Thu Feb 20 10:29:39 2025
ARC0 started with pid=20, OS id=2924 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thu Feb 20 10:29:40 2025
ARC1 started with pid=21, OS id=1832 
Thu Feb 20 10:29:40 2025
ARC2 started with pid=22, OS id=3668 
ARC1: Archival started
Thu Feb 20 10:29:40 2025
ARC3 started with pid=23, OS id=5104 
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Thu Feb 20 10:29:41 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Feb 20 10:29:41 2025
SMON: enabling cache recovery
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_1652.trc  (incident=1006385):
ORA-00600: ??????, ??: [2662], [0], [223770128], [0], [223811777], [12583040], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1006385\orcl_ora_1652_i1006385.trc
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_1652.trc:
ORA-00600: ??????, ??: [2662], [0], [223770128], [0], [223811777], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_1652.trc:
ORA-00600: ??????, ??: [2662], [0], [223770128], [0], [223811777], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 1652): terminating the instance due to error 600
Instance terminated by USER, pid = 1652

数据库报ORA-600 2662错误,导致强制拉库失败.
后面客户进行了一系列折腾,导致出现其他错误,比如:
ORA-01595/ORA-600 4194

Fri Feb 21 04:52:19 2025
Trace dumping is performing id=[cdmp_20250221045219]
Doing block recovery for file 3 block 472
Resuming block recovery (PMON) for file 3 block 472
Block recovery from logseq 2, block 51 to scn 223837038
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Block recovery stopped at EOT rba 2.54.16
Block recovery completed at rba 2.54.16, scn 0.223837035
Doing block recovery for file 3 block 144
Resuming block recovery (PMON) for file 3 block 144
Block recovery from logseq 2, block 51 to scn 223837033
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Block recovery completed at rba 2.52.16, scn 0.223837034
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_2124.trc:
ORA-01595: error freeing extent (3) of rollback segment (2))
ORA-00600: internal error code, arguments: [4194], [], [                                      

ORA-600 2256/ORA-600 4194

Fri Feb 21 05:02:43 2025
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2184.trc  (incident=1134285):
ORA-00600: 内部错误代码, 参数: [2256], [0], [1073741824], [0], [1073761870], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1134285\orcl_ora_2184_i1134285.trc
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is ZHS16GBK
No Resource Manager plan active
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_648.trc  (incident=1134237):
ORA-00600: internal error code, arguments: [4194], [], [
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1134237\orcl_smon_648_i1134237.trc
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2184.trc  (incident=1134286):
ORA-00600: 内部错误代码, 参数: [4194], [0], [
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1134286\orcl_ora_2184_i1134286.trc

ORA-600 3712

Fri Feb 21 05:24:51 2025
Assigning activation ID 1721440698 (0x669b19ba)
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_lgwr_3620.trc  (incident=1150206):
ORA-00600: internal error code, arguments: [3712], [1], [1], [3584], [3], [3584], [1], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_1150206\orcl_lgwr_3620_i1150206.trc
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_lgwr_3620.trc:
ORA-00600: internal error code, arguments: [3712], [1], [1], [3584], [3], [3584], [1], [], [], [], [], []
LGWR (ospid: 3620): terminating the instance due to error 470

基于客户本身库不大,而且在操作之前备份了现场,然后客户还原故障现场备份,我开始接手恢复
重建控制文件

Fri Feb 21 21:57:19 2025
Successful mount of redo thread 1, with mount id 1721495486
Completed: CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS  NOARCHIVELOG
       MAXLOGFILES 50
       MAXLOGMEMBERS 5
       MAXDATAFILES 5000
       MAXINSTANCES 8
       MAXLOGHISTORY 2920
LOGFILE
       group 1   'D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG' size 50M,
       group 3   'D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG' size 50M,
       group 2   'D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG' size 50M
DATAFILE
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF',
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSAUX01.DBF',
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\UNDOTBS01.DBF',
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS01.DBF',
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\XIFENFEI.DBF',
        'D:\APP\ADMINISTRATOR\ORADATA\ORCL\XFF.DBF'
CHARACTER SET  ZHS16GBK 

这里重建控制文件使用的是noresetlogs模式,然后尝试recover database

ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 4 slaves
Fri Feb 21 21:57:25 2025
Recovery of Online Redo Log: Thread 1 Group 3 Seq 9087 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Recovery of Online Redo Log: Thread 1 Group 1 Seq 9088 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Fri Feb 21 21:57:30 2025
Completed: ALTER DATABASE RECOVER  database  

数据库直接recover成功,然后尝试正常open库

Fri Feb 21 21:58:17 2025
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Completed redo scan
 read 12019 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 9088, block 2, scn 223805802
Recovery of Online Redo Log: Thread 1 Group 1 Seq 9088 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 9088, block 24040, scn 223836931
 0 data blocks read, 0 data blocks written, 12019 redo k-bytes read
Fri Feb 21 21:58:17 2025
Thread 1 advanced to log sequence 9089 (thread open)
Thread 1 opened at log sequence 9089
  Current log# 2 seq# 9089 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Feb 21 21:58:17 2025
SMON: enabling cache recovery
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is ZHS16GBK
No Resource Manager plan active
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
If files cannot be cataloged, then manually delete them
using OS command.
One of the following events caused this:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
   then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Fri Feb 21 21:58:20 2025
QMNC started with pid=22, OS id=524 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open

数据库完美打开,然后增加tempfile,检查/导出数据均没有任何问题,业务可以直接使用本库,不用逻辑迁移
其实这是一个比较小的故障,由于断电导致控制文件损坏,然后客户重建控制文件使用resetlogs方式,然后recover没有正确指定redo,来完成数据库实例恢复,直接使用_allow_resetlogs_corruption参数强制拉库,然后出现了ORA-600 2662/ORA-600 4194/ORA-600 2256等大家熟悉的错误.这个库不大,也做了故障现场保护,如果是一个10T以上大库,发生这样的故障,后果还是比较麻烦的(库可以打开,但是redo中的数据肯定丢失导致数据库不一致,后续可能有很多不一致性问题,可能涉及逻辑迁移),增加企业业务不可用时间成本和业务数据丢失风险,再次提醒对于Oracle隐含参数,还是需要慎重,做好专业的评估再使用.

发表在 Oracle备份恢复 | 标签为 , , | 留下评论

CSSD signal 11 in thread clssnmRcfgMgrThread故障处理

一个客户,集群无法启动,只能启动到如下状态
11


查看cssd日志有CSSD signal 11 in thread clssnmRcfgMgrThread报错

2025-02-21 18:21:25.500: [    CSSD][2788693760]clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
2025-02-21 18:21:25.500: [    CSSD][2788693760]clssnmDoSyncUpdate: Wait for 0 vote ack(s)
2025-02-21 18:21:25.500: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.700: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.901: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.995: [    CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133285/5870104
2025-02-21 18:21:25.997: [    CSSD][2799818496]clssnmvDiskKillCheck: not evicted, file /dev/dm-4 flags 0x00000000,
                                                                          kill block unique 0, my unique 1740133265
2025-02-21 18:21:26.000: [    CSSD][2793424640]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 2 waited 500
2025-02-21 18:21:26.101: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.302: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.497: [    CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133286/5870604
2025-02-21 18:21:26.502: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.702: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.902: [    CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.997: [    CSSD][2799818496]clssnmvDiskKillCheck: not evicted, file /dev/dm-4 flags 0x00000000,
                                                                      kill block unique 0, my unique 1740133265
2025-02-21 18:21:26.997: [    CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133286/5871114
2025-02-21 18:21:27.000: [    CSSD][2793424640]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 2 waited 0
2025-02-21 18:21:27.102: [    CSSD][2788693760]clssnmCheckDskInfo: Checking disk info...
2025-02-21 18:21:27.102: [    CSSD][2788693760]clssnmCheckDskInfo: diskTimeout set to (200000)ms
2025-02-21 18:21:27.103: [    CSSD][2788693760]###################################
2025-02-21 18:21:27.103: [    CSSD][2788693760]clssscExit: CSSD signal 11 in thread clssnmRcfgMgrThread
2025-02-21 18:21:27.103: [    CSSD][2788693760]###################################
2025-02-21 18:21:27.103: [    CSSD][2788693760](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
2025-02-21 18:21:27.103: [    CSSD][2788693760]

----- Call Stack Trace -----
2025-02-21 18:21:27.103: [    CSSD][2788693760]calling              call     entry                argument values in hex      
2025-02-21 18:21:27.103: [    CSSD][2788693760]location             type     point                (? means dubious value)     
2025-02-21 18:21:27.103: [    CSSD][2788693760]-------------------- -------- -------------------- ----------------------------
2025-02-21 18:21:27.109: [    CSSD][2788693760]clssscExit()+745     call     kgdsdst()            000000000 ? 000000000 ?
2025-02-21 18:21:27.109: [    CSSD][2788693760]                                                   7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.109: [    CSSD][2788693760]                                                   7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.109: [    CSSD][2788693760]s0clsssc_sighandler  call     clssscExit()         001FB9FA0 ? 000000002 ?
2025-02-21 18:21:27.109: [    CSSD][2788693760]()+616                                             7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.109: [    CSSD][2788693760]                                                   7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]__sighandler()       call     s0clsssc_sighandler  00000000B ? 000000002 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                              ()                   7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssnmCheckSplit()+  signal   __sighandler()       001BEE8A8 ? 000000000 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]378                                                002039A80 ? 000000001 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssnmCheckDskInfo(  call     clssnmCheckSplit()   001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760])+387                                              000030D40 ? 000000001 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssnmDoSyncUpdate(  call     clssnmCheckDskInfo(  001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760])+4692                        )                    000000001 ? 000000001 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssnmLocalJoinEven  call     clssnmDoSyncUpdate(  001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]t()+3992                      )                    FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssnmRcfgMgrThread  call     clssnmLocalJoinEven  001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]()+2290                       t()                  FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]clssscthrdmain()+25  call     clssnmRcfgMgrThread  001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]3                             ()                   FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]start_thread()+209   call     clssscthrdmain()     001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]clone()+109          call     start_thread()       7F9EA6381700 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]0000000000000000     call     clone()              7F9EA6381700 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760]                                                   7F9EA63803C0 ?
2025-02-21 18:21:27.111: [    CSSD][2788693760] 
2025-02-21 18:21:27.111: [    CSSD][2788693760]--------------------- Binary Stack Dump ---------------------

这里提示表决盘超时,尝试启动nocrs貌似,在表决盘存在的情况下,启动依旧失败,通过处理让启动过程不读表决盘,启动nocrs模式成功,并mount其他业务磁盘组
22
33
44


确认其他磁盘没有问题,重建crs磁盘组

SQL> create diskgroup OCR  external redundancy disk '/dev/dm-4' force  attribute 'COMPATIBLE.ASM' = '11.2.0';
# ocrconfig -restore /u01/app/11.2.0.3/grid/cdata/scan/backup00.ocr
# crsctl replace votedisk +OCR
SQL> create spfile from pfile='/tmp/pfile.asm';

然后重启crs恢复正常

发表在 Oracle RAC | 标签为 | 留下评论

使用sid方式直接访问pdb(USE_SID_AS_SERVICE_LISTENER)

有些应用,因为特殊原因,需要通过sid来访问数据库,在pdb环境中原则上都是通过服务名访问的,可以通过一定的监听配置实现使用pdb名的sid来访问该pdb
在pdb0中创建u_test用户并授权

[oracle@ora19c:/u01/app/oracle/product/19.3.0/db/network/admin]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Feb 17 22:01:54 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0

sys@ORA19C 22:01:54> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB0                           READ WRITE NO
         4 PDBXXX                         MOUNTED
sys@ORA19C 22:01:56> alter session set container=pdb0;

Session altered.

Elapsed: 00:00:00.16
sys@ORA19C 22:02:07> create user u_test identified by oracle;

User created.

Elapsed: 00:00:00.29
sys@ORA19C 22:02:21> grant dba to u_test;

Grant succeeded.

Elapsed: 00:00:00.01

监听的配置和状态

[oracle@ora19c:/home/oracle]$ cat /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
# listener.ora Network Configuration File: /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )



[oracle@ora19c:/home/oracle]$ lsnrctl status

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 17-FEB-2025 22:07:12

Copyright (c) 1991, 2024, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ora19c)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                17-FEB-2025 22:06:39
Uptime                    0 days 0 hr. 0 min. 32 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ora19c/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ora19c)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
Services Summary...
Service "21b067cbda1dbcd4e0630100007f12b6" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "22394b20557aff3ee0630100007fafe0" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "86b637b62fdf7a65e053f706e80a27ca" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "ora19c" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "ora19cXDB" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "pdb0" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "pdbxxx" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
The command completed successfully

创建pdb0基于服务和sid的tns(pdb0,pdb0_sid)

[oracle@ora19c:/u01/app/oracle/product/19.3.0/db/network/admin]$ cat tnsnames.ora
pdb0 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdb0)
    )
  )
pdb0_sid =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (sid = pdb0)
    )
  )

[oracle@ora19c:/u01/app/oracle/product/19.3.0/db/network/admin]$ tnsping pdb0

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 17-FEB-2025 22:03:00

Copyright (c) 1997, 2024, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/19.3.0/db/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521)) 
(CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = pdb0)))
OK (0 msec)
[oracle@ora19c:/u01/app/oracle/product/19.3.0/db/network/admin]$ tnsping pdb0_sid

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 17-FEB-2025 22:03:10

Copyright (c) 1997, 2024, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/19.3.0/db/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521))
 (CONNECT_DATA = (SERVER = DEDICATED) (sid = pdb0)))
OK (0 msec)

分别测试pdb0和pdb0_sid访问数据库
测试证明基于服务名的方式可以正常访问pdb,基于sid的方式无法访问pdb

[oracle@ora19c:/home/oracle]$ sqlplus u_test/oracle@pdb0

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Feb 17 22:08:35 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Last Successful login time: Mon Feb 17 2025 22:06:11 +08:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0

u_test@PDB0 22:08:35> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0
[oracle@ora19c:/home/oracle]$ sqlplus u_test/oracle@pdb0_sid

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Feb 17 22:08:39 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

ERROR:
ORA-12505: TNS:listener does not currently know of SID given in connect
descriptor


Enter user-name: 
ERROR:
ORA-01017: invalid username/password; logon denied


Enter user-name: 
ERROR:
ORA-01017: invalid username/password; logon denied


SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

在listener.ora中增加USE_SID_AS_SERVICE_LISTENER = ON,并reload加载
注意:USE_SID_AS_SERVICE_LISTENER 中的LISTENER根据不同的监听名字而发生改变

[oracle@ora19c:/home/oracle]$ cat /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
# listener.ora Network Configuration File: /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = ora19c)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )
USE_SID_AS_SERVICE_LISTENER = ON

[oracle@ora19c:/home/oracle]$ lsnrctl reload

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 17-FEB-2025 22:12:13

Copyright (c) 1991, 2024, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ora19c)(PORT=1521)))
The command completed successfully

[oracle@ora19c:/home/oracle]$ lsnrctl status

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 17-FEB-2025 22:13:05

Copyright (c) 1991, 2024, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ora19c)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                17-FEB-2025 22:06:39
Uptime                    0 days 0 hr. 6 min. 26 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/19.3.0/db/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ora19c/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ora19c)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
Services Summary...
Service "21b067cbda1dbcd4e0630100007f12b6" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "22394b20557aff3ee0630100007fafe0" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "86b637b62fdf7a65e053f706e80a27ca" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "ora19c" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "ora19cXDB" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "pdb0" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
Service "pdbxxx" has 1 instance(s).
  Instance "ora19c", status READY, has 1 handler(s) for this service...
The command completed successfully

尝试tns名字为pdb0和pdb0_sid名字登录数据库
在listener.ora文件中设置了USE_SID_AS_SERVICE_LISTENER = ON之后,基于sid的方式可以直接访问pdb

[oracle@ora19c:/home/oracle]$ sqlplus u_test/oracle@pdb0_sid

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Feb 17 22:12:16 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Last Successful login time: Mon Feb 17 2025 22:08:35 +08:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0

u_test@PDB0 22:12:16> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0
[oracle@ora19c:/home/oracle]$ sqlplus u_test/oracle@pdb0

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Feb 17 22:12:28 2025
Version 19.24.0.0.0

Copyright (c) 1982, 2024, Oracle.  All rights reserved.

Last Successful login time: Mon Feb 17 2025 22:12:16 +08:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.24.0.0.0

u_test@PDB0 22:12:28> 
发表在 Oracle | 标签为 , | 留下评论