Category archive: Oracle RAC
RAC host clocks more than 10 minutes apart prevent CRS from starting
A customer reported a two-node 19c RAC on which, after a power failure, the database on one node would not start. Checking with crsctl showed that the CRS stack had not come up completely:
[root@xifenf1 ~]# /u01/app/19.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.crf
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.gipcd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.storage
      1        ONLINE  ONLINE       xifenf1                  STABLE
--------------------------------------------------------------------------------
The CRS alert log shows that the clock offset between the cluster nodes exceeds 600 seconds, so the ctssd process cannot start:
2024-06-11 17:33:09.953 [OCSSD(5020)]CRS-1605: CSSD voting file is online: /dev/asm_ocr5; details in /u01/app/grid/diag/crs/xifenf1/crs/trace/ocssd.trc.
2024-06-11 17:33:09.956 [OCSSD(5020)]CRS-1605: CSSD voting file is online: /dev/asm_ocr1; details in /u01/app/grid/diag/crs/xifenf1/crs/trace/ocssd.trc.
2024-06-11 17:33:10.024 [OCSSD(5020)]CRS-1605: CSSD voting file is online: /dev/asm_ocr2; details in /u01/app/grid/diag/crs/xifenf1/crs/trace/ocssd.trc.
2024-06-11 17:33:10.031 [OCSSD(5020)]CRS-1605: CSSD voting file is online: /dev/asm_ocr4; details in /u01/app/grid/diag/crs/xifenf1/crs/trace/ocssd.trc.
2024-06-11 17:33:10.040 [OCSSD(5020)]CRS-1605: CSSD voting file is online: /dev/asm_ocr3; details in /u01/app/grid/diag/crs/xifenf1/crs/trace/ocssd.trc.
2024-06-11 17:33:11.900 [OCSSD(5020)]CRS-1601: CSSD Reconfiguration complete. Active nodes are xifenf1 xifenf2 .
2024-06-11 17:33:13.344 [OCSSD(5020)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2024-06-11 17:33:13.809 [OCTSSD(5488)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 5488
2024-06-11 17:33:16.017 [OCTSSD(5488)]CRS-2407: The new Cluster Time Synchronization Service reference node is host xifenf2.
2024-06-11 17:33:16.018 [OCTSSD(5488)]CRS-2401: The Cluster Time Synchronization Service started on host xifenf1.
2024-06-11 17:33:16.105 [OCTSSD(5488)]CRS-2419: The clock on host xifenf1 differs from mean cluster time by 1031504618 microseconds. The Cluster Time Synchronization Service will not perform time synchronization because the time difference is beyond the permissible offset of 600 seconds. Details in /u01/app/grid/diag/crs/xifenf1/crs/trace/octssd.trc.
2024-06-11 17:33:16.579 [OCTSSD(5488)]CRS-2402: The Cluster Time Synchronization Service aborted on host xifenf1. Details at (:ctsselect_mstm4:) in /u01/app/grid/diag/crs/xifenf1/crs/trace/octssd.trc.
Check the host clocks on both nodes:
[grid@xifenf1 ~]$ date ;ssh xifenf2 date
Tue Jun 11 17:54:09 CST 2024
Tue Jun 11 18:04:34 CST 2024
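The two hosts are 10 minutes 25 seconds apart, about 625 seconds, still beyond the permissible 600-second offset; the alert log above reported 1031504618 microseconds at 17:33, roughly 1032 seconds, or just over 17 minutes.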
Adjust the clock on node 1:
[root@xifenf1 ~]# date -s "20240611 18:06:00"
Tue Jun 11 18:06:00 CST 2024
[root@xifenf1 ~]# su - grid
Last login: Tue Jun 11 17:37:53 CST 2024 on pts/0
[grid@xifenf1 ~]$ date ;ssh xifenf2 date
Tue Jun 11 18:06:09 CST 2024
Tue Jun 11 18:05:34 CST 2024
Restart CRS:
[root@xifenf1 ~]# /u01/app/19.0/grid/bin/crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'xifenf1'
CRS-2673: Attempting to stop 'ora.storage' on 'xifenf1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'xifenf1'
CRS-2673: Attempting to stop 'ora.crf' on 'xifenf1'
CRS-2677: Stop of 'ora.storage' on 'xifenf1' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'xifenf1'
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'xifenf1'
CRS-2677: Stop of 'ora.mdnsd' on 'xifenf1' succeeded
CRS-2677: Stop of 'ora.crf' on 'xifenf1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'xifenf1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'xifenf1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'xifenf1'
CRS-2677: Stop of 'ora.cssd' on 'xifenf1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'xifenf1'
CRS-2673: Attempting to stop 'ora.gipcd' on 'xifenf1'
CRS-2677: Stop of 'ora.gpnpd' on 'xifenf1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'xifenf1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'xifenf1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@xifenf1 ~]# /u01/app/19.0/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@xifenf1 ~]# /u01/app/19.0/grid/bin/crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.crf
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.crsd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.cssd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.ctssd
      1        ONLINE  ONLINE       xifenf1                  ACTIVE:35600,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.gipcd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xifenf1                  STABLE
ora.storage
      1        ONLINE  ONLINE       xifenf1                  STABLE
--------------------------------------------------------------------------------
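In the final status above, ora.ctssd shows ACTIVE, meaning CTSS is synchronizing cluster time itself because no OS time service was detected. To keep the clocks from drifting apart again, it is worth running chronyd (or ntpd), after which CTSS switches to observer mode. A minimal sketch, assuming a systemd-based Linux host; the NTP source in /etc/chrony.conf is a placeholder you must supply:

[root@xifenf1 ~]# /u01/app/19.0/grid/bin/crsctl check ctss   # reports CTSS mode and current offset
[root@xifenf1 ~]# vi /etc/chrony.conf                        # point "server ..." at a reachable NTP source
[root@xifenf1 ~]# systemctl enable --now chronyd
[root@xifenf1 ~]# chronyc makestep                           # step the clock immediately if it is far off
[root@xifenf1 ~]# chronyc tracking                           # verify the remaining offset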
Handling ora.storage failing to start with ORA-12514
On a 19.11 cluster, after node 2 was manually rebooted, CRS did not start properly:
[grid@xff2 ~]$ crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       xff2                     STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xff2                     STABLE
ora.crf
      1        ONLINE  ONLINE       xff2                     STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xff2                     STABLE
ora.ctssd
      1        ONLINE  ONLINE       xff2                     OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xff2                     STABLE
ora.evmd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.gipcd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
The CRS alert log shows:
2024-03-05 12:46:26.021 [CLSECHO(3653)]ACFS-9327: Verifying ADVM/ACFS devices.
2024-03-05 12:46:26.040 [CLSECHO(3661)]ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
2024-03-05 12:46:26.065 [CLSECHO(3673)]ACFS-9156: Detecting control device '/dev/ofsctl'.
2024-03-05 12:46:26.357 [CLSECHO(3703)]ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
2024-03-05 12:46:26.376 [CLSECHO(3711)]ACFS-9322: completed
2024-03-05 12:46:27.764 [CSSDMONITOR(3855)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 3855
2024-03-05 12:46:27.839 [OSYSMOND(3857)]CRS-8500: Oracle Clusterware OSYSMOND process is starting with operating system process ID 3857
2024-03-05 12:46:28.129 [CSSDAGENT(3890)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 3890
2024-03-05 12:46:29.125 [OCSSD(3910)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 3910
2024-03-05 12:46:30.187 [OCSSD(3910)]CRS-1713: CSSD daemon is started in hub mode
2024-03-05 12:46:31.428 [OCSSD(3910)]CRS-1707: Lease acquisition for node xff2 number 2 completed
2024-03-05 12:46:32.630 [OCSSD(3910)]CRS-1621: The IPMI configuration data for this node stored in the Oracle registry is incomplete; details at (:CSSNK00002:) in /u01/app/grid/diag/crs/xff2/crs/trace/ocssd.trc
2024-03-05 12:46:32.630 [OCSSD(3910)]CRS-1617: The information required to do node kill for node xff2 is incomplete; details at (:CSSNM00004:) in /u01/app/grid/diag/crs/xff2/crs/trace/ocssd.trc
2024-03-05 12:46:32.638 [OCSSD(3910)]CRS-1605: CSSD voting file is online: /dev/sda1; details in /u01/app/grid/diag/crs/xff2/crs/trace/ocssd.trc.
2024-03-05 12:46:33.546 [OCSSD(3910)]CRS-1601: CSSD Reconfiguration complete. Active nodes are xff1 xff2 .
2024-03-05 12:46:35.405 [OCSSD(3910)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2024-03-05 12:46:35.533 [OCTSSD(4138)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 4138
2024-03-05 12:46:36.339 [OCTSSD(4138)]CRS-2403: The Cluster Time Synchronization Service on host xff2 is in observer mode.
2024-03-05 12:46:37.601 [OCTSSD(4138)]CRS-2407: The new Cluster Time Synchronization Service reference node is host xff1.
2024-03-05 12:46:37.601 [OCTSSD(4138)]CRS-2401: The Cluster Time Synchronization Service started on host xff2.
2024-03-05 12:46:54.181 [ORAROOTAGENT(2427)]CRS-5019: All OCR locations are on ASM disk groups [SYSTEMDG], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/xff2/crs/trace/ohasd_orarootagent_root.trc".
2024-03-05 12:47:15.209 [OLOGGERD(4553)]CRS-8500: Oracle Clusterware OLOGGERD process is starting with operating system process ID 4553
2024-03-05 12:52:04.581 [CRSCTL(8313)]CRS-1013: The OCR location in an ASM disk group is inaccessible. Details in /u01/app/grid/diag/crs/xff2/crs/trace/crsctl_8313.trc.
2024-03-05 12:56:44.519 [ORAROOTAGENT(2427)]CRS-5818: Aborted command 'start' for resource 'ora.storage'. Details at (:CRSAGF00113:) {0:5:3} in /u01/app/grid/diag/crs/xff2/crs/trace/ohasd_orarootagent_root.trc.
2024-03-05 12:56:44.608 [OHASD(2217)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.storage'. Details at (:CRSPE00221:) {0:5:3} in /u01/app/grid/diag/crs/xff2/crs/trace/ohasd.trc.
2024-03-05 12:56:44.606 [ORAROOTAGENT(2427)]CRS-5017: The resource action "ora.storage start" encountered the following error:
2024-03-05 12:56:44.606+agent's abort action pending. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/xff2/crs/trace/ohasd_orarootagent_root.trc".
2024-03-05 12:57:58.464 [CRSD(11801)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 11801
2024-03-05 12:58:12.059 [CRSD(11801)]CRS-1013: The OCR location in an ASM disk group is inaccessible. Details in /u01/app/grid/diag/crs/xff2/crs/trace/crsd.trc.
The ohasd_orarootagent_root trace shows:
2024-03-05 12:52:00.769 : OCRRAW:4255452928: kgfnConnect3: Got a Connection Error when connecting to ASM.
2024-03-05 12:52:00.771 : OCRRAW:4255452928: kgfnConnect2: failed to connect
2024-03-05 12:52:00.771 : OCRRAW:4255452928: kgfnConnect2Retry: failed to connect connect after 1 attempts, 124s elapsed
2024-03-05 12:52:00.771 : OCRRAW:4255452928: kgfo_kge2slos error stack at kgfoAl06: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2024-03-05 12:52:00.771 : OCRRAW:4255452928: -- trace dump on error exit --
2024-03-05 12:52:00.771 : OCRRAW:4255452928: Error [kgfoAl06] in [kgfokge] at kgfo.c:2176
2024-03-05 12:52:00.771 : OCRRAW:4255452928: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor ORA-12514: TNS:listener does not currently know of service requested
2024-03-05 12:52:00.771 : OCRRAW:4255452928: Category: 7
"/u01/app/grid/diag/crs/xff2/crs/trace/crsctl_8313.trc" 208L, 11809C
2024-03-05 12:52:03.543 : OCRRAW:4255452928: 9379 Error 4 opening dom root in 0xf9afdb79c0
2024-03-05 12:52:03.551 : OCRRAW:4255452928: kgfnConnect2: kgfnGetBeqData failed
2024-03-05 12:52:03.577 : OCRRAW:4255452928: kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)(CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=节点1私网IP)(PORT=1525)))(CONNECT_DATA=(SERVICE_NAME=+ASM)))
2024-03-05 12:52:03.578 : OCRRAW:4255452928: kgfnConnect2Int: ServerAttach
2024-03-05 12:52:04.579 : OCRRAW:4255452928: kgfnServerAttachConnErrors: Encountered service based error 12514
2024-03-05 12:52:04.579 : OCRRAW:4255452928: kgfnRecordErr 12514 OCI error: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2024-03-05 12:52:04.579 : OCRRAW:4255452928: kgfnConnect3: Got a Connection Error when connecting to ASM.
2024-03-05 12:52:04.581 : OCRRAW:4255452928: kgfnConnect2: failed to connect
2024-03-05 12:52:04.581 : OCRRAW:4255452928: kgfnConnect2Retry: failed to connect connect after 1 attempts, 122s elapsed
2024-03-05 12:52:04.581 : OCRRAW:4255452928: kgfo_kge2slos error stack at kgfoAl06: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2024-03-05 12:52:04.581 : OCRRAW:4255452928: -- trace dump on error exit --
2024-03-05 12:52:04.581 : OCRRAW:4255452928: Error [kgfoAl06] in [kgfokge] at kgfo.c:3180
2024-03-05 12:52:04.581 : OCRRAW:4255452928: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor ORA-12514: TNS:listener does not currently know of service requested
2024-03-05 12:52:04.581 : OCRRAW:4255452928: Category: 7
2024-03-05 12:52:04.581 : OCRRAW:4255452928: DepInfo: 12514
2024-03-05 12:52:04.581 : OCRRAW:4255452928: ADR is not properly configured
2024-03-05 12:52:04.581 : OCRRAW:4255452928: -- trace dump end --
OCRASM:4255452928: SLOS : SLOS: cat=7, opn=kgfoAl06, dep=12514, loc=kgfokge
2024-03-05 12:52:04.581 : OCRASM:4255452928: ASM Error Stack : ORA-12514: TNS:listener does not currently know of service requested in connect descriptor ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
2024-03-05 12:52:04.581 : OCRASM:4255452928: proprasmo: kgfoCheckMount returned [7]
2024-03-05 12:52:04.581 : OCRASM:4255452928: proprasmo: The ASM instance is down
2024-03-05 12:52:04.635 : OCRRAW:4255452928: proprioo: Failed to open [+SYSTEMDG/xff-cluster/OCRFILE/registry.255.1072903025]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2024-03-05 12:52:04.635 : OCRRAW:4255452928: proprioo: No OCR/OLR devices are usable
OCRUTL:4255452928: u_fill_errorbuf: Error Info : [Insufficient quorum to open OCR devices]
default:4255452928: u_set_gbl_comp_error: comptype '107' : error '0'
2024-03-05 12:52:04.635 : OCRRAW:4255452928: proprinit: Could not open raw device
2024-03-05 12:52:04.635 : default:4255452928: a_init:7!: Backend init unsuccessful : [26]
2024-03-05 12:52:04.637 : default:4255452928: clsvactversion:4: Retrieving Active Version from local storage.
From this, the preliminary conclusion is that node 2 fails when connecting to ASM through (DESCRIPTION=(TCP_USER_TIMEOUT=1)(CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=节点1私网IP)(PORT=1525)))(CONNECT_DATA=(SERVICE_NAME=+ASM))), i.e. the ASM listener on node 1's private-network address. Check the status of that listener on node 1:
[grid@xff1 ~]$ lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 05-MAR-2024 13:04:51

Copyright (c) 1991, 2021, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                20-MAY-2021 23:53:50
Uptime                    25 days 8 hr. 15 min. 15 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/19c/grid/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/xff1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=节点1私网IP)(PORT=1525)))
The listener supports no services
The command completed successfully
The listener has no services registered with it. Check the listener-related parameters on the node 1 instance:
[grid@xff1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Mar 5 13:26:29 2024
Version 19.11.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.11.0.0.0

SQL> show parameter listener;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
forward_listener                     string
listener_networks                    string
local_listener                       string
remote_listener                      string
The preliminary conclusion is that the ASMNET1LSNR_ASM listener on node 1 is in an abnormal state, most likely because the listener parameters of the ASM instance are not set, so nothing registers with it dynamically. The safer fix is to restart node 1 so that the listener-related parameters are regenerated and dynamic registration works again. As a temporary workaround (SCOPE=MEMORY, so it does not survive an ASM instance restart), set local_listener on the node 1 ASM instance manually:
[grid@xff1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Mar 5 13:05:11 2024
Version 19.11.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.11.0.0.0

SQL> ALTER SYSTEM SET local_listener ='(ADDRESS=(PROTOCOL=TCP)(HOST=节点1私网IP)(PORT=1525))' sid='+ASM1' SCOPE=MEMORY;

System altered.

[grid@xff1 ~]$ lsnrctl status ASMNET1LSNR_ASM

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 05-MAR-2024 13:05:21

Copyright (c) 1991, 2021, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))
STATUS of the LISTENER
------------------------
Alias                     ASMNET1LSNR_ASM
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                20-MAY-2021 23:53:50
Uptime                    25 days 8 hr. 15 min. 45 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/19c/grid/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/xff1/asmnet1lsnr_asm/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=ASMNET1LSNR_ASM)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=节点1私网IP)(PORT=1525)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_FRA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_SYSTEMDG" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
The command completed successfully
[grid@xff1 ~]$
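Had the services not shown up right away, registration with the listener could also have been forced rather than waiting for the periodic PMON/LREG registration (roughly once a minute). A minimal sketch:

[grid@xff1 ~]$ sqlplus -s / as sysasm <<EOF
ALTER SYSTEM REGISTER;
EOF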
After setting the local_listener parameter on the node 1 ASM instance, the cluster stack on node 2 starts successfully:
[grid@xff2 ~]$ crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       xff2                     STABLE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xff2                     STABLE
ora.crf
      1        ONLINE  ONLINE       xff2                     STABLE
ora.crsd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.cssd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xff2                     STABLE
ora.ctssd
      1        ONLINE  ONLINE       xff2                     OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xff2                     STABLE
ora.evmd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.gipcd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xff2                     STABLE
ora.storage
      1        ONLINE  ONLINE       xff2                     STABLE
--------------------------------------------------------------------------------
udev_start causes VIP failover (common scenario: adding disks online on RAC)
While expanding ASM storage, a customer ran the udev_start command, after which all VIPs failed over and the application was completely interrupted.
Restore service first by relocating all VIPs back to their home nodes:
[grid@rac3 ~]$ srvctl relocate vip -i rac1 -n rac1 -f -v
VIP was relocated successfully.
[grid@rac3 ~]$ srvctl relocate vip -i rac2 -n rac2 -f -v
VIP was relocated successfully.
[grid@rac3 ~]$ srvctl relocate vip -i rac3 -n rac3 -f -v
VIP was relocated successfully.
[grid@rac3 ~]$ srvctl relocate vip -i rac4 -n rac4 -f -v
VIP was relocated successfully.
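To confirm each VIP is back where it belongs, the per-node status can be checked afterwards; a minimal sketch, assuming the rac1-rac4 node names used above:

[grid@rac3 ~]$ for n in rac1 rac2 rac3 rac4; do srvctl status vip -n $n; done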
The problem occurs because the udev_start command briefly takes the network interfaces down, which makes the VIPs fail over.
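The momentary interface outage can be confirmed by looking for link-state messages around the time udev_start was run; a minimal sketch, assuming a RHEL 6-style syslog (the exact wording varies by NIC driver):

[root@rac1 ~]# grep -iE "link is (down|up)|NIC Link is" /var/log/messages | tail
[root@rac1 ~]# dmesg | grep -iE "link is (down|up)|NIC Link is" | tail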

Check the ifcfg configuration files of the affected interfaces.

The root cause is that udev acts on the network interface devices. The recommended fix is to add HOTPLUG="no" to the corresponding ifcfg files (for the public, private, and any other networks of interest).
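A minimal sketch of what such an ifcfg file might look like after the change; the device name and addressing below are placeholders, not taken from the customer system:

[root@rac1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.56.11
NETMASK=255.255.255.0
# keep udev_start/hotplug events from bouncing this interface
HOTPLUG="no"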
Reference: Network interface going down when dynamically adding disks to storage using udev in RHEL 6 (Doc ID 1569028.1)
