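Contact: mobile/WeChat (+86 17813235971), QQ (107644445)
Title: Troubleshooting "CSSD signal 11 in thread clssnmRcfgMgrThread"
Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal action.]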
A customer's cluster would not start; the stack only came up partway.
The cssd log shows the error "CSSD signal 11 in thread clssnmRcfgMgrThread":
2025-02-21 18:21:25.500: [ CSSD][2788693760]clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
2025-02-21 18:21:25.500: [ CSSD][2788693760]clssnmDoSyncUpdate: Wait for 0 vote ack(s)
2025-02-21 18:21:25.500: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.700: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.901: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:25.995: [ CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133285/5870104
2025-02-21 18:21:25.997: [ CSSD][2799818496]clssnmvDiskKillCheck: not evicted, file /dev/dm-4 flags 0x00000000, kill block unique 0, my unique 1740133265
2025-02-21 18:21:26.000: [ CSSD][2793424640]clssgmWaitOnEventValue: after CmInfo State val 3, eval 2 waited 500
2025-02-21 18:21:26.101: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.302: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.497: [ CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133286/5870604
2025-02-21 18:21:26.502: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.702: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.902: [ CSSD][2788693760]clssnmDoSyncUpdate: waiting to update states on disk
2025-02-21 18:21:26.997: [ CSSD][2799818496]clssnmvDiskKillCheck: not evicted, file /dev/dm-4 flags 0x00000000, kill block unique 0, my unique 1740133265
2025-02-21 18:21:26.997: [ CSSD][2801538816]clssnmvDiskPing: Writing with status 0x2, timestamp 1740133286/5871114
2025-02-21 18:21:27.000: [ CSSD][2793424640]clssgmWaitOnEventValue: after CmInfo State val 3, eval 2 waited 0
2025-02-21 18:21:27.102: [ CSSD][2788693760]clssnmCheckDskInfo: Checking disk info...
2025-02-21 18:21:27.102: [ CSSD][2788693760]clssnmCheckDskInfo: diskTimeout set to (200000)ms
2025-02-21 18:21:27.103: [ CSSD][2788693760]###################################
2025-02-21 18:21:27.103: [ CSSD][2788693760]clssscExit: CSSD signal 11 in thread clssnmRcfgMgrThread
2025-02-21 18:21:27.103: [ CSSD][2788693760]###################################
2025-02-21 18:21:27.103: [ CSSD][2788693760](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
2025-02-21 18:21:27.103: [ CSSD][2788693760] ----- Call Stack Trace -----
2025-02-21 18:21:27.103: [ CSSD][2788693760]calling call entry argument values in hex
2025-02-21 18:21:27.103: [ CSSD][2788693760]location type point (? means dubious value)
2025-02-21 18:21:27.103: [ CSSD][2788693760]-------------------- -------- -------------------- ----------------------------
2025-02-21 18:21:27.109: [ CSSD][2788693760]clssscExit()+745 call kgdsdst() 000000000 ? 000000000 ?
2025-02-21 18:21:27.109: [ CSSD][2788693760] 7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.109: [ CSSD][2788693760] 7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.109: [ CSSD][2788693760]s0clsssc_sighandler call clssscExit() 001FB9FA0 ? 000000002 ?
2025-02-21 18:21:27.109: [ CSSD][2788693760]()+616 7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.109: [ CSSD][2788693760] 7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]__sighandler() call s0clsssc_sighandler 00000000B ? 000000002 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] () 7F9EA637A650 ? 7F9EA637A728 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 7F9EA637F1D0 ? 000000003 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssnmCheckSplit()+ signal __sighandler() 001BEE8A8 ? 000000000 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]378 002039A80 ? 000000001 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssnmCheckDskInfo( call clssnmCheckSplit() 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760])+387 000030D40 ? 000000001 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssnmDoSyncUpdate( call clssnmCheckDskInfo( 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760])+4692 ) 000000001 ? 000000001 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 0004D2B40 ? 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssnmLocalJoinEven call clssnmDoSyncUpdate( 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]t()+3992 ) FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssnmRcfgMgrThread call clssnmLocalJoinEven 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]()+2290 t() FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]clssscthrdmain()+25 call clssnmRcfgMgrThread 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760]3 () FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.110: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760]start_thread()+209 call clssscthrdmain() 001FB9FA0 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760]clone()+109 call start_thread() 7F9EA6381700 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760]0000000000000000 call clone() 7F9EA6381700 ? 001DC83F0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] FFFFFFFFFFFFFFFF ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 000000001 ? 7F9EA6380D20 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760] 7F9EA63803C0 ?
2025-02-21 18:21:27.111: [ CSSD][2788693760]
2025-02-21 18:21:27.111: [ CSSD][2788693760]--------------------- Binary Stack Dump ---------------------
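On a busy node the ocssd.log is large, and the lines that matter most are the fatal clssscExit entry and whatever the same thread logged just before it. The filter below is a minimal sketch of that triage and was not part of the original diagnosis. cssd_crash_context is a hypothetical helper name, and the sketch assumes the usual line layout seen in this log (timestamp, component tag, thread id in brackets, then the message).

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): print the last N messages that the
# crashing thread logged before the fatal "clssscExit: CSSD signal" line,
# followed by that line itself.
# Usage: cssd_crash_context <ocssd.log> [N]   (N defaults to 3)
cssd_crash_context() {
    awk -v keep="${2:-3}" '
        # Locate the "][<thread id>]" part of "[ CSSD][2788693760]message".
        match($0, /\]\[[0-9]+\]/) {
            tid = substr($0, RSTART + 2, RLENGTH - 3)
            msg = substr($0, RSTART + RLENGTH)
            if (msg ~ /^clssscExit: CSSD signal/) {
                # Replay the buffered lines of the crashing thread, then stop.
                start = count[tid] - keep + 1
                if (start < 1) start = 1
                for (i = start; i <= count[tid]; i++) print hist[tid, i]
                print
                exit
            }
            # Skip the "#####" banner lines; buffer everything else per thread.
            if (msg !~ /^#+$/) hist[tid, ++count[tid]] = $0
        }
    ' "$1"
}
```

Run against the log above with a context of 5, this would show the last messages from thread 2788693760, ending with the two clssnmCheckDskInfo lines, followed by the signal 11 line. The sketch buffers every line in memory for simplicity, so it is best run on a trimmed copy of a very large log.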
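The log here points to a voting disk timeout. I tried starting in nocrs mode, but as long as the voting disk was present the startup still failed. After a workaround that kept the startup process from reading the voting disk, the nocrs-mode start succeeded, and I mounted the other disk groups that hold business data.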
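The post does not list the commands used for the nocrs start. On Grid Infrastructure 11.2.0.2 and later, an exclusive-mode start without CRSD is normally requested with crsctl start crs -excl -nocrs, so a minimal sketch might look like the following. It assumes a root shell with the grid home's crsctl on the PATH; start_nocrs and the CRSCTL variable are hypothetical names introduced here. It does not reproduce the workaround that stopped the startup from reading the voting disk, which the post does not describe.

```shell
#!/bin/sh
# Path to crsctl. Override for a dry run, for example CRSCTL="echo crsctl".
CRSCTL=${CRSCTL:-crsctl}

# Hypothetical wrapper: bring the lower stack up in exclusive mode without CRSD.
start_nocrs() {
    # Stop whatever part of the stack is still running on this node.
    # This may report an error if nothing is running, which is harmless here.
    $CRSCTL stop crs -f
    # Exclusive mode, without starting the CRS daemon.
    $CRSCTL start crs -excl -nocrs
    # Show the state of the lower-stack (init) resources such as ora.cssd and ora.asm.
    $CRSCTL stat res -t -init
}
```

Setting CRSCTL="echo crsctl" before calling start_nocrs prints the commands instead of running them, which is a cheap way to review the sequence before touching a damaged cluster.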
After confirming that the other disks were intact, I rebuilt the crs disk group:
SQL> create diskgroup OCR external redundancy disk '/dev/dm-4' force attribute 'COMPATIBLE.ASM' = '11.2.0';
# ocrconfig -restore /u01/app/11.2.0.3/grid/cdata/scan/backup00.ocr
# crsctl replace votedisk +OCR
SQL> create spfile from pfile='/tmp/pfile.asm';
After that, restarting crs brought the cluster back to normal.
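Once the stack is back up, it is worth confirming that the voting disk and OCR really live in the rebuilt disk group. The checks below are a sketch of one way to do that and are not taken from the original post. They assume a root shell with the grid home's bin directory on the PATH; verify_cluster and the CRSCTL and OCRCHECK variables are hypothetical names introduced here.

```shell
#!/bin/sh
# Paths to the tools. Override for a dry run, for example CRSCTL="echo crsctl".
CRSCTL=${CRSCTL:-crsctl}
OCRCHECK=${OCRCHECK:-ocrcheck}

# Hypothetical wrapper: post-recovery sanity checks.
verify_cluster() {
    # The voting disk should now be listed in the rebuilt disk group.
    $CRSCTL query css votedisk
    # OCR integrity and location.
    $OCRCHECK
    # Health of the CRS, CSS and EVM daemons on this node.
    $CRSCTL check crs
    # Overall resource status across the cluster.
    $CRSCTL stat res -t
}
```

Running verify_cluster on each node in turn gives a quick view of whether every node sees the same voting disk and OCR location.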