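Category archive: Oracle ASM
Setting disk permissions for ASM disk groups
On AIX, an 11gR2 single-instance database was deployed with separate grid and oracle users for GI and the database. When disks were added, their owner and group were set to grid and oinstall and their permissions to 755. The disks were added successfully, and then the database crashed.
Adding the disks in ASM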
SQL> alter diskgroup DATA add disk '/dev/rhdisk15' NOTE: Assigning number (2,7) to disk (/dev/rhdisk15) NOTE: requesting all-instance membership refresh for group=2 NOTE: initializing header on grp 2 disk DATA_0007 NOTE: requesting all-instance disk validation for group=2 Wed Apr 03 22:09:03 2013 NOTE: skipping rediscovery for group 2/0xa026f7ec (DATA) on local instance. NOTE: requesting all-instance disk validation for group=2 NOTE: skipping rediscovery for group 2/0xa026f7ec (DATA) on local instance. NOTE: initiating PST update: grp = 2 Wed Apr 03 22:09:03 2013 GMON updating group 2 at 21 for pid 17, osid 22610284 NOTE: PST update grp = 2 completed successfully NOTE: membership refresh pending for group 2/0xa026f7ec (DATA) GMON querying group 2 at 22 for pid 13, osid 20643916 NOTE: cache opening disk 7 of grp 2: DWDATAGRP_0007 path:/dev/rhdisk15 GMON querying group 2 at 23 for pid 13, osid 20643916 SUCCESS: refreshed membership for 2/0xa026f7ec (DATA) NOTE: starting rebalance of group 2/0xa026f7ec (DATA) at power 1 SUCCESS: alter diskgroup DATA add disk '/dev/rhdisk15' Starting background process ARB0 Wed Apr 03 22:09:07 2013 ARB0 started with pid=22, OS id=14155890 NOTE: assigning ARB0 to group 2/0xa026f7ec (DATA) with 1 parallel I/O NOTE: Attempting voting file refresh on diskgroup DATA Wed Apr 03 22:09:19 2013 SQL> alter diskgroup DATA add disk '/dev/rhdisk11' Wed Apr 03 22:09:20 2013 NOTE: stopping process ARB0 NOTE: rebalance interrupted for group 2/0xa026f7ec (DATA) NOTE: Assigning number (2,8) to disk (/dev/rhdisk11) NOTE: requesting all-instance membership refresh for group=2 NOTE: initializing header on grp 2 disk DATA_0008 NOTE: requesting all-instance disk validation for group=2 NOTE: skipping rediscovery for group 2/0xa026f7ec (DATA) on local instance. NOTE: requesting all-instance disk validation for group=2 NOTE: skipping rediscovery for group 2/0xa026f7ec (DATA) on local instance. 
NOTE: initiating PST update: grp = 2 Wed Apr 03 22:09:23 2013 GMON updating group 2 at 24 for pid 17, osid 22610284 NOTE: PST update grp = 2 completed successfully NOTE: membership refresh pending for group 2/0xa026f7ec (DATA) GMON querying group 2 at 25 for pid 13, osid 20643916 NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/rhdisk11 GMON querying group 2 at 26 for pid 13, osid 20643916 SUCCESS: refreshed membership for 2/0xa026f7ec (DATA) NOTE: starting rebalance of group 2/0xa026f7ec (DATA) at power 1 SUCCESS: alter diskgroup DATA add disk '/dev/rhdisk11' Starting background process ARB0 Wed Apr 03 22:09:26 2013 ARB0 started with pid=22, OS id=22872116 NOTE: assigning ARB0 to group 2/0xa026f7ec (DATA) with 1 parallel I/O NOTE: Attempting voting file refresh on diskgroup DATA Wed Apr 03 22:14:41 2013 NOTE: ASM client xifenfei:xifenfei disconnected unexpectedly. NOTE: check client alert log. NOTE: Trace records dumped in trace file /u01/diag/asm/+asm/+ASM/trace/+ASM_ora_15073468.trc Wed Apr 03 22:16:53 2013 NOTE: client xifenfei:xifenfei registered, osid 20709378, mbr 0x0 Wed Apr 03 22:20:33 2013 NOTE: client xifenfei:xifenfei deregistered
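The log shows that the disks were added normally and rebalancing started, but it also shows client xifenfei disconnecting unexpectedly (in other words, the database crashed).
Database alert log at the time of the crash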
Wed Apr 03 22:00:00 2013 Setting Resource Manager plan SCHEDULER[0x318B]:DEFAULT_MAINTENANCE_PLAN via scheduler window Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter Wed Apr 03 22:00:00 2013 Starting background process VKRM Wed Apr 03 22:00:00 2013 VKRM started with pid=31, OS id=22413426 Wed Apr 03 22:09:06 2013 ORA-15025: could not open disk "/dev/rhdisk15" ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 11 Wed Apr 03 22:09:06 2013 SUCCESS: disk DATA_0007 (7.2092304189) added to diskgroup DATA Wed Apr 03 22:09:26 2013 ORA-15025: could not open disk "/dev/rhdisk15" ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 11 Wed Apr 03 22:09:26 2013 SUCCESS: disk DATA_0008 (8.2092304190) added to diskgroup DATA Wed Apr 03 22:14:40 2013 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_dbw0_17367438.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 1 logical extent 0 of file 261 in group 2 on disk 7 allocation unit 464 KCF: read, write or open error, block=0x6a online=1 file=1 '+DATA/xifenfei/datafile/system.261.788373447' error=15081 txt: '' Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_dbw0_17367438.trc: Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_dbw0_17367438.trc: ORA-63999: data file suffered media failure ORA-01114: IO error writing block to file 1 (block # 106) ORA-01110: data file 1: '+DATA/xifenfei/datafile/system.261.788373447' ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk DBW0 (ospid: 17367438): terminating the instance due to error 63999
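The database crashed because the instance had no permission to open /dev/rhdisk15 (ORA-15025, ORA-27041, Permission denied). The DBW0 process then failed to write to the disk and terminated the instance.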
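When the alert log is long, it can help to pull out just the devices that the instance could not open. The following is a minimal sketch, assuming the alert log has one message per line as Oracle normally writes it; `denied_disks` is a hypothetical helper name, not an Oracle utility.

```shell
#!/bin/sh
# denied_disks: read alert-log text on stdin and print each disk path that
# appears in an ORA-15025 "could not open disk" message, once per path.
# Hypothetical helper for illustration only.
denied_disks() {
    sed -n 's/.*ORA-15025: could not open disk "\([^"]*\)".*/\1/p' | sort -u
}

# Sample input copied from the alert log excerpt in this post.
printf '%s\n' \
    'ORA-15025: could not open disk "/dev/rhdisk15"' \
    'ORA-27041: unable to open file' \
    'IBM AIX RISC System/6000 Error: 13: Permission denied' |
    denied_disks
```

Each path printed is a candidate for an ownership and permission check with `ls -l`.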
Attempting to restart the database (ASM restarted normally)
SQL> startup
ORACLE instance started.
Total System Global Area 1.2827E+10 bytes
Fixed Size                  2233480 bytes
Variable Size            1711278968 bytes
Database Buffers         1.1073E+10 bytes
Redo Buffers               40894464 bytes
Database mounted.
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '+DATA/xifenfei/datafile/system.261.788373447'
The error says that file 1 needs recovery. The alert log shows the following errors:
Completed: ALTER DATABASE MOUNT Wed Apr 03 22:17:02 2013 ALTER DATABASE OPEN Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 3 Additional information: 4 Additional information: 4194306 WARNING: Write Failed. group:2 disk:8 AU:462 offset:16384 size:16384 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 261 in group 2 on disk 8 allocation unit 462 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 3 Additional information: 4 Additional information: 4194306 WARNING: Write Failed. group:2 disk:8 AU:690 offset:16384 size:16384 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 3 Additional information: 4 Additional information: 4194306 WARNING: Write Failed. 
group:2 disk:8 AU:918 offset:16384 size:16384 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 263 in group 2 on disk 8 allocation unit 918 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 262 in group 2 on disk 8 allocation unit 690 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-01110: data file 3: '+DATA/xifenfei/datafile/undotbs1.263.788373475' ORA-01114: IO error writing block to file 3 (block # 1) ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_11534798.trc: ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.262.788373463' ORA-01114: IO error writing block to file 2 (block # 1) ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk
Running recover database
SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01201: file 1 header failed to write correctly
Wed Apr 03 22:18:49 2013 ALTER DATABASE RECOVER database Media Recovery Start started logmerger process Wed Apr 03 22:18:50 2013 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_12714126.trc: ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 3 Additional information: 4 Additional information: 4194306 WARNING: Write Failed. group:2 disk:8 AU:462 offset:16384 size:16384 Errors in file /oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_pr00_12714126.trc: ORA-27041: unable to open file IBM AIX RISC System/6000 Error: 13: Permission denied Additional information: 3 Additional information: 4 Additional information: 4194306
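The messages again show that the database cannot write to the data files because it lacks write permission on the disks. Changing the permissions of the newly added disks to 660 (4 = read, 2 = write, 1 = execute) gives the oracle user, which belongs to the same oinstall group, both read and write access to them.
The incident itself is simple, but as more customers run 11g ASM with separate grid and oracle users, incidents like it are becoming more common. Most people habitually set files to 755. On a system where grid and oracle are installed separately, that causes the database to crash after a disk is added and prevents it from starting again, because the oracle user can read the disk but cannot write to it. A good convention for 11gR2 ASM systems with separate grid and oracle users: set the disk owner to grid:oinstall and the permissions to 660.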
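The difference between 755 and 660 comes down to the group write bit, which is what the oracle user (a member of oinstall) relies on. A minimal sketch that tests this bit for a given octal mode follows; `group_can_write` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# group_can_write MODE
# Succeeds (exit 0) when the octal permission MODE grants write access to the group.
# Hypothetical helper; MODE is given as three octal digits, for example 660.
group_can_write() {
    # The leading 0 makes shell arithmetic read the value as octal;
    # 020 is the group write bit.
    [ $(( 0$1 & 020 )) -ne 0 ]
}

for mode in 755 660; do
    if group_can_write "$mode"; then
        echo "$mode: group can write"
    else
        echo "$mode: group cannot write"
    fi
done
```

On the system in this post, the corresponding fix would be run as root against each newly added device, for example `chown grid:oinstall /dev/rhdisk15` followed by `chmod 660 /dev/rhdisk15`.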
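ASMLib failure reporting ORA-00600 [kfklLibFetchNext00]
A friend's historical database failed. It is a single-instance 10.2.0.1 database with ASM on a Linux 4 platform, and ASM uses ASMLib to manage its disks.
ASM could not mount the disk groups even though the ASM disks were visible, and the alert log reported ORA-00600 [kfklLibFetchNext00].
Operating system kernel: 2.6.9-78
oracleasmlib: 2.0.2-1
ASM disk group mount failure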
--Earlier failure
SQL> ALTER DISKGROUP ALL MOUNT
Thu Sep 6 14:23:16 2012
NOTE: cache registered group DGARC number=1 incarn=0x2bf96274
NOTE: cache registered group DGDATA number=2 incarn=0x2c196275
NOTE: cache registered group DGSYS number=3 incarn=0x2c196276
Thu Sep 6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep 6 14:23:16 2012
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_10204.trc:
ORA-15183: ASMLIB initialization error [/opt/oracle/extapi/64/asm/orcl/1/libasm.so]
ORA-15183: ASMLIB initialization error [driver/agent not installed]
Thu Sep 6 14:23:16 2012
ERROR: no PST quorum in group 1: required 2, found 0
Thu Sep 6 14:23:16 2012
NOTE: cache dismounting group 1/0x2BF96274 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Sep 6 14:23:16 2012
ERROR: no PST quorum in group 2: required 2, found 0
Thu Sep 6 14:23:16 2012
NOTE: cache dismounting group 2/0x2C196275 (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Sep 6 14:23:16 2012
ERROR: no PST quorum in group 3: required 2, found 0
Thu Sep 6 14:23:16 2012
NOTE: cache dismounting group 3/0x2C196276 (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
--Current failure
Thu Jan 24 13:49:45 2013
SQL> ALTER DISKGROUP ALL MOUNT
Thu Jan 24 13:49:45 2013
NOTE: cache registered group DGARC number=1 incarn=0xf388cee9
NOTE: cache registered group DGDATA number=2 incarn=0xf3a8ceea
NOTE: cache registered group DGSYS number=3 incarn=0xf3a8ceeb
Thu Jan 24 13:49:45 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00], [18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
Errors in file /opt/app/oracle/admin/+ASM/bdump/+asm_rbal_13449.trc:
ORA-00600: internal error code, arguments: [kfklLibFetchNext00], [18446744073709551614], [0], [], [], [], [], []
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 1: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 1/0xF388CEE9 (DGARC)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGARC was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 2: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 2/0xF3A8CEEA (DGDATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGDATA was not mounted
Thu Jan 24 13:49:46 2013
ERROR: no PST quorum in group 3: required 2, found 0
Thu Jan 24 13:49:46 2013
NOTE: cache dismounting group 3/0xF3A8CEEB (DGSYS)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DGSYS was not mounted
Shutting down instance: further logons disabled
Trace file information
----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedst()+31 call ksedst1() 000000000 ? 000000001 ? 000000000 ? 000000000 ? 000000000 ? 000000001 ? ksedmp()+610 call ksedst() 000000000 ? 000000001 ? 000000000 ? 000000000 ? 000000000 ? 000000001 ? ksfdmp()+21 call ksedmp() 000000003 ? 000000001 ? 000000000 ? 000000000 ? 000000000 ? 000000001 ? kgerinv()+161 call ksfdmp() 000000003 ? 000000001 ? 000000000 ? 000000000 ? 000000000 ? 000000001 ? kgesinv()+33 call kgerinv() 006469D40 ? 0064E1C58 ? 000000000 ? 000000000 ? 000000001 ? 000000001 ? kgesinw()+166 call kgesinv() 006469D40 ? 0064E1C58 ? 000000000 ? 000000000 ? 000000001 ? 000000001 ? kfklLibScanNext()+2 call kgesinw() 006469D40 ? 000000000 ? 39 000000001 ? 000000000 ? FFFFFFFFFFFFFFFE ? 000000000 ? kfkLibFetchNext()+3 call kfklLibScanNext() 0064DDD70 ? 7FBFFFDCD0 ? 43 000000001 ? 000000000 ? FFFFFFFFFFFFFFFE ? 000000000 ? kfuitrnInit()+524 call kfkLibFetchNext() 006469D40 ? 2A971DFF90 ? 000000001 ? 000000000 ? FFFFFFFFFFFFFFFE ? 000000000 ? kfkLibIterInit()+18 call kfuitrnInit() 006469D40 ? 2A971DFCB0 ? 0 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? kfkLoadAllLibs()+36 call kfkLibIterInit() 000000000 ? 00646C7E0 ? 3 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? kfkDiscoverString() call kfkLoadAllLibs() 000000000 ? 00646C7E0 ? +107 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? Cannot find symbol Cannot find symbol Cannot find symbol kfdDiscoverString() call kfkDiscoverString() 067A53768 ? 00646C7E0 ? +28 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? kfdDiscoverShallow( call kfdDiscoverString() 067A53768 ? 000000000 ? )+315 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? kfgbDriver()+1174 call kfdDiscoverShallow( 000000180 ? 000000000 ? ) 2A971DFF90 ? 000000009 ? 000000009 ? 000000000 ? ksbabs()+564 call kfgbDriver() 7FBFFFE5C0 ? 000000048 ? 000000000 ? 
000000009 ? 000000009 ? 000000000 ? ksbrdp()+727 call ksbabs() 7FBFFFE5C0 ? 000000048 ? 000000000 ? 000000009 ? 000000009 ? 000000000 ? opirip()+616 call ksbrdp() 7FBFFFE5C0 ? 000000048 ? 000000001 ? 06002C770 ? 000000009 ? 000000000 ? opidrv()+582 call opirip() 000000032 ? 000000004 ? 7FBFFFF6C8 ? 06002C770 ? 000000009 ? 000000000 ? sou2o()+114 call opidrv() 000000032 ? 000000004 ? 7FBFFFF6C8 ? 06002C770 ? 000000009 ? 000000000 ? opimai_real()+317 call sou2o() 7FBFFFF6A0 ? 000000032 ? 000000004 ? 7FBFFFF6C8 ? 000000009 ? 000000000 ? main()+116 call opimai_real() 000000003 ? 7FBFFFF730 ? 000000004 ? 7FBFFFF6C8 ? 000000009 ? 000000000 ? <0x3c9fb1c40b> call main() 000000003 ? 7FBFFFF730 ? 000000004 ? 7FBFFFF6C8 ? 000000009 ? 000000000 ? --------------------- Binary Stack Dump ---------------------
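The customer's database is a historical database that is rarely used. Starting ASM in 2012 already raised ORA-15183, and after the machine was rebooted in 2013, starting ASM raised ORA-00600 [kfklLibFetchNext00]. The 2012 errors suggest that the problem is related to ASMLib. A MOS search turned up note 429945.1, whose Call Stack Trace matches this one exactly, so this can be identified as the same problem (strace can be used for a deeper analysis).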
ORA-600: [kfklLibFetchNext00], [18446744073709551614], [0] when mounting diskgroup in ASM
Applies to: Linux OS - Version: 2.0.1-1 and later [Release: RHEL4 and later ] Information in this document applies to any platform. Linux Kernel - Version: 2.0.1 Symptoms 3 RAC db. 2 nodes are up and functioning except for 1 node - ASM did not come back up after the reboot eventhough all disks show available from asmlib's perspective: Changes All that was done with resources were stopped on Node1 and an extra LUN added. A reboot was then performed. Cause The cause of the issue is libasm.o corruption Ran the following to confirm that disks are ok: /dev/oracleasm listdisks /usr/sbin/asmtool -I -l /dev/oracleasm -n /dev/sdg1 -a label /usr/sbin/oracleasm-discover 'ORCL:*' dd if=/dev/sdg1 bs=8192 count=1 | od -c ==> output checked out fine . kfod asm_diskstring='ORCL:*' ==> this failed on Node1 KFOD-00600: file not found; argument [610][kfklLibFetchNext00] even though libasm.o exists You might see the following call stack as well ----- Call Stack Trace ----- kfklLibScanNext kfkLibFetchNext kfuitrnInit kfkLibIterInit kfkLoadAllLibs kfkDiscoverString kfdDiscoverString kfdDiscoverShallow kfgbDriver strace showed Node1-failing ------- stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0777, st_size=19344, ...}) = 0 getdents64(4, /* 0 entries */, 4096) = 0 <<<< close(4) = 0 open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1 ENOENT (No such file or directory) open("/opt/oracle/product/10.2.0/db_1/rdbms/mesg/kfodus.msb", O_RDONLY) = -1 ENOENT (No such file or directory) fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9750d000 write(1, "KFOD-00600: file not found; argu"..., 69) = 69 Node2-working ----- stat("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0 open("/opt/oracle/extapi/64/asm/orcl/1/libasm.so", O_RDONLY) = 4 read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\23\0"..., 832) = 
832 fstat(4, {st_mode=S_IFREG|0755, st_size=19344, ...}) = 0 mmap(NULL, 1066104, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) 0x2a9750d000
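Based on the MOS description, the problem can be pinned down to a faulty ASMLib library (the note calls it libasm.o; the file on disk is /opt/oracle/extapi/64/asm/orcl/1/libasm.so).
Solution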
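The comparison between the local call stack and the one in the MOS note quoted above can also be done mechanically. This is a rough sketch, assuming the ASM-layer frames are the ones whose names start with `kf`; `kf_frames` is a hypothetical helper name, and the sample holds only three lines from the trace in this post.

```shell
#!/bin/sh
# kf_frames: read trace text on stdin and print the distinct function names
# beginning with "kf", in order of first appearance.
# Hypothetical helper for illustration only.
kf_frames() {
    tr -cs 'A-Za-z0-9_' '\n' | grep '^kf' | awk '!seen[$0]++'
}

# Frames listed in the MOS note quoted in this post.
expected='kfklLibScanNext kfkLibFetchNext kfuitrnInit kfkLibIterInit kfkLoadAllLibs kfkDiscoverString kfdDiscoverString kfdDiscoverShallow kfgbDriver'

# A few lines from this system's trace file; a real check would feed the whole file.
sample='kfklLibScanNext()+2 call kgesinw()
kfkLibFetchNext()+3 call kfklLibScanNext()
kfuitrnInit()+524 call kfkLibFetchNext()'

for frame in $(printf '%s\n' "$sample" | kf_frames); do
    case " $expected " in
        *" $frame "*) echo "$frame: listed in the note" ;;
        *) echo "$frame: not listed in the note" ;;
    esac
done
```

A frame reported as not listed would be a reason to read the note more carefully before concluding that the two problems are the same.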
To implement the solution, reinstall the ASMlib RPM >rpm -Uvh oracleasmlib-2.0.0-1 This replaces the /opt/oracle/extapi/64/asm/orcl/1/libasm.so
How to Get the Contents of an Spfile on ASM when ASM/GRID is down
In 11g the ASM spfile is stored inside ASM. Can its contents still be obtained when ASM cannot start? This post uses gpnptool to obtain the spfile information, which also gives a hint about how 11gR2 RAC starts ASM from an spfile that is stored in ASM.
ASM spfile contents
[grid@rac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Fri Dec 21 01:41:31 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create pfile='/tmp/pfile' from spfile;
File created.
SQL> !more /tmp/pfile
+ASM1.__oracle_base='/u01/app/gridbase'#ORACLE_BASE set from in memory value
+ASM2.asm_diskgroups='XIFENFEI'#Manual Mount
+ASM1.asm_diskgroups='XIFENFEI'#Manual Mount
*.asm_diskstring='/dev/oracleasm/disks/*'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/gridbase'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
Stopping the cluster (ASM is down)
[root@rac1 ~]# crsctl stop crs
[root@rac1 ~]# ps -ef|grep pmon
root 8768 6372 0 02:53 pts/1 00:00:00 grep pmon
[root@rac1 ~]# crsctl stat res
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
Getting the ASM disk information with gpnptool
[root@rac1 ~]# gpnptool get -o-
<?xml version="1.0" encoding="UTF-8"?>
<gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="4" ClusterUId="885339054e904f1dbfa646b41d7a0edb" ClusterName="rac-cluster" PALocation="">
<gpnp:Network-Profile>
<gpnp:HostNetwork id="gen" HostName="*">
<gpnp:Network id="net1" IP="192.168.1.0" Adapter="eth0" Use="public"/>
<gpnp:Network id="net2" IP="10.10.1.0" Adapter="eth1" Use="cluster_interconnect"/>
</gpnp:HostNetwork>
</gpnp:Network-Profile>
<orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
--Key information (ASM disk information)
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/oracleasm/disks/*" SPFile="+DATA/rac-cluster/asmparameterfile/registry.253.776955291"/>
<ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<ds:SignedInfo>
<ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
<ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<ds:Reference URI="">
<ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
<ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#">
<InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/>
</ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<ds:DigestValue>T2Q3r+5sER2Rp0VfeqzYh461f2s=</ds:DigestValue>
</ds:Reference>
</ds:SignedInfo>
<ds:SignatureValue>
LwcQEtlsPGfywzdYJrOqiTp4cZNFGB/S9Ts8OCvYOGf/Z8HDT2yN5p2nCuxArUfW+KzaPzPHHihpRVaTcAY31nJ2Rcf2vMqYp4e++shliQXC8mg 1oGxQGifkjZwA4pTTEK5MBmr4FTZnR3VArZjjVfJdsmOMfyH4YeSMU5HPjdA=
</ds:SignatureValue>
</ds:Signature>
</gpnp:GPnP-Profile>
Success.
Error CLSGPNP_NO_DAEMON getting profile.
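Only two attributes of the profile matter for locating the spfile: the disk discovery string and the spfile path on the `orcl:ASM-Profile` element. A minimal sketch for extracting them from saved profile text follows; `asm_profile_attr` is a hypothetical helper name, and the sketch assumes attribute values contain no double quotes.

```shell
#!/bin/sh
# asm_profile_attr NAME: read a GPnP profile (XML text) on stdin and print the
# value of attribute NAME from the <orcl:ASM-Profile .../> element.
# Hypothetical helper for illustration only.
asm_profile_attr() {
    # Join the input into one line so the element is matched even if it wraps.
    { tr '\n' ' '; echo; } |
        sed -n 's/.*<orcl:ASM-Profile [^>]*'"$1"'="\([^"]*\)".*/\1/p'
}

# Two elements copied from the profile shown in this post.
profile='<orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/oracleasm/disks/*" SPFile="+DATA/rac-cluster/asmparameterfile/registry.253.776955291"/>'

printf '%s\n' "$profile" | asm_profile_attr DiscoveryString
printf '%s\n' "$profile" | asm_profile_attr SPFile
```

Anchoring the match on `orcl:ASM-Profile` keeps the CSS profile's own `DiscoveryString` ("+asm") from being returned by mistake.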
Getting the ASM spfile location
Use kfed to read the kfdhdb.sp* and ausize fields from each disk header to find where the ASM spfile is stored
[root@rac1 ~]# ls /dev/oracleasm/disks/
VOL1 VOL2 VOL3 VOL4
[root@rac1 ~]# kfed dev=/dev/oracleasm/disks/VOL1 op=READ | egrep "kfdhdb.sp|ausize"
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile: 22 ; 0x0f4: 0x00000016
kfdhdb.spfflg: 1 ; 0x0f8: 0x00000001
[root@rac1 ~]# kfed dev=/dev/oracleasm/disks/VOL2 op=READ | egrep "kfdhdb.sp|ausize"
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
[root@rac1 ~]# kfed dev=/dev/oracleasm/disks/VOL3 op=READ | egrep "kfdhdb.sp|ausize"
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
[root@rac1 ~]# kfed dev=/dev/oracleasm/disks/VOL4 op=READ | egrep "kfdhdb.sp|ausize"
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
This shows that the ASM spfile is on disk VOL1, starts at AU 22, and occupies one AU (1 MB).
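The AU number and AU size together give the byte offset of the spfile on the disk: 22 × 1048576 = 23068672. The sketch below derives these values from kfed header output; `spfile_location` is a hypothetical helper name, and it assumes, as in this post, that the spfile fits in a single AU.

```shell
#!/bin/sh
# spfile_location: read `kfed ... op=READ` header output on stdin and print
# "<au_size_bytes> <spfile_au> <byte_offset>" when the header records an spfile.
# Returns non-zero when kfdhdb.spfile is 0 or missing.
# Hypothetical helper for illustration only.
spfile_location() {
    awk '
        $1 == "kfdhdb.ausize:" { ausize = $2 }
        $1 == "kfdhdb.spfile:" { spfile = $2 }
        END {
            if (spfile > 0) printf "%d %d %d\n", ausize, spfile, ausize * spfile
            else exit 1
        }'
}

# Header fields of VOL1 as shown in this post.
printf '%s\n' \
    'kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000' \
    'kfdhdb.spfile: 22 ; 0x0f4: 0x00000016' \
    'kfdhdb.spfflg: 1 ; 0x0f8: 0x00000001' |
    spfile_location
```

The first two numbers correspond to the `bs` and `skip` arguments of a `dd` read with `count=1`; a disk whose header reports `kfdhdb.spfile: 0` makes the function return non-zero, so a loop over all disks can skip it.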
Getting the ASM spfile contents
[root@rac1 ~]# dd if=/dev/oracleasm/disks/VOL1 bs=1M skip=22 count=1 > /tmp/spfile
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 1.47474 seconds, 711 kB/s
[root@rac1 ~]# strings /tmp/spfile
+ASM1.__oracle_base='/u01/app/gridbase'#ORACLE_BASE set from in memory value
+ASM2.asm_diskgroups='XIFENFEI'#Manual Mount
+ASM1.asm_diskgroups='XIFENFEI'#Manual Mount
*.asm_diskstring='/dev/oracleasm/disks/*'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/gridbase'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
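Comparing the two outputs shows that the ASM spfile can be obtained by other means even when the ASM instance is not running. This experiment only locates the spfile inside ASM, and leaves it to the reader to work out how 11gR2 RAC starts ASM from an spfile that is stored in ASM. Do not imitate it in a production environment: the gpnptool command carries considerable risk.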