标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 kfed MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (102)
- 数据库 (1,670)
- DB2 (22)
- MySQL (73)
- Oracle (1,532)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (21)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (14)
- ORACLE 21C (3)
- Oracle 23ai (7)
- Oracle ASM (65)
- Oracle Bug (8)
- Oracle RAC (52)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (560)
- Oracle安装升级 (91)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (78)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- ORA-600 krse_arc_complete.4
- Oracle 19c 202410补丁(RUs+OJVM)
- ntfs MFT损坏(ntfs文件系统故障)导致oracle异常恢复
- .mkp扩展名oracle数据文件加密恢复
- 清空redo,导致ORA-27048: skgfifi: file header information is invalid
- A_H_README_TO_RECOVER勒索恢复
- 通过alert日志分析客户自行对一个数据库恢复的来龙去脉和点评
- ORA-12514: TNS: 监听进程不能解析在连接描述符中给出的SERVICE_NAME
- ORA-01092 ORA-00604 ORA-01558故障处理
- ORA-65088: database open should be retried
- Oracle 19c异常恢复—ORA-01209/ORA-65088
- ORA-600 16703故障再现
- 数据库启动报ORA-27102 OSD-00026 O/S-Error: (OS 1455)
- .[metro777@cock.li].Elbie勒索病毒加密数据库恢复
- 应用连接错误,初始化mysql数据库恢复
- RAC默认服务配置优先节点
- Oracle 19c RAC 替换私网操作
- 监听报TNS-12541 TNS-12560 TNS-00511错误
- drop tablespace xxx including contents恢复
- Linux 8 修改网卡名称
标签归档:root.sh
11.2.0.4 打patch遭遇extjobO: Operation not permitted处理
有一段时间没有给11.2.0.4打psu补丁了,今天晚上在给一个客户打的过程中发现类似错误
OPatch found the word "error" in the stderr of the make command. Please look at this stderr. You can re-run this make command. Stderr output: chmod: changing permissions of `/u01/app/oracle/product/11.2.0/db_1/bin/extjobO': Operation not permitted make: [iextjob] Error 1 (ignored) Composite patch 31537677 successfully applied. OPatch Session completed with warnings. Log file location: /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2021-04-16_23-11-48PM_1.log OPatch completed with warnings.
对比extjob权限信息
[oracle@localhost trace]$ ls -l /u01/app/oracle/product/11.2.0/db_1/bin/extjob* -rwxr-xr-x 1 oracle oinstall 1254496 Apr 16 23:15 /u01/app/oracle/product/11.2.0/db_1/bin/extjob -rwx------ 1 oracle oinstall 1254496 Apr 16 23:15 /u01/app/oracle/product/11.2.0/db_1/bin/extjobo -rwsr-x--- 1 root oinstall 1254244 Feb 13 2019 /u01/app/oracle/product/11.2.0/db_1/bin/extjobO -rw------- 1 oracle oinstall 1254244 Feb 13 2019 /u01/app/oracle/product/11.2.0/db_1/bin/extjoboO
证明新的extjob程序确实权限不对,不过根据官方的建议可以忽略这个,参考link:
https://updates.oracle.com/Orion/Services/download?type=readme&aru=20775452
Applying Proactive Bundle / PSU Patch fails with Error: “chmod: changing permissions of `$ORACLE_HOME/bin/extjobO’: Operation not permitted” (Doc ID 2265726.1)
给出来的理由是执行root.sh之后权限会变成正常的,另外这个external jobs基本上很少人会使用到,尝试执行root.sh修复
[root@localhost home]# /u01/app/oracle/product/11.2.0/db_1/root.sh Performing root user operation for Oracle 11g The following environment variables are set as: ORACLE_OWNER= oracle ORACLE_HOME= /u01/app/oracle/product/11.2.0/db_1 Enter the full pathname of the local bin directory: [/usr/local/bin]: The contents of "dbhome" have not changed. No need to overwrite. The contents of "oraenv" have not changed. No need to overwrite. The contents of "coraenv" have not changed. No need to overwrite. Entries will be added to the /etc/oratab file as needed by Database Configuration Assistant when a database is created Finished running generic part of root script. Now product-specific root actions will be performed. Finished product-specific root actions. [root@localhost home]# ls -l /u01/app/oracle/product/11.2.0/db_1/bin/extjob* -rwsr-x--- 1 root oinstall 1254496 Apr 16 23:15 /u01/app/oracle/product/11.2.0/db_1/bin/extjob -rwx------ 1 oracle oinstall 1254496 Apr 16 23:15 /u01/app/oracle/product/11.2.0/db_1/bin/extjobo -rwsr-x--- 1 root oinstall 1254244 Feb 13 2019 /u01/app/oracle/product/11.2.0/db_1/bin/extjobO -rw------- 1 oracle oinstall 1254244 Feb 13 2019 /u01/app/oracle/product/11.2.0/db_1/bin/extjoboO
确实通过执行root.sh修复该问题.
多cpu环境中运行root.sh失败,asm报ORA-04031
有朋友和我反馈,说他们在装linux 6.5上面装11.2.0.3的rac出现异常,root.sh在第一个节点执行就失败了,请求帮助
根据上面记录,查看asmca日志
[main] [ 2015-07-24 12:49:35.885 CST ] [SQLEngine.reInitialize:738] Reinitializing SQLEngine... [main] [ 2015-07-24 12:49:35.885 CST ] [OracleHome.getVersion:889] OracleHome.getVersion called. Current Version: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.885 CST ] [OracleHome.getVersion:957] Current Version From Inventory: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.885 CST ] [OracleHome.getVersion:889] OracleHome.getVersion called. Current Version: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.886 CST ] [OracleHome.getVersion:957] Current Version From Inventory: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.886 CST ] [OracleHome.getVersion:889] OracleHome.getVersion called. Current Version: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.886 CST ] [OracleHome.getVersion:957] Current Version From Inventory: 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.886 CST ] [SQLPlusEngine.getCmmdParams:222] m_home 11.2.0.3.0 [main] [ 2015-07-24 12:49:35.887 CST ] [SQLPlusEngine.getCmmdParams:223] version > 112 true [main] [ 2015-07-24 12:49:35.887 CST ] [SQLEngine.getEnvParams:555] Default NLS_LANG: AMERICAN_AMERICA.AL32UTF8 [main] [ 2015-07-24 12:49:35.887 CST ] [SQLEngine.getEnvParams:565] NLS_LANG: AMERICAN_AMERICA.AL32UTF8 [main] [ 2015-07-24 12:49:35.888 CST ] [SQLEngine.initialize:325] Execing SQLPLUS/SVRMGR process... [main] [ 2015-07-24 12:49:35.900 CST ] [SQLEngine.initialize:362] m_bReaderStarted: false [main] [ 2015-07-24 12:49:35.900 CST ] [SQLEngine.initialize:366] Starting Reader Thread... [main] [ 2015-07-24 12:49:35.901 CST ] [SQLEngine.initialize:415] Waiting for m_bReaderStarted to be true [main] [ 2015-07-24 12:49:35.972 CST ] [SQLEngine.done:2189] Done called [main] [ 2015-07-24 12:49:35.972 CST ] [UsmcaLogger.logException:173] SEVERE:method oracle.sysman.assistants.usmca.backend.USMInstance:configureLocalASM [main] [ 2015-07-24 12:49:35.973 CST ] [UsmcaLogger.logException:174] ORA-01012: not logged on [main] [ 2015-07-24 12:49:35.973 CST ] [UsmcaLogger.logException:175] oracle.sysman.assistants.util.sqlEngine.SQLFatalErrorException: ORA-01012: not logged on oracle.sysman.assistants.util.sqlEngine.SQLEngine.executeImpl(SQLEngine.java:1658) oracle.sysman.assistants.util.sqlEngine.SQLEngine.executeQuery(SQLEngine.java:831) oracle.sysman.assistants.usmca.backend.USMInstance.configureLocalASM(USMInstance.java:3036) oracle.sysman.assistants.usmca.service.UsmcaService.configureLocalASM(UsmcaService.java:1049) oracle.sysman.assistants.usmca.model.UsmcaModel.performConfigureLocalASM(UsmcaModel.java:944) oracle.sysman.assistants.usmca.model.UsmcaModel.performOperation(UsmcaModel.java:797) oracle.sysman.assistants.usmca.Usmca.execute(Usmca.java:174) oracle.sysman.assistants.usmca.Usmca.main(Usmca.java:369) [main] [ 2015-07-24 12:49:35.989 CST ] [UsmcaLogger.logException:173] SEVERE:method oracle.sysman.assistants.usmca.backend.USMInstance:configureLocalASM [main] [ 2015-07-24 12:49:35.989 CST ] [UsmcaLogger.logException:174] ORA-03113: end-of-file on communication channel [main] [ 2015-07-24 12:49:35.989 CST ] [UsmcaLogger.logException:175] oracle.sysman.assistants.util.sqlEngine.SQLFatalErrorException: ORA-03113: end-of-file on communication channel
这里可以看出来,asm实例无法登陆(ORA-01012和ORA-03113),根据这样的错误,分析asm日志
Reconfiguration complete Fri Jul 24 12:49:29 2015 LCK0 started with pid=22, OS id=46913 Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lmd0_46887.trc (incident=81): ORA-04031: unable to allocate 7072 bytes of shared memory ("shared pool","unknown object","sga heap(1,1)","ges resource ") Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_81/+ASM1_lmd0_46887_i81.trc Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lck0_46913.trc (incident=177): ORA-04031: unable to allocate 760 bytes of shared memory ("shared pool","unknown object","KKSSP^1343","kglss") Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_177/+ASM1_lck0_46913_i177.trc Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_46885.trc (incident=73): ORA-04031: unable to allocate 632 bytes of shared memory ("shared pool","unknown object","sga heap(1,1)","name-service ") Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_73/+ASM1_lmon_46885_i73.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_lck0_46913.trc: ORA-04031: unable to allocate 760 bytes of shared memory ("shared pool","unknown object","KKSSP^1343","kglss") System state dump requested by (instance=1, osid=46913 (LCK0)), summary=[abnormal instance termination]. System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_diag_46879.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. LCK0 (ospid: 46913): terminating the instance due to error 4031 Fri Jul 24 12:49:35 2015 ORA-1092 : opitsk aborting process Instance terminated by LCK0, pid = 46913
进一步分析asm日志,发现是大家熟悉的asm的ORA-4031问题,那就是说明数据库在执行root.sh的时候使用默认参数文件启动asm的时候shared pool不够大(根据ORACLE最佳实践,建议memory_target=1536M及其以上值),从而出现该问题。类似Bug 14292825 ORA-4031 in ASM as default memory parameters values for 11.2 ASM instances low,根据官方描述该问题在11.2.0.4中修复
通过asm日志发现相关默认值配置
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options. ORACLE_HOME = /u01/app/11.2.0/grid System name: Linux Node name: RAC01 Release: 2.6.32-358.el6.x86_64 Version: #1 SMP Tue Jan 29 11:47:41 EST 2013 Machine: x86_64 Using parameter settings in client-side pfile /u01/app/11.2.0/grid/dbs/init+ASM1.ora on machine RAC01 System parameters with non-default values: large_pool_size = 16M instance_type = "asm" remote_login_passwordfile= "EXCLUSIVE" asm_power_limit = 1 diagnostic_dest = "/u01/app/grid" Cluster communication is configured to use the following interface(s) for this instance 10.10.10.31 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Fri Jul 24 12:49:27 2015
通过查询/proc/cpuinfo,检查cpu数量
processor : 191 vendor_id : GenuineIntel cpu family : 6 model : 62 model name : Intel(R) Xeon(R) CPU E7-8850 v2 @ 2.30GHz stepping : 7 cpu MHz : 1200.000 cache size : 24576 KB physical id : 7 siblings : 24 core id : 13 cpu cores : 12 apicid : 251 initial apicid : 251 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes
而根据How To Determine The Default Number Of Subpools Allocated During Startup (Doc ID 455179.1)中描述
最多7个subpool(这里一共有192个cpu,因此subpool数量为7)
每个suppool最少512m内存,因此shared pool最小需要3.5G(而默认值几百M,远远不够)
由于cpu多,导致shared pool的Subpools 更加多,使得shared pool的需求量更加大。至此本次故障原因可以总结:
由于cpu较多,需要更多的shared pool,而11.2.0.3中由于asm默认内存分配较少,导致在asm启动之时出现shared pool不足(本身默认值小,而且shared pool需求大,从而出现了ORA-04031就不奇怪了),因为运行root.sh过程中asm无法正常启动,从而使得root.sh运行失败。
处理办法:临时disable部分cpu,然后重新执行root.sh,修改asm内存分配,再enable cpu.
特别说明:此故障acs的兄弟遇到过,所以这次我能够快速反应,感谢acs兄弟们的帮忙,另外有权限的朋友可以看看:3-10479952701和3-7976215751等sr描述