标签云
asm恢复 bbed bootstrap$ dul In Memory kcbzib_kcrsds_1 kccpb_sanity_check_2 MySQL恢复 ORA-00312 ORA-00607 ORA-00704 ORA-00742 ORA-01110 ORA-01555 ORA-01578 ORA-08103 ORA-600 2131 ORA-600 2662 ORA-600 2663 ORA-600 3020 ORA-600 4000 ORA-600 4137 ORA-600 4193 ORA-600 4194 ORA-600 16703 ORA-600 kcbzib_kcrsds_1 ORA-600 KCLCHKBLK_4 ORA-15042 ORA-15196 ORACLE 12C oracle dul ORACLE PATCH Oracle Recovery Tools oracle加密恢复 oracle勒索 oracle勒索恢复 oracle异常恢复 Oracle 恢复 ORACLE恢复 ORACLE数据库恢复 oracle 比特币 OSD-04016 YOUR FILES ARE ENCRYPTED 勒索恢复 比特币加密文章分类
- Others (2)
- 中间件 (2)
- WebLogic (2)
- 操作系统 (103)
- 数据库 (1,702)
- DB2 (22)
- MySQL (74)
- Oracle (1,563)
- Data Guard (52)
- EXADATA (8)
- GoldenGate (24)
- ORA-xxxxx (159)
- ORACLE 12C (72)
- ORACLE 18C (6)
- ORACLE 19C (15)
- ORACLE 21C (3)
- Oracle 23ai (8)
- Oracle ASM (68)
- Oracle Bug (8)
- Oracle RAC (53)
- Oracle 安全 (6)
- Oracle 开发 (28)
- Oracle 监听 (28)
- Oracle备份恢复 (571)
- Oracle安装升级 (94)
- Oracle性能优化 (62)
- 专题索引 (5)
- 勒索恢复 (81)
- PostgreSQL (18)
- PostgreSQL恢复 (6)
- SQL Server (27)
- SQL Server恢复 (8)
- TimesTen (7)
- 达梦数据库 (2)
- 生活娱乐 (2)
- 至理名言 (11)
- 虚拟化 (2)
- VMware (2)
- 软件开发 (37)
- Asp.Net (9)
- JavaScript (12)
- PHP (2)
- 小工具 (20)
-
最近发表
- Oracle各种类型坏块说明和处理
- fio测试io,导致磁盘文件系统损坏故障恢复
- ORA-742 写丢失常见bug记录
- Oracle 19c 202501补丁(RUs+OJVM)-19.26
- 避免 19c 数据库性能问题需要考虑的事项 (Doc ID 3050476.1)
- Bug 21915719 Database hang or may fail to OPEN in 12c IBM AIX or HPUX Itanium – ORA-742, DEADLOCK or ORA-600 [kcrfrgv_nextlwn_scn] ORA-600 [krr_process_read_error_2]
- ORA-600 ktuPopDictI_1恢复
- impdp导入数据丢失sys授权问题分析
- impdp 创建index提示ORA-00942: table or view does not exist
- 数据泵导出 (expdp) 和导入 (impdp)工具性能降低分析参考
- 19c非归档数据库断电导致ORA-00742故障恢复
- Oracle 19c – 手动升级到 Non-CDB Oracle Database 19c 的完整核对清单
- sqlite数据库简单操作
- Oracle 暂定和恢复功能
- .pzpq扩展名勒索恢复
- Oracle read only用户—23ai新特性:只读用户
- 迁移awr快照数据到自定义表空间
- .hmallox加密mariadb/mysql数据库恢复
- 2025年首个故障恢复—ORA-600 kcbzib_kcrsds_1
- 第一例Oracle 21c恢复咨询
分类目录归档:Oracle性能优化
CURSOR_SHARING=SIMILAR引起的悲剧
一个客户反馈说有一系统经常性负载比较高,让我帮忙分析原因
系统负载情况
[oracle@zwq-kfdialdb ~]$ top -c top - 17:11:06 up 78 days, 1:12, 5 users, load average: 124.83, 125.90, 112.13 Tasks: 836 total, 152 running, 684 sleeping, 0 stopped, 0 zombie Cpu(s): 98.1%us, 0.1%sy, 0.0%ni, 1.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 132070908k total, 90494280k used, 41576628k free, 1147384k buffers Swap: 67108856k total, 0k used, 67108856k free, 79109904k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 12179 oracle 15 0 32.5g 142m 139m R 39.0 0.1 6:09.20 oracleahdial (LOCAL=NO) 11530 oracle 16 0 32.5g 469m 466m R 36.1 0.4 11:14.23 oracleahdial (LOCAL=NO) 11816 oracle 15 0 32.5g 467m 463m R 36.1 0.4 6:33.86 oracleahdial (LOCAL=NO) 11577 oracle 15 0 32.5g 480m 477m R 34.7 0.4 7:15.98 oracleahdial (LOCAL=NO) 12136 oracle 16 0 32.5g 455m 452m R 31.9 0.4 9:07.88 oracleahdial (LOCAL=NO) 11237 oracle 16 0 32.5g 997m 992m R 31.2 0.8 20:53.50 oracleahdial (LOCAL=NO) 11427 oracle 16 0 32.5g 137m 135m R 31.2 0.1 11:50.16 oracleahdial (LOCAL=NO) 12051 oracle 16 0 32.5g 459m 456m R 31.2 0.4 6:12.67 oracleahdial (LOCAL=NO) [oracle@zwq-kfdialdb ~]$ vmstat 3 10 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 137 0 0 41566600 1147384 79109888 0 0 0 3 0 0 0 0 99 0 0 146 0 0 41567548 1147384 79109888 0 0 0 47 1058 32190 98 0 2 0 0 150 0 0 41568056 1147384 79109888 0 0 0 39 1081 31543 98 0 2 0 0 12 0 0 41568464 1147384 79109888 0 0 0 23 1056 32111 98 0 2 0 0 62 0 0 41568776 1147384 79109888 0 0 0 11 1067 31474 98 0 2 0 0 108 0 0 41568304 1147384 79109888 0 0 0 39 1059 31193 98 0 2 0 0 140 0 0 41569280 1147384 79109888 0 0 0 48 1063 31171 98 0 2 0 0 140 0 0 41569444 1147384 79109888 0 0 0 40 1075 30508 98 0 2 0 0
通过top和vmstat看出系统现在负载很高,主要都是用户进程导致.
查询等待事件
SQL> select event from v$session where wait_class#<>6; EVENT ---------------------------------------------------------------- cursor: mutex S SQL> / EVENT ---------------------------------------------------------------- cursor: mutex S SQL> / EVENT ---------------------------------------------------------------- cursor: mutex S SQL> / EVENT ---------------------------------------------------------------- cursor: mutex S SQL> SELECT a.*, s.sql_text 2 FROM v$sql s, 3 (SELECT sid, 4 event, 5 wait_class, 6 p1 cursor_hash_value, 7 p2raw Mutex_value, 8 TO_NUMBER (SUBSTR (p2raw, 1, 8), 'xxxxxxxx') hold_mutex_x_sid 9 FROM v$session_wait 10 WHERE event LIKE 'cursor%') a 11 WHERE s.HASH_VALUE = a.cursor_hash_value 12 / no rows selected SQL> select event from v$session where wait_class#<>6; EVENT ---------------------------------------------------------------- SQL*Net message to client
数据库开始的等待事件只有cursor: mutex S,等该等待事件消失后系统负载也恢复正常
再次查看系统负载
[oracle@zwq-kfdialdb ~]$ top -c -i10 top - 17:13:51 up 78 days, 1:15, 6 users, load average: 12.57, 78.21, 96.45 Tasks: 702 total, 2 running, 700 sleeping, 0 stopped, 0 zombie Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 132070908k total, 86477808k used, 45593100k free, 1147500k buffers Swap: 67108856k total, 0k used, 67108856k free, 79116036k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 15045 oracle 16 0 13136 1476 724 R 10.6 0.0 0:00.12 top -c -i10 10010 root 10 -5 0 0 0 D 0.0 0.0 0:00.68 [kondemand/4] 10019 root 10 -5 0 0 0 D 0.0 0.0 1:41.58 [kondemand/13] 10020 root 10 -5 0 0 0 D 0.0 0.0 1:52.28 [kondemand/14] 10021 root 10 -5 0 0 0 R 0.0 0.0 2:01.54 [kondemand/15] 12166 root 24 0 10084 300 216 D 0.0 0.0 0:00.00 /opt/VRTSgab/gablogd [oracle@zwq-kfdialdb ~]$ vmstat 3 10 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 45484264 1147528 79117360 0 0 0 3 0 0 0 0 99 0 0 2 0 0 45468452 1147528 79117456 0 0 4 161 1372 5369 2 2 97 0 0 3 0 0 45463712 1147528 79117584 0 0 4 187 1602 7253 6 0 93 0 0 1 0 0 45458220 1147528 79117648 0 0 1 99 1358 5821 2 0 98 0 0 0 0 0 45475168 1147528 79117712 0 0 0 41 1321 5321 2 0 98 0 0 0 0 0 45473624 1147528 79117744 0 0 3 104 1378 5455 2 0 98 0 0 2 0 0 45474656 1147528 79117776 0 0 0 55 1196 4872 1 0 99 0 0 0 0 0 45474376 1147532 79117824 0 0 8 113 1170 4990 2 0 98 0 0 1 0 0 45475440 1147532 79117872 0 0 1 56 1187 5301 3 0 97 0 0 1 0 0 45475824 1147532 79117888 0 0 3 99 1083 4643 3 0 97 0 0
结合上面的等待事件查询,我们可以大概评估出来,当cursor: mutex S等待消失后,系统负载也恢复正常,现在已经不存在环境,如果要找出问题只能够是借助AWR和ASH
分析ASH
Top User Events
Top SQL with Top Events
Activity Over Time
通过对ASH分析,可以大概确定,在这段时间内,引起系统负载高主要是cursor: mutex S导致
分析AWR
awr整体信息(从这里看数据库是相当的繁忙)
Load Profile(从这里看数据库业务比较小)
Top 5 Timed Foreground Events(主要等待事件cursor: mutex S,和前面分析相符)
OS LOAD(虽然和系统看到有一定出入,但是整体还是展示系统负载较高)
SQL ordered by Version Count(出现cursor: mutex S,因为load profile中解析不多,所以想到高版本问题,这里确实非常高)
通过这里的一些列分析,我们已经基本上可以确定,该数据库因为高版本问题导致cursor: mutex S以及library cache 相关等待严重,从而出现系统负载过高.
找出高版本原因
高版本相关信息和查询请见:关于High Versions Count总结
SQL> select * from table(version_rpt('f8b9tba7sfsb5')); COLUMN_VALUE -------------------------------------------------------------------------------- Version Count Report Version 3.2.1 -- Today's Date 19-jul-12 18:13 RDBMS Version :11.2.0.1.0 Host: zwq-kfdialdb Instance 1 : ahdial ================================================================== Addr: 000000080FA4CEA0 Hash_Value: 2408014181 SQL_ID f8b9tba7sfsb5 Sharable_Mem: 206315729 bytes Parses: 48689 Stmt: 0 select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales 1 0_ where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and tblsps 2 ales0_.MOBILE_TELE_NO=:"SYS_B_1" 3 COLUMN_VALUE -------------------------------------------------------------------------------- Versions Summary ---------------- AUTH_CHECK_MISMATCH :4 TRANSLATION_MISMATCH :4 ROLL_INVALID_MISMATCH :10219 PURGED_CURSOR :9 Total Versions:10219 Plan Hash Value Summary ----------------------- COLUMN_VALUE -------------------------------------------------------------------------------- Plan Hash Value Count =============== ===== 791727930 920 2820478500 9300 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for AUTH_CHECK_MISMATCH : # of Ver PARSING_USER_ID PARSING_SCHEMA_ID PARSING_SCHEMA_NAME ========== =============== ================= =================== 10218 75 75 HOLLYSP COLUMN_VALUE -------------------------------------------------------------------------------- 2 107 107 HOLLYSP_TEST ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for TRANSLATION_MISMATCH : Summary of objects probably causing TRANSLATION_MISMATCH Object# Owner.Object_Name ========= ================= 76737 HOLLYSP.TBL_SP_SALES_RECORDS HOLLYSP.TBL_SP_SALES_RECORDS HOLLYSP_TEST.TBL_SP_SALES_RECORDS COLUMN_VALUE -------------------------------------------------------------------------------- 107043 HOLLYSP_TEST.TBL_SP_SALES_RECORDS ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for ROLL_INVALID_MISMATCH : No details available ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for PURGED_CURSOR : No details available #### To further debug Ask Oracle Support for the appropiate level LLL. COLUMN_VALUE -------------------------------------------------------------------------------- alter session set events 'immediate trace name cursortrace address 2408014181, level LLL'; To turn it off do use address 1, level 2147483648 ================================================================ 59 rows selected. SQL> select * from table(version_rpt('6zhjf3qh8gyp9')); COLUMN_VALUE -------------------------------------------------------------------------------- Version Count Report Version 3.2.1 -- Today's Date 19-jul-12 18:15 RDBMS Version :11.2.0.1.0 Host: zwq-kfdialdb Instance 1 : ahdial ================================================================== Addr: 00000007FD46D2A8 Hash_Value: 2693266089 SQL_ID 6zhjf3qh8gyp9 Sharable_Mem: 111904227 bytes Parses: 4880 Stmt: 0 select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales 1 0_ where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and tblsps 2 ales0_.CAMPAIGN_ID=:"SYS_B_1" and tblspsales0_.IS_CONN=:"SYS_B_2 3 " and tblspsales0_.SALES_TIME>=:"SYS_B_3" and tblspsales0_.SALES 4 _PERSON=:"SYS_B_4" COLUMN_VALUE -------------------------------------------------------------------------------- 5 Versions Summary ---------------- BIND_MISMATCH :4804 INCOMP_LTRL_MISMATCH :372 HASH_MATCH_FAILED :4936 Total Versions:4935 Plan Hash Value Summary COLUMN_VALUE -------------------------------------------------------------------------------- ----------------------- Plan Hash Value Count =============== ===== 1645985080 3 2040125427 4933 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for BIND_MISMATCH : Consolidated details for : BIND_MISMATCH,USER_BIND_PEEK_MISMATCH,BIND_UACS_DIFF and COLUMN_VALUE -------------------------------------------------------------------------------- BIND_EQUIV_FAILURE (Mislabled as ROW_LEVEL_SEC_MISMATCH BY bug 6964441 in 11gR1) from v$sql_bind_capture COUNT(*) POSITION MIN(MAX_LENGTH) MAX(MAX_LENGTH) DATATYPE (PRECISION,SCALE) ======== ======== =============== =============== ======== ================ 4936 1 32 32 1 (,) 4936 2 32 32 1 (,) 4936 3 32 32 1 (,) 4936 4 32 32 1 (,) 4936 5 32 32 1 (,) COLUMN_VALUE -------------------------------------------------------------------------------- SUM(DECODE(column,Y, 1, 0) FROM V$SQL IS_OBSOLETE IS_BIND_SENSITIVE IS_BIND_AWARE IS_SHAREABLE =========== ================= ============= ============ 0 4309 0 4309 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for INCOMP_LTRL_MISMATCH : No details available ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Details for HASH_MATCH_FAILED : COLUMN_VALUE -------------------------------------------------------------------------------- No details available #### To further debug Ask Oracle Support for the appropiate level LLL. alter session set events 'immediate trace name cursortrace address 2693266089, level LLL'; To turn it off do use address 1, level 2147483648 ================================================================ 62 rows selected.
根据经验,出现这么多的高版本情况,很可能是cursor_sharing参数设置问题
--对应sql语句 select count(*) as col_0_0_ from TBL_SP_SALES_RECORDS tblspsales0_ where tblspsales0_.SALES_RECORDS_STATUS=:"SYS_B_0" and tblspsales0_.CAMPAIGN_ID=:"SYS_B_1" and tblspsales0_.IS_CONN=:"SYS_B_2" and tblspsales0_.SALES_TIME>=:"SYS_B_3" and tblspsales0_.SALES_PERSON=:"SYS_B_4" --cursor_sharing参数 SQL> show parameter cursor_sharing; NAME TYPE VALUE ------------------------------------ -------------------------------- ---------------- cursor_sharing string SIMILAR
根据oracle官方建议在11g中不推荐使用cursor_sharing=SIMILAR,其实在所有版本中都不推荐,设置为该值很容易导致高版本问题.而且该值会出现莫名其妙的,无法解释的高版本问题.而且根据oracle相关文档,在即将发布的12c版本中,将除掉SIMILAR值.对于客户库的该问题,因为很多sql未绑定参数,为了减少硬解析,建议在业务低谷时设置cursor_sharing=FORCE,并刷新shared pool.
发表在 Oracle性能优化
2 条评论
未收集统计信息对象—执行sql动态采样
在一次ORA-7445导致oracle数据库down掉故障分析中,发现一条类似的sql非常大(通过复制到文档确定该sql大小是5M左右)
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("DCREDITMSG_00") FULL("DCREDITMSG_00") NO_PARALLEL_INDEX("DCREDITMSG_00") */ 1 AS C1, CASE WHEN "DCREDITMSG_00"."PHONE_NO"='具体电话号码' OR "DCREDITMSG_00"."PHONE_NO"='具体电话号码' OR ……………………N多OR "DCREDITMSG_00"."PHONE_NO"='具体电话号码' "DCREDITMSG_00"."PHONE_NO"='具体电话号码' THEN 1 ELSE 0 END AS C2 FROM "BSSADMIN"."DCREDITMSG_00" SAMPLE BLOCK (0.032410 , 1) SEED (1) "DCREDITMSG_00") SAMPLESUB
当时该sql因某种原因导致大量的sql area中很多内存泄露,最终导致数据库down掉.通过实验找出类此奇怪SQL.
创建模拟表
SQL> create table t_xifenfei 2 as 3 select * from dba_objects; Table created. SQL> select count(*) from t_xifenfei; COUNT(*) ---------- 74605 SQL> select NUM_ROWS,LAST_ANALYZED from dba_tables 2 where table_name='T_XIFENFEI' and owner='CHF'; NUM_ROWS LAST_ANALYZE ---------- ------------
得出信息:
1.该表一共有记录数74605条
2.该表未收集统计信息
查看执行计划
SQL> set autotrace trace exp SQL> select /*+ dynamic_sampling(t 0) */ * from t_xifenfei t; Execution Plan ---------------------------------------------------------- Plan hash value: 548923532 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 88868 | 17M| 299 (2)| 00:00:04 | | 1 | TABLE ACCESS FULL| T_XIFENFEI | 88868 | 17M| 299 (2)| 00:00:04 | -------------------------------------------------------------------------------- --通过hint指定动态采样sql相关对象统计信息, 可以看到我们实际的表记录是74605而数据库采样出来的记录为88868,原则上还是可以接受 SQL> select * from t_xifenfei; Execution Plan ---------------------------------------------------------- Plan hash value: 548923532 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 73449 | 14M| 298 (1)| 00:00:04 | | 1 | TABLE ACCESS FULL| T_XIFENFEI | 73449 | 14M| 298 (1)| 00:00:04 | -------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) --因为没有统计信息,数据库动态采样sql相关对象统计信息 可以看到我们实际的表记录是74605而数据库采样出来的记录为73449,比手工指定采样准确
对自动采样进行10046跟踪
SQL> conn / as sysdba Connected. SQL> oradebug setmypid Statement processed. SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 Statement processed. SQL> select count(*) from CHF.t_xifenfei; COUNT(*) ---------- 74605 SQL> oradebug TRACEFILE_NAME /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_27967.trc
分析trace文件
*** 2012-07-12 15:42:34.991 WAIT #0:nam='SQL*Net message from client'ela= 56716427 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1342078954991525 ===================== PARSING IN CURSOR #3063864268 len=404 dep=1 uid=0 oct=3 lid=0 tim=1342078955037387 hv=4184780033 ad='385d3708' sqlid='3gjvvxzwqxb81' SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ NO_PARALLEL("T_XIFENFEI") FULL("T_XIFENFEI") NO_PARALLEL_INDEX("T_XIFENFEI") */ 1 AS C1, 1 AS C2 FROM "CHF"."T_XIFENFEI" SAMPLE BLOCK (5.790441 , 1) SEED (1) "T_XIFENFEI") SAMPLESUB END OF STMT PARSE #3063864268:c=6000,e=5404,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=205916192,tim=1342078955037303 EXEC #3063864268:c=0,e=206,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=205916192,tim=1342078955037901 FETCH #3063864268:c=4998,e=4759,p=0,cr=65,cu=0,mis=0,r=1,dep=1,og=1,plh=205916192,tim=1342078955042730 STAT #3063864268 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=65 pr=0 pw=0 time=4795 us)' STAT #3063864268 id=2 cnt=4253 pid=1 pos=1 obj=76370 op='TABLE ACCESS SAMPLE T_XIFENFEI (cr=65 pr=0 pw=0 time=8247 us cost=19 size=61752 card=5146)' CLOSE #3063864268:c=0,e=7,dep=1,type=0,tim=1342078955043024 ===================== PARSING IN CURSOR #3063864784 len=35 dep=0 uid=0 oct=3 lid=0 tim=1342078955043465 hv=2174183953 ad='3ed2d700' sqlid='fadutqq0tfuhj' select count(*) from CHF.t_xifenfei END OF STMT PARSE #3063864784:c=51991,e=51648,p=0,cr=66,cu=0,mis=1,r=0,dep=0,og=1,plh=2715729601,tim=1342078955043464 EXEC #3063864784:c=0,e=75,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2715729601,tim=1342078955043645 WAIT #3063864784: nam='SQL*Net message to client' ela= 3 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1342078955043705
通过这个trace的分析,果然发现在执行我们需要的sql前,执行了SELECT /* OPT_DYN_SAMP */组成的一个复杂的采样sql语句.
收集统计信息查看执行计划
SQL> EXEC DBMS_STATS.gather_table_stats('CHF','T_XIFENFEI'); PL/SQL procedure successfully completed. SQL> select NUM_ROWS,LAST_ANALYZED from dba_tables 2 where table_name='T_XIFENFEI' and owner='CHF'; NUM_ROWS LAST_ANALYZE ---------- ------------ 74605 12-JUL-12 SQL> set autotrace trace exp SQL> select * from t_xifenfei; Execution Plan ---------------------------------------------------------- Plan hash value: 548923532 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 74605 | 7139K| 298 (1)| 00:00:04 | | 1 | TABLE ACCESS FULL| T_XIFENFEI | 74605 | 7139K| 298 (1)| 00:00:04 | -------------------------------------------------------------------------------- --执行计划未提示dynamic sampling
继续做10046
SQL> conn / as sysdba Connected. SQL> oradebug setmypid Statement processed. SQL> oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 Statement processed. SQL> select count(*) from CHF.t_xifenfei; COUNT(*) ---------- 74605 SQL> oradebug TRACEFILE_NAME /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_29780.trc
分析trace文件
*** 2012-07-12 16:14:53.914 Oradebug command 'EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12' console output: <none> WAIT #0: nam='SQL*Net message to client' ela= 8 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1342080893914307 *** 2012-07-12 16:14:59.376 WAIT #0: nam='SQL*Net message from client' ela= 5461608 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1342080899376008 ===================== PARSING IN CURSOR #3063709248 len=35 dep=0 uid=0 oct=3 lid=0 tim=1342080899379562 hv=2174183953 ad='3ed2d700' sqlid='fadutqq0tfuhj' select count(*) from CHF.t_xifenfei END OF STMT
通过这里可以发现,当有了统计信息后,数据库不能再使用那条N多hint的sql去动态采样统计信息.
总结建议
动态采样(Dynamic Sampling)技术的最初提出是在Oracle 9i R2,在段(表,索引,分区)没有分析的情况下,为了使CBO 优化器得到足够的信息以保证做出正确的执行计划而发明的一种技术,可以把它看做分析手段的一种补充。当段对象没有统计信息时(即没有做分析),动态采样技术可以通过直接从需要分析的对象上收集数据块(采样)来获得CBO需要的统计信息。为了cbo,oracle引进了该功能,原则上说是一个很不错的东西,但是偶尔也是会出现一些意外,所以如果发现数据库中有表未做统计分析,建议手工处理下,ORACLE的自动收集统计信息程序也有不靠谱的时候(发现多次10g的库中有部分表未收集统计信息)
发表在 Oracle性能优化
评论关闭
11g DirectPath Reads 噩梦案例
DirectPath Reads 说明
在oracle 11g以前的版本中,如果对大表进行全表扫描,wait event是:db file scattered read;在11g中,如果对大表进行全表扫描,wait event是:direct path read。在11g中,大表全表扫描时数据块不经过sga而直接进pga,这样会造成每次进行大表全表扫描,物理读都是很大,而在10g中,由于全表扫描的数据块在sga中已经存在,所以执行全表扫描时,它的物理读为0。但是这里主要是oracle在优化策略上的进步,即假定大表频繁全表扫描这种现象,在生产库上不会太多,通过把数据直接读入pga,进而减少了cache buffer的繁忙交换程度,提高了cache buffer的使用效率.
DirectPath Reads 优势
1. 减少了对栓的使用,避免可能的栓争用
2. 物理IO的大小不再取决于buffer_cache中所存在的块;试想某个8个块的extent中1,3,5,7号块在高速缓存中,而2,4,6,8块没有被缓存,传统的方式在读取该extent时将会是对2,4,6,8块进行4次db file sequential read,这是一种十分可怕的状况,其效率往往要比单次读取这个区间的所有8个块还要低得多,虽然Oracle为了避免这种情况总是尽可能的不缓存大表的块(读入后总是放在队列最冷的一端);而direct path read则可以完全避免这类问题,尽可能地单次读入更多的物理块。
DirectPath Reads 噩梦
这一切听起来都很美好,但是在大并发的OLTP系统中,这东西简直是一个噩梦.通过一个awr来说明该问题:这个是一个系统的awr报告,朋友反馈说系统有段运行缓慢,请求帮忙找出原因
分析总体信息
系统这段时间会话临时大幅度增加(从102增加到223),系统出现异常繁忙(60.62*16=969.92<2,454.52)
分析Load Profile信息
通过这个截图发现系统的业务不是很大,但是Physical reads参数异常
1.物理读大小:25071.1*8192/1024/1024=195.86796875M/S
2.物理读将近逻辑读一半,这个在一般系统中很难得到这个比例,进一步说明物理读过高
分析Top 5信息
这里可以发现direct path read等待很多
分析Host CPU
可以发现iowait很大占40.5%,io等待异常高(195M/S能不高吗?)
补充说明:在这里我们看到的%Idle=1-%System-%User不包括%WIO
处理建议
通过上面的评估,可以确定大部分是由于 导致了数据库的物理读过高,从而使得系统反应变慢,处理方法就是关闭掉11g该新特性
alter system set event= ’10949 trace name context forever, level 1′ scope=spfile;
重启数据库
发表在 Oracle性能优化
评论关闭