[转帖]NEST LOOP改为HASH JOIN 性能提高6倍_MySQL, Oracle及数据库讨论区_Weblogic技术|Tuxedo技术|中间件技术|Oracle论坛|JAVA论坛|Linux/Unix技术|hadoop论坛_联动北方技术论坛  
网站首页 | 关于我们 | 服务中心 | 经验交流 | 公司荣誉 | 成功案例 | 合作伙伴 | 联系我们 |
联动北方-国内领先的云技术服务提供商
»  游客             当前位置:  论坛首页 »  自由讨论区 »  MySQL, Oracle及数据库讨论区 »
总帖数
1
每页帖数
101/1页1
返回列表
0
发起投票  发起投票 发新帖子
查看: 5310 | 回复: 0   主题: [转帖]NEST LOOP改为HASH JOIN 性能提高6倍        下一篇 
derek
注册用户
等级:中校
经验:1550
发帖:209
精华:0
注册:2011-7-21
状态:离线
发送短消息息给derek 加好友    发送短消息息给derek 发消息
发表于: IP:您无权察看 2011-9-2 14:23:33 | [全部帖] [楼主帖] 楼主

NEST LOOP改为 HASH JOIN 性能提高6倍

Kevin Zou

2011-8-31



在看系统的AWR report时,发现有个 SQL语句出现频率很高,而且都是在消耗资源的top 5内。决定把这个SQL拉出来看看。


523,370           1     523,370.0    2.7   253.31   2576.45 bcy886xsqp4u1
Module: sqr@phcmdb (TNS V1-V3)
SELECT b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATU
S, b.TL_PYBL_REASON_CD from ps_job a, ps_tl_payable_time b where a.emplid = b.e
mplid and a.empl_rcd = b.empl_rcd and a.effdt = (select max(a_ed.effdt) from ps_
job a_ed where a.emplid = a_ed.emplid and a.empl_rcd = a_ed.empl_rcd and a_ed.ef


通过V$SQL找到完整的SQL语句:

SELECT  b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 

b.TL_PYBL_REASON_CD  from  ps_job a,  ps_tl_payable_time b where a.emplid = b.emplid 

and a.empl_rcd = b.empl_rcd 

and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 

and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 

AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 

AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'

 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  

 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC



 ps_job表有130W的数据;

ps_tl_payable_time 有430W的数据;



查看其执行计划:

explain plan for 

SELECT  b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 

b.TL_PYBL_REASON_CD  from  ps_job a,  ps_tl_payable_time b where a.emplid = b.emplid 

and a.empl_rcd = b.empl_rcd 

and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 

and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 

AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 

AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'

 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  

 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC



execute plan:

SYS@hr9prd>select * from  table(dbms_xplan.display);



PLAN_TABLE_OUTPUT

Plan hash value: 1194964640



-----------------------------------------------------------------------------------------------------

Id   Operation                       Name               Rows   Bytes Cost (%CPU) Time    

-----------------------------------------------------------------------------------------------------

  0 SELECT STATEMENT                                       1    73  1369K  (1) 03:33:53

  1  SORT ORDER BY                                             1    73  1369K  (1) 03:33:53

*  2   FILTER                                                                                      

  3    NESTED LOOPS                                         108  7884   157K  (1) 00:24:34

*  4     TABLE ACCESS FULL           PS_JOB             40296  1180K 17164   (1) 00:02:41

*  5     TABLE ACCESS BY INDEX ROWID PS_TL_PAYABLE_TIME     1    43     4   (0) 00:00:01

*  6      INDEX RANGE SCAN           PS_TL_PAYABLE_TIME     1           3   (0) 00:00:01

  7       SORT AGGREGATE                                   1    19                      

*  8        INDEX RANGE SCAN         PSAJOB                 1    19     3   (0) 00:00:01

  9    SORT AGGREGATE                                       1    22                      

* 10     INDEX RANGE SCAN           PSAJOB                 1    22     3   (0) 00:00:01

-----------------------------------------------------------------------------------------------------



Predicate Information (identified by operation id):

---------------------------------------------------



   2 - filter("A"."EFFSEQ"= (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFSEQ"))) FROM

              "SYSADM"."PS_JOB" "A_ES" WHERE SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B1) AND

              "A_ES"."EMPL_RCD"=:B2 AND "A_ES"."EMPLID"=:B3 AND

              SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B4))

   4 - filter("A"."PAYGROUP"='2DS' AND "A"."COMPANY"='SPT')

   5 - filter("B"."TRC"='2OT10' OR "B"."TRC"='2OT15' OR "B"."TRC"='2OT20' OR

              "B"."TRC"='2OT30')

   6 - access("A"."EMPLID"="B"."EMPLID" AND "A"."EMPL_RCD"="B"."EMPL_RCD" AND

              SYS_OP_DESCEND("DUR")>=HEXTORAW('8790F7F0FEF8FEFAFF')  AND

              SYS_OP_DESCEND("DUR")<=HEXTORAW('8790FEF8FEF8FEFAFF') )

       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("DUR"))>=TO_DATE(' 2011-01-01 00:00:00',

              'syyyy-mm-dd hh24:mi:ss') AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("DUR"))<=TO_DATE(' 2011-08-15

              00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "A"."EFFDT"= (SELECT

              MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))) FROM "SYSADM"."PS_JOB" "A_ED" WHERE

              SYS_OP_DESCEND("EFFDT") IS NOT NULL AND SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B1) AND

              "A_ED"."EMPL_RCD"=:B2 AND "A_ED"."EMPLID"=:B3 AND

              SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B4))

   8 - access("A_ED"."EMPLID"=:B1 AND "A_ED"."EMPL_RCD"=:B2 AND

              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B3) AND SYS_OP_DESCEND("EFFDT") IS NOT NULL)

       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B1)

  10 - access("A_ES"."EMPLID"=:B1 AND "A_ES"."EMPL_RCD"=:B2 AND

              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B3))

       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B1)



44 rows selected.



看到两个大表在做NEST LOOP.这明显是不合理的。



查看其消耗的资源:

SYS@hr9prd>select  RUNTIME_MEM , FETCHES ,  EXECUTIONS , DISK_READS , BUFFER_GETS  ,USER_IO_WAIT_TIME ,ROWS_PROCESSED 

from v$sqlarea

where sql_id ='bcy886xsqp4u1'   2    3  ;


RUNTIME_MEM       FETCHES EXECUTIONS DISK_READS   BUFFER_GETS    USER_IO_WAIT_TIME ROWS_PROCESSED
----------- ----------   ---------- - ---------           -----------       -----------------    --------------
26448           132435          1                  599315          20765595            2611266235        1324338


USER_IO_WAIT_TIME为2611266235microseconds = 2611 seconds. 而在AWR REPORT中SQL elapsed time为 2576.45 (这里USER_IO_WAIT_TIME >elapsed_time ,可能是两者的统计有出入导致,在这里不是重点)。说明SQL的等待时间全部都是发在IO 等待上。

加入HINT,改写SQL

SELECT b.EMPLID, b.DUR, b.TRC, b.TL_QUANTITY, b.ACTUAL_PUB_DATE, b.PAYABLE_STATUS, 

b.TL_PYBL_REASON_CD  from  ps_tl_payable_time b ,ps_job a  where a.emplid = b.emplid 

and a.empl_rcd = b.empl_rcd 

and a.effdt = (select max(a_ed.effdt) from sysadm.ps_job a_ed where a.emplid = a_ed.emplid 

and a.empl_rcd = a_ed.empl_rcd and a_ed.effdt <= B.DUR) 

AND a.EFFSEQ =(SELECT MAX(a_es.EFFSEQ) FROM sysadm.ps_job a_es WHERE a.EMPLID=a_es.EMPLID 

AND a.EMPL_RCD=a_es.EMPL_RCD AND a_es.EFFDT=a.EFFDT) AND A.COMPANY = '&py'

 AND B.DUR BETWEEN '01-JAN-2011' AND '15-AUG-2011' AND A.PAYGROUP IN('2DS')  

 AND B.TRC IN ('2OT10','2OT15','2OT20','2OT30')   Order by b.EMPLID,b.DUR,b.TRC 



Elapsed: 00:06:14.59



Execution Plan

----------------------------------------------------------

Plan hash value: 2212746762



Id   Operation             Name               Rows   Bytes TempSpc Cost (%CPU) Time    

  0 SELECT STATEMENT                               1    73        1470K  (1) 03:49:40

  1  SORT ORDER BY                                 1    73    34M  1470K  (1) 03:49:40

*  2   FILTER                                                                      

*  3    HASH JOIN                                 393K    27M    94M   291K  (1) 00:45:35

*  4     TABLE ACCESS FULL PS_TL_PAYABLE_TIME  1798K    73M         272K  (1) 00:42:31

*  5     TABLE ACCESS FULL PS_JOB             40296  1180K       17164   (1) 00:02:41

  6    SORT AGGREGATE                             1    19                        

*  7     INDEX RANGE SCAN   PSAJOB                 1    19       3   (0) 00:00:01

  8      SORT AGGREGATE                           1    22                        

*  9       INDEX RANGE SCAN PSAJOB                 1    22       3   (0) 00:00:01



Predicate Information (identified by operation id):

---------------------------------------------------



   2 - filter("A"."EFFDT"= (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))) FROM



              "SYSADM"."PS_JOB" "A_ED" WHERE SYS_OP_DESCEND("EFFDT") IS NOT NULL AND



              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B1) AND "A_ED"."EMPL_RCD"=:B2 AND



              "A_ED"."EMPLID"=:B3 AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B4) AND "A"."EFFSEQ"=



              (SELECT MAX(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFSEQ"))) FROM "SYSADM"."PS_JOB" "A_ES" WHERE



              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B5) AND "A_ES"."EMPL_RCD"=:B6 AND



              "A_ES"."EMPLID"=:B7 AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B8))



   3 - access("A"."EMPLID"="B"."EMPLID" AND "A"."EMPL_RCD"="B"."EMPL_RCD")

   4 - filter("B"."DUR">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND



              ("B"."TRC"='2OT10' OR "B"."TRC"='2OT15' OR "B"."TRC"='2OT20' OR "B"."TRC"='2OT30') AND



              "B"."DUR"<=TO_DATE(' 2011-08-15 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))



   5 - filter("A"."PAYGROUP"='2DS' AND "A"."COMPANY"='SPT')

   7 - access("A_ED"."EMPLID"=:B1 AND "A_ED"."EMPL_RCD"=:B2 AND

              SYS_OP_DESCEND("EFFDT")>=SYS_OP_DESCEND(:B3) AND SYS_OP_DESCEND("EFFDT") IS NOT NULL)



       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))<=:B1)

   9 - access("A_ES"."EMPLID"=:B1 AND "A_ES"."EMPL_RCD"=:B2 AND

              SYS_OP_DESCEND("EFFDT")=SYS_OP_DESCEND(:B3))

       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("EFFDT"))=:B1)



Statistics

----------------------------------------------------------

        178  recursive calls

          3  db block gets

    7622155  consistent gets

    1572401  physical reads

          0  redo size

   36340125  bytes sent via SQL*Net to client

     971589  bytes received via SQL*Net from client

      88284  SQL*Net roundtrips to/from client

          0  sorts (memory)

          1  sorts (disk)

    1324243  rows processed



整个SQL用时6分15秒就执行完毕。

对比RUNTIME STATS:

                          前          后     比较

physical reads      599315      1572401      增加了973086

consistent gets     20765595    7622155     减少了13143440

Elaped   Time       2576.45      375        仅为之前的15%



如果看到两个大表直接做联合,如果优化器选择了NEST LOOP 作为ACCESS PATH,执行效率很差,要考虑改为HASH JOIN 以提高性能。


-THE END-




赞(0)    操作        顶端 
总帖数
1
每页帖数
101/1页1
返回列表
发新帖子
请输入验证码: 点击刷新验证码
您需要登录后才可以回帖 登录 | 注册
技术讨论