一次SQL優化經歷分享
2016/3/22 20:37:46??????點擊:
這個禮拜開發工作量不是很大,就對系統內一個運行很慢的SQL語句進行優化,該SQL是ERP系統內一個報警Alert調用,發送相關的數據給用戶進行報警提示。前提先說明下,我不是DBA,平時做的是EBS的開發,空的時候搗鼓下系統里運行緩慢的一些SQL。雖然我的方法比較土也比較死,但優化過程中還是遇到了不少問題,特別是union all時謂詞推進失效,10G綁定變量缺陷導致的執行計劃走偏的問題。
1.問題SQL語句
SELECT DISTINCT 'PO' TYPE
,sales_code
,mfg_code
,label_code
,battery_code
,pph.po_number
,pph.internal_po_number
,pph.customer_po_number
FROM (SELECT piai.inventory_item_id mfg_id
,sales.inventory_item_id sales_id
,piai.export_whs_code sales_code
,mfg.segment1 mfg_code
,sales.attribute1 label_code
,mfg3.battery_code
FROM bom_bill_of_materials bom
,bom_inventory_components ic
,mtl_system_items msi1
,pcm_inv_assembly_items piai
,mtl_system_items_b sales
,mtl_system_items_b mfg
,( SELECT DISTINCT bom2.assembly_item_id
,mfg2.battery_code
FROM bom_bill_of_materials bom2
,bom_inventory_components ic2
,mtl_system_items msi12
,( SELECT DISTINCT bom0.assembly_item_id
,msi10.segment1 battery_code
FROM bom_bill_of_materials bom0
,bom_inventory_components ic0
,mtl_system_items msi10
,pcg_apply_item_for_customs pafi0
WHERE msi10.inventory_item_id = ic0.component_item_id
AND msi10.organization_id = bom0.organization_id
AND ic0.bill_sequence_id = bom0.bill_sequence_id
AND bom0.organization_id = 105
AND bom0.alternate_bom_designator = 'MPL'
AND pafi0.parts_code = msi10.segment1
AND ic0.effectivity_date <= SYSDATE
AND (ic0.disable_date IS NULL OR ic0.disable_date >= SYSDATE )) mfg2
WHERE msi12.inventory_item_id = ic2.component_item_id
AND msi12.organization_id = bom2.organization_id
AND ic2.bill_sequence_id = bom2.bill_sequence_id
AND msi12.inventory_item_id = mfg2.assembly_item_id
AND bom2.organization_id = 105
AND bom2.alternate_bom_designator = 'MPL'
AND ic2.effectivity_date <= SYSDATE
AND (ic2.disable_date IS NULL OR ic2.disable_date >= SYSDATE)) mfg3
WHERE msi1.inventory_item_id = ic.component_item_id
AND msi1.organization_id = bom.organization_id
AND ic.bill_sequence_id = bom.bill_sequence_id
AND piai.organization_id = bom.organization_id
AND piai.organization_id = sales.organization_id
AND piai.inventory_item_id = bom.assembly_item_id
AND sales.segment1 = piai.export_whs_code
AND piai.organization_id = mfg.organization_id
AND mfg.inventory_item_id = piai.inventory_item_id
AND msi1.inventory_item_id = mfg3.assembly_item_id
AND bom.organization_id = 105
AND bom.alternate_bom_designator = 'MPL'
AND mfg.segment1 NOT LIKE '_____EY%'
AND substr(mfg.segment1, length(mfg.segment1) - 2, 1) <> 'C'
AND ic.effectivity_date <= SYSDATE
AND (ic.disable_date IS NULL OR ic.disable_date >= SYSDATE)
AND sales.attribute1 IS NOT NULL
UNION ALL
SELECT piai.inventory_item_id mfg_id
,sales.inventory_item_id sales_id
,piai.export_whs_code sales_code
,mfg.segment1 mfg_code
,sales.attribute1 label_code
,msi1.segment1 parts_code
FROM bom_bill_of_materials bom
,bom_inventory_components ic
,mtl_system_items msi1
,pcm_inv_assembly_items piai
,mtl_system_items_b sales
,mtl_system_items_b mfg
WHERE msi1.inventory_item_id = ic.component_item_id
AND msi1.organization_id = bom.organization_id
AND ic.bill_sequence_id = bom.bill_sequence_id
AND piai.organization_id = bom.organization_id
AND piai.organization_id = sales.organization_id
AND piai.inventory_item_id = bom.assembly_item_id
AND sales.segment1 = piai.export_whs_code
AND piai.organization_id = mfg.organization_id
AND mfg.inventory_item_id = piai.inventory_item_id
AND msi1.segment1 IN ( SELECT DISTINCT parts_code
FROM pcg_apply_item_for_customs)
AND bom.organization_id = 105
AND bom.alternate_bom_designator = 'MPL'
AND mfg.segment1 NOT LIKE '_____EY%'
AND substr(mfg.segment1, length(mfg.segment1) - 2, 1) <> 'C'
AND ic.effectivity_date <= SYSDATE
AND (ic.disable_date IS NULL OR ic.disable_date >= SYSDATE)
AND sales.attribute1 IS NOT NULL) tbl
,pcg_po_headers pph
,pcg_po_lines ppl
WHERE pph.acceptance_header_id = ppl.acceptance_header_id
AND ppl.organization_id = 105
AND ppl.inventory_item_id = tbl.sales_id
AND ppl.shipped_quantity < ppl.promised_quantity
AND ppl.cancel_flag IS NULL
AND pph.cancel_flag IS NULL
AND ppl.creation_date >= to_date( : DATE_LAST_CHECKED, 'DD-MON-YYYY HH24:MI:SS') - 10 / ( 24 * 60 * 60)
AND NOT EXISTS (SELECT 1
FROM pcg_apply_item_for_customs
WHERE label_code IS NOT NULL
AND mfg_code LIKE label_code || '%'
AND tbl.battery_code = parts_code)
大致可以簡單理解為這么一個查詢
select * from
(select from t_bom
union all
select from t_bom ) t1,t2,t3 where t1.a = t2.a and t2.b = t3.b這樣的一個查詢
系統上查到的執行計劃比較糟糕,主要是因為對表 BOM_STRUCTURES_B走了全表掃描,這是我們ERP里的BOM表,數據量很大,走全掃的話會很久,所以這段SQL目前運行一次耗費將近40分鐘,so bad。
執行計劃見附件:執行計劃1.txt
2.其實針對這個SQL優化,我第一想到的就是去改寫整段SQL,因為這里可以使用遞歸查詢來進行改寫,但是有些點需要去和業務用戶確認過,反而會托更久,所以就針對原有SQL進行優化吧
3.原SQL中有個時間參數是綁定變量的,: DATE_LAST_CHECKED,我就先具體代入個值進去,在本地跑下看看效果,正常業務傳入的該值為上一次運行時間,也就是 sysdate - 1/12, 但執行計劃沒有變,還是和前面的一樣,效率還是很低。
4.想不到什么好的方法,突然想到嘗試把union all 改為union,因為這個SQL里數據重不重復都是不影響最后的結果集的。
理論上union all 比union效率更高,因為union是需要去重的,而union all是不需要去重的。反正先試了再說,死馬當活馬醫。
5.把union all 改為union后,執行計劃竟然變了,而且訪問很快。
發現先把pcg_po_lines的結果集跑出來,再和BOM_STRUCTURES_B表的據進行join,這樣不用全掃BOM表,效率快很多。
執行計劃見附件: 執行計劃2.txt
6.優化方案,把union all改為union,所以我就按照這個方式對EBS 系統中的SQL語句進行了修改。
但是萬萬沒想到把union all改為union后,執行計劃還是沒有變,效率很低。
突然又想到那個綁定變量的時間參數,問題又出在這里,因為我們的庫還是10G的,綁定變量窺探存在問題,服務器上解析該SQL的執行計劃時,由于綁定變量: DATE_LAST_CHECKED的存在,所以算開銷的時候訪問pcg_po_lines的cost會很大(它是不會把程序實際傳入的值代入后再去重新生成執行計劃的),不會去優先訪問,所以解析出來的執行計劃總是優先去全表掃描BOM的表。
所以對原SQL又取消了這個綁定變量,把它改成了常量 sysdate - 1/12
這樣之后執行計劃確實就OK了,從之前的40分鐘,現在運行一次只要20秒左右。
7.后續調查,針對這個問題,我很好奇uinon all 和union為什么會導致執行計劃不一樣
我把改善成union后的SQL語句的outline抓了出來,分析outline,我在里面找到了hints:/*+ push_pred(tbl) */, tbl就是union操作的結果集,于是我先去了解了下push_pred的相關資料,總之和謂詞推進有關聯。
8.我在最初的bad sql頭部加入hint:/*+ push_pred(tbl) */,那個時間參數依然使用綁定變量: DATE_LAST_CHECKED,再次查看執行計劃:
可以發現這里有個問題,雖然jion的順序是符合我們要求的,但發現cost值很高,尤其是pcg_po_lines的cost值,這里就是綁定變量帶來的對執行計劃的影響,如果這里指定的是一個時間常量,解析的時候系統會把這個常量直 接代入,加上這個時間過濾算到訪問pcg_po_lines的cost會很低。
執行計劃見附件: 執行計劃3.txt
9.關于push_pred,我也查了一些文檔,主要是針對 UNION ALL PUSHED PREDICATE
不過后來也進一步找到了原因: Predicates are not pushed into a union all view unless a PUSH_PRED hint is specified,可以參照meta-link文檔 ID 463406.1
具體內容如下:
In this Document
SymptomsCauseSolutionReferencesApplies to:Oracle Database - Enterprise Edition - Version 10.2.0.3 to 11.2.0.2.0 [Release 10.2 to 11.2]
Information in this document applies to any platform.
SymptomsWith a query of the form:
SELECT *
FROM T1 ,T2 ,
(SELECT T3.a, T3.b
FROM T3
UNION ALL
SELECT T4.a, T4.b
FROM T4
) V
WHERE T1.a = V.a
AND T2.c = V.b
Predicates are not pushed into a union all view unless a PUSH_PRED hint is specified.
CauseIn the query the only join between the 2 tables (T1 and T2) is via the view (V) (ie T1 - V - T2 there is no join T1 - T2).
In order to push join predicates from tables T1 and T2 into the view, it is necessary to access both tables before accessing the view (if we access the view first then pushing the predicates in would be pointless since the predicates would have no values and would not filter any rows). Forcing tables T1 and T2 to be first in the join order results in a Cartesian product between tables T1 and T2.
Cartesian products can be very expensive even if the optimizer thinks otherwise and so development deliberately decided to disable this view for pushing unless a PUSH_PRED hint is specified.
SolutionThis is not a bug, but rather it is a deliberate omission in 10g where a cartesian product is the only way possible to push a predicate into the view, due to limitations in the search space usage. Workaround is to use the PUSH_PRED hint (and store this in a SPM baseline).
This limitation is addressed in 11g with extended Join Predicate Pushdown (JPPD) support and is discussed in the following Bug:
Bug 6729436 OPTIMIZER CHOOSES SUB OPTIMAL EXECUTION PLAN WHEN UNION ALL VIEWS ARE USED
10.當然最終該SQL的性能問題還是順利優化了,過程回頭想想還是曲折的,特別是其中遇到的一些疑點,很開心和大家分享,也算是給自己這次弱弱的優化做個總結。
雖然平時做的比較多的還是開發,但自己對SQL優化一直都比較感興趣,希望以后也可以和大家一起學習共進步。
1.問題SQL語句
SELECT DISTINCT 'PO' TYPE
,sales_code
,mfg_code
,label_code
,battery_code
,pph.po_number
,pph.internal_po_number
,pph.customer_po_number
FROM (SELECT piai.inventory_item_id mfg_id
,sales.inventory_item_id sales_id
,piai.export_whs_code sales_code
,mfg.segment1 mfg_code
,sales.attribute1 label_code
,mfg3.battery_code
FROM bom_bill_of_materials bom
,bom_inventory_components ic
,mtl_system_items msi1
,pcm_inv_assembly_items piai
,mtl_system_items_b sales
,mtl_system_items_b mfg
,( SELECT DISTINCT bom2.assembly_item_id
,mfg2.battery_code
FROM bom_bill_of_materials bom2
,bom_inventory_components ic2
,mtl_system_items msi12
,( SELECT DISTINCT bom0.assembly_item_id
,msi10.segment1 battery_code
FROM bom_bill_of_materials bom0
,bom_inventory_components ic0
,mtl_system_items msi10
,pcg_apply_item_for_customs pafi0
WHERE msi10.inventory_item_id = ic0.component_item_id
AND msi10.organization_id = bom0.organization_id
AND ic0.bill_sequence_id = bom0.bill_sequence_id
AND bom0.organization_id = 105
AND bom0.alternate_bom_designator = 'MPL'
AND pafi0.parts_code = msi10.segment1
AND ic0.effectivity_date <= SYSDATE
AND (ic0.disable_date IS NULL OR ic0.disable_date >= SYSDATE )) mfg2
WHERE msi12.inventory_item_id = ic2.component_item_id
AND msi12.organization_id = bom2.organization_id
AND ic2.bill_sequence_id = bom2.bill_sequence_id
AND msi12.inventory_item_id = mfg2.assembly_item_id
AND bom2.organization_id = 105
AND bom2.alternate_bom_designator = 'MPL'
AND ic2.effectivity_date <= SYSDATE
AND (ic2.disable_date IS NULL OR ic2.disable_date >= SYSDATE)) mfg3
WHERE msi1.inventory_item_id = ic.component_item_id
AND msi1.organization_id = bom.organization_id
AND ic.bill_sequence_id = bom.bill_sequence_id
AND piai.organization_id = bom.organization_id
AND piai.organization_id = sales.organization_id
AND piai.inventory_item_id = bom.assembly_item_id
AND sales.segment1 = piai.export_whs_code
AND piai.organization_id = mfg.organization_id
AND mfg.inventory_item_id = piai.inventory_item_id
AND msi1.inventory_item_id = mfg3.assembly_item_id
AND bom.organization_id = 105
AND bom.alternate_bom_designator = 'MPL'
AND mfg.segment1 NOT LIKE '_____EY%'
AND substr(mfg.segment1, length(mfg.segment1) - 2, 1) <> 'C'
AND ic.effectivity_date <= SYSDATE
AND (ic.disable_date IS NULL OR ic.disable_date >= SYSDATE)
AND sales.attribute1 IS NOT NULL
UNION ALL
SELECT piai.inventory_item_id mfg_id
,sales.inventory_item_id sales_id
,piai.export_whs_code sales_code
,mfg.segment1 mfg_code
,sales.attribute1 label_code
,msi1.segment1 parts_code
FROM bom_bill_of_materials bom
,bom_inventory_components ic
,mtl_system_items msi1
,pcm_inv_assembly_items piai
,mtl_system_items_b sales
,mtl_system_items_b mfg
WHERE msi1.inventory_item_id = ic.component_item_id
AND msi1.organization_id = bom.organization_id
AND ic.bill_sequence_id = bom.bill_sequence_id
AND piai.organization_id = bom.organization_id
AND piai.organization_id = sales.organization_id
AND piai.inventory_item_id = bom.assembly_item_id
AND sales.segment1 = piai.export_whs_code
AND piai.organization_id = mfg.organization_id
AND mfg.inventory_item_id = piai.inventory_item_id
AND msi1.segment1 IN ( SELECT DISTINCT parts_code
FROM pcg_apply_item_for_customs)
AND bom.organization_id = 105
AND bom.alternate_bom_designator = 'MPL'
AND mfg.segment1 NOT LIKE '_____EY%'
AND substr(mfg.segment1, length(mfg.segment1) - 2, 1) <> 'C'
AND ic.effectivity_date <= SYSDATE
AND (ic.disable_date IS NULL OR ic.disable_date >= SYSDATE)
AND sales.attribute1 IS NOT NULL) tbl
,pcg_po_headers pph
,pcg_po_lines ppl
WHERE pph.acceptance_header_id = ppl.acceptance_header_id
AND ppl.organization_id = 105
AND ppl.inventory_item_id = tbl.sales_id
AND ppl.shipped_quantity < ppl.promised_quantity
AND ppl.cancel_flag IS NULL
AND pph.cancel_flag IS NULL
AND ppl.creation_date >= to_date( : DATE_LAST_CHECKED, 'DD-MON-YYYY HH24:MI:SS') - 10 / ( 24 * 60 * 60)
AND NOT EXISTS (SELECT 1
FROM pcg_apply_item_for_customs
WHERE label_code IS NOT NULL
AND mfg_code LIKE label_code || '%'
AND tbl.battery_code = parts_code)
大致可以簡單理解為這么一個查詢
select * from
(select from t_bom
union all
select from t_bom ) t1,t2,t3 where t1.a = t2.a and t2.b = t3.b這樣的一個查詢
系統上查到的執行計劃比較糟糕,主要是因為對表 BOM_STRUCTURES_B走了全表掃描,這是我們ERP里的BOM表,數據量很大,走全掃的話會很久,所以這段SQL目前運行一次耗費將近40分鐘,so bad。
執行計劃見附件:執行計劃1.txt
2.其實針對這個SQL優化,我第一想到的就是去改寫整段SQL,因為這里可以使用遞歸查詢來進行改寫,但是有些點需要去和業務用戶確認過,反而會托更久,所以就針對原有SQL進行優化吧
3.原SQL中有個時間參數是綁定變量的,: DATE_LAST_CHECKED,我就先具體代入個值進去,在本地跑下看看效果,正常業務傳入的該值為上一次運行時間,也就是 sysdate - 1/12, 但執行計劃沒有變,還是和前面的一樣,效率還是很低。
4.想不到什么好的方法,突然想到嘗試把union all 改為union,因為這個SQL里數據重不重復都是不影響最后的結果集的。
理論上union all 比union效率更高,因為union是需要去重的,而union all是不需要去重的。反正先試了再說,死馬當活馬醫。
5.把union all 改為union后,執行計劃竟然變了,而且訪問很快。
發現先把pcg_po_lines的結果集跑出來,再和BOM_STRUCTURES_B表的據進行join,這樣不用全掃BOM表,效率快很多。
執行計劃見附件: 執行計劃2.txt
6.優化方案,把union all改為union,所以我就按照這個方式對EBS 系統中的SQL語句進行了修改。
但是萬萬沒想到把union all改為union后,執行計劃還是沒有變,效率很低。
突然又想到那個綁定變量的時間參數,問題又出在這里,因為我們的庫還是10G的,綁定變量窺探存在問題,服務器上解析該SQL的執行計劃時,由于綁定變量: DATE_LAST_CHECKED的存在,所以算開銷的時候訪問pcg_po_lines的cost會很大(它是不會把程序實際傳入的值代入后再去重新生成執行計劃的),不會去優先訪問,所以解析出來的執行計劃總是優先去全表掃描BOM的表。
所以對原SQL又取消了這個綁定變量,把它改成了常量 sysdate - 1/12
這樣之后執行計劃確實就OK了,從之前的40分鐘,現在運行一次只要20秒左右。
7.后續調查,針對這個問題,我很好奇uinon all 和union為什么會導致執行計劃不一樣
我把改善成union后的SQL語句的outline抓了出來,分析outline,我在里面找到了hints:/*+ push_pred(tbl) */, tbl就是union操作的結果集,于是我先去了解了下push_pred的相關資料,總之和謂詞推進有關聯。
8.我在最初的bad sql頭部加入hint:/*+ push_pred(tbl) */,那個時間參數依然使用綁定變量: DATE_LAST_CHECKED,再次查看執行計劃:
可以發現這里有個問題,雖然jion的順序是符合我們要求的,但發現cost值很高,尤其是pcg_po_lines的cost值,這里就是綁定變量帶來的對執行計劃的影響,如果這里指定的是一個時間常量,解析的時候系統會把這個常量直 接代入,加上這個時間過濾算到訪問pcg_po_lines的cost會很低。
執行計劃見附件: 執行計劃3.txt
9.關于push_pred,我也查了一些文檔,主要是針對 UNION ALL PUSHED PREDICATE
不過后來也進一步找到了原因: Predicates are not pushed into a union all view unless a PUSH_PRED hint is specified,可以參照meta-link文檔 ID 463406.1
具體內容如下:
In this Document
SymptomsCauseSolutionReferencesApplies to:Oracle Database - Enterprise Edition - Version 10.2.0.3 to 11.2.0.2.0 [Release 10.2 to 11.2]
Information in this document applies to any platform.
SymptomsWith a query of the form:
SELECT *
FROM T1 ,T2 ,
(SELECT T3.a, T3.b
FROM T3
UNION ALL
SELECT T4.a, T4.b
FROM T4
) V
WHERE T1.a = V.a
AND T2.c = V.b
Predicates are not pushed into a union all view unless a PUSH_PRED hint is specified.
CauseIn the query the only join between the 2 tables (T1 and T2) is via the view (V) (ie T1 - V - T2 there is no join T1 - T2).
In order to push join predicates from tables T1 and T2 into the view, it is necessary to access both tables before accessing the view (if we access the view first then pushing the predicates in would be pointless since the predicates would have no values and would not filter any rows). Forcing tables T1 and T2 to be first in the join order results in a Cartesian product between tables T1 and T2.
Cartesian products can be very expensive even if the optimizer thinks otherwise and so development deliberately decided to disable this view for pushing unless a PUSH_PRED hint is specified.
SolutionThis is not a bug, but rather it is a deliberate omission in 10g where a cartesian product is the only way possible to push a predicate into the view, due to limitations in the search space usage. Workaround is to use the PUSH_PRED hint (and store this in a SPM baseline).
This limitation is addressed in 11g with extended Join Predicate Pushdown (JPPD) support and is discussed in the following Bug:
Bug 6729436 OPTIMIZER CHOOSES SUB OPTIMAL EXECUTION PLAN WHEN UNION ALL VIEWS ARE USED
10.當然最終該SQL的性能問題還是順利優化了,過程回頭想想還是曲折的,特別是其中遇到的一些疑點,很開心和大家分享,也算是給自己這次弱弱的優化做個總結。
雖然平時做的比較多的還是開發,但自己對SQL優化一直都比較感興趣,希望以后也可以和大家一起學習共進步。
- 上一篇:程序員遇到Bug時的30個反應 2016/3/22
- 下一篇:關于互聯網,中國要下一盤很大的棋…… 2016/3/22