Skip to content

Commit 7a36c15

Browse files
author
Chaithra Gopalareddy
committed
Bug #18194196: OPTIMIZER EXECUTES STATEMENT INPERFORMANT
Problem: While choosing the join order, the cost for doing table_scan is wrongly calculated. As a result table_scan is preferred over eq_ref, thereby choosing a bad plan. Analysis: While calculating the fanout in semijoin_dupsweedout method, if an inner table is ahead of the outer table in the join order, fanout is not calculated correctly. This is what is happening w.r.t the query in the bugpage. As seen from the trace, a table scan is preferred over eq_ref. This is because when calculating the cost for eq_ref, optimizer takes into consideration the correct prefix_row_count calculated in prev_records_read. In this case the first table emailstorerel has around 42000 records and the next one is a ref scan with 1 row as fanout and the next one is also 1. Cost for eq_ref is calculated based on this. While in table_scan, cost is calculated based on row_count passed to best_access_path which ideally should be the prefix_row_count based on the partial plan chosen. In this case it should be more than 42000. It is currently 1. This changes to "1" after semi-join strategy for dups_weedout is checked after choosing emailstore as the second table in the following part of trace. "plan_prefix": [ "`emailstoreorel`" ], "table": "`emailstore`", "best_access_path": { "considered_access_paths": [ { "access_type": "ref", "index": "PRIMARY", "rows": 1, "cost": 42092, "chosen": true }, { "access_type": "scan", "using_join_cache": true, "rows": 70554, "cost": 5.94e8, "chosen": false } ] }, "cost_for_plan": 67583, "rows_for_plan": 42092, "semijoin_strategy_choice": [ { "strategy": "DuplicatesWeedout", "cost": 76004, "rows": 1, "duplicate_tables_left": true, "chosen": true } ], So after Optimize_table_order::semijoin_dupsweedout_access_paths(), the current_rowcount now becomes "1" because the prefix_row_count is calculated solely on outer fanout (from table emailstore) which in this case is 1. The inner fanout from table emailstoreorel is not considered at all. Solution: Change the current formulas to the one's mentioned in the todo text. The formulas now take into consideration all the scenarios which can have inner tables ahead of outer tables in a join order. In such a scenario, if inner_fanout is more than 1, this will be moved to outer_fanout and inner_fanout is re-calculated. max_outer_fanout is introduced to keep a cap on outer_fanout not to exceed the cardinality of the cross product of outer tables. changes to test files: Two sets of changes can be noted. 1. When a inner table with full table scan is chosen as the first table in the join order, earlier fanout was wrongly calculated. As a result the cost for doing dups weedout was less. With the current formulas, the cost for doing writes become more because of increased outer fanout in these cases. As a result the cost is not less than materialized scan. 2. Optimizer now calculates the cost of table scan correctly. As a result, eq_ref is preferred over table scan in these cases.
1 parent d9d2140 commit 7a36c15

31 files changed

+2527
-880
lines changed

mysql-test/include/subquery_sj.inc

Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6522,6 +6522,50 @@ p2, t1 p3 WHERE p0.id=p2.id6 AND p2.id7=p3.id));
65226522

65236523
DROP TABLE t1,t2;
65246524

6525+
--echo #
6526+
--echo # Bug#18194196: OPTIMIZER EXECUTES STATEMENT INPERFORMANT
6527+
--echo #
6528+
6529+
CREATE TABLE t1 (uid INTEGER, fid INTEGER, INDEX(uid));
6530+
INSERT INTO t1 VALUES
6531+
(1,1), (1,2), (1,3), (1,4),
6532+
(2,5), (2,6), (2,7), (2,8),
6533+
(3,1), (3,2), (3,9);
6534+
6535+
CREATE TABLE t2 (uid INT PRIMARY KEY, name VARCHAR(128), INDEX(name));
6536+
INSERT INTO t2 VALUES
6537+
(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
6538+
(6, "F"), (7, "G"), (8, "H"), (9, "I");
6539+
6540+
CREATE TABLE t3 (uid INT, fid INT, INDEX(uid));
6541+
INSERT INTO t3 VALUES
6542+
(1,1), (1,2), (1,3),(1,4),
6543+
(2,5), (2,6), (2,7), (2,8),
6544+
(3,1), (3,2), (3,9);
6545+
6546+
CREATE TABLE t4 (uid INT PRIMARY KEY, name VARCHAR(128), INDEX(name));
6547+
INSERT INTO t4 VALUES
6548+
(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
6549+
(6, "F"), (7, "G"), (8, "H"), (9, "I");
6550+
6551+
ANALYZE TABLE t1,t2,t3,t4;
6552+
6553+
EXPLAIN SELECT name FROM t2, t1
6554+
WHERE t1.uid IN (SELECT t4.uid FROM t4, t3 WHERE t3.uid=1 AND t4.uid=t3.fid)
6555+
AND t2.uid=t1.fid;
6556+
6557+
FLUSH STATUS;
6558+
6559+
SELECT name FROM t2, t1
6560+
WHERE t1.uid IN (SELECT t4.uid FROM t4, t3 WHERE t3.uid=1 AND t4.uid=t3.fid)
6561+
AND t2.uid=t1.fid;
6562+
6563+
SHOW STATUS LIKE '%handler_read%';
6564+
6565+
DROP TABLE t1,t2,t3,t4;
6566+
6567+
--echo # End of test for Bug#18194196
6568+
65256569
set @@optimizer_switch=@old_opt_switch;
65266570
# New tests go here.
65276571

mysql-test/r/subquery_sj_all.result

Lines changed: 116 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -2939,42 +2939,48 @@ EXPLAIN
29392939
{
29402940
"query_block": {
29412941
"select_id": 1,
2942-
"duplicates_removal": {
2943-
"using_temporary_table": true,
2944-
"nested_loop": [
2945-
{
2946-
"table": {
2947-
"table_name": "t11",
2948-
"access_type": "ALL",
2949-
"rows": 8,
2950-
"filtered": 100
2951-
}
2952-
},
2953-
{
2954-
"table": {
2955-
"table_name": "t1",
2956-
"access_type": "eq_ref",
2957-
"possible_keys": [
2958-
"PRIMARY"
2959-
],
2960-
"key": "PRIMARY",
2961-
"used_key_parts": [
2962-
"a"
2963-
],
2964-
"key_length": "4",
2965-
"ref": [
2966-
"test.t11.a"
2967-
],
2968-
"rows": 1,
2969-
"filtered": 100
2942+
"nested_loop": [
2943+
{
2944+
"table": {
2945+
"table_name": "<subquery2>",
2946+
"access_type": "ALL",
2947+
"materialized_from_subquery": {
2948+
"using_temporary_table": true,
2949+
"query_block": {
2950+
"table": {
2951+
"table_name": "t11",
2952+
"access_type": "ALL",
2953+
"rows": 8,
2954+
"filtered": 100
2955+
}
2956+
}
29702957
}
29712958
}
2972-
]
2973-
}
2959+
},
2960+
{
2961+
"table": {
2962+
"table_name": "t1",
2963+
"access_type": "eq_ref",
2964+
"possible_keys": [
2965+
"PRIMARY"
2966+
],
2967+
"key": "PRIMARY",
2968+
"used_key_parts": [
2969+
"a"
2970+
],
2971+
"key_length": "4",
2972+
"ref": [
2973+
"<subquery2>.a"
2974+
],
2975+
"rows": 1,
2976+
"filtered": 100
2977+
}
2978+
}
2979+
]
29742980
}
29752981
}
29762982
Warnings:
2977-
Note 1003 /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c` from `test`.`t1` semi join (`test`.`t11`) where (`test`.`t1`.`a` = `test`.`t11`.`a`)
2983+
Note 1003 /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c` from `test`.`t1` semi join (`test`.`t11`) where (`test`.`t1`.`a` = `<subquery2>`.`a`)
29782984
select t21.* from t21,t22 where t21.a = t22.a and
29792985
t22.a in (select t12.a from t11, t12 where t11.a in(255,256) and t11.a = t12.a and t11.c is null) and t22.c is null order by t21.a;
29802986
a b c
@@ -3227,8 +3233,9 @@ create table t3 ( a int , filler char(100), key(a));
32273233
insert into t3 select A.a + 10*B.a, 'filler' from t0 A, t0 B;
32283234
explain select * from t3 where a in (select a from t2) and (a > 5 or a < 10);
32293235
id select_type table type possible_keys key key_len ref rows Extra
3230-
1 SIMPLE t2 ALL NULL NULL NULL NULL 2 Using where; Start temporary
3231-
1 SIMPLE t3 ref a a 5 test.t2.a 1 End temporary
3236+
1 SIMPLE <subquery2> ALL NULL NULL NULL NULL NULL Using where
3237+
1 SIMPLE t3 ref a a 5 <subquery2>.a 1 NULL
3238+
2 MATERIALIZED t2 ALL NULL NULL NULL NULL 2 NULL
32323239
select * from t3 where a in (select a from t2);
32333240
a filler
32343241
1 filler
@@ -5746,11 +5753,12 @@ INNER JOIN t2 c ON c.idContact=cona.idContact
57465753
WHERE cona.postalStripped='T2H3B2'
57475754
);
57485755
id select_type table type possible_keys key key_len ref rows filtered Extra
5749-
1 SIMPLE cona ALL NULL NULL NULL NULL 2 100.00 Using where; Start temporary
5750-
1 SIMPLE c eq_ref PRIMARY PRIMARY 4 test.cona.idContact 1 100.00 Using where
5751-
1 SIMPLE a eq_ref PRIMARY PRIMARY 4 test.c.idObj 1 100.00 Using index; End temporary
5756+
1 SIMPLE <subquery2> ALL NULL NULL NULL NULL NULL 0.00 NULL
5757+
1 SIMPLE a index PRIMARY PRIMARY 4 NULL 2 100.00 Using where; Using index; Using join buffer (Block Nested Loop)
5758+
2 MATERIALIZED cona ALL NULL NULL NULL NULL 2 100.00 Using where
5759+
2 MATERIALIZED c eq_ref PRIMARY PRIMARY 4 test.cona.idContact 1 100.00 NULL
57525760
Warnings:
5753-
Note 1003 /* select#1 */ select `test`.`a`.`idIndividual` AS `idIndividual` from `test`.`t1` `a` semi join (`test`.`t3` `cona` join `test`.`t2` `c`) where ((`test`.`c`.`idContact` = `test`.`cona`.`idContact`) and (`test`.`a`.`idIndividual` = `test`.`c`.`idObj`) and (`test`.`cona`.`postalStripped` = 'T2H3B2'))
5761+
Note 1003 /* select#1 */ select `test`.`a`.`idIndividual` AS `idIndividual` from `test`.`t1` `a` semi join (`test`.`t3` `cona` join `test`.`t2` `c`) where ((`test`.`c`.`idContact` = `test`.`cona`.`idContact`) and (`test`.`a`.`idIndividual` = `<subquery2>`.`idObj`) and (`test`.`cona`.`postalStripped` = 'T2H3B2'))
57545762
drop table t1,t2,t3;
57555763
CREATE TABLE t1 (one int, two int, flag char(1));
57565764
CREATE TABLE t2 (one int, two int, flag char(1));
@@ -6868,8 +6876,8 @@ and t2.uid=t1.fid;
68686876
id select_type table type possible_keys key key_len ref rows Extra
68696877
1 SIMPLE t3 ref uid uid 5 const 4 Using where; Start temporary
68706878
1 SIMPLE t4 eq_ref PRIMARY PRIMARY 4 test.t3.fid 1 Using index
6871-
1 SIMPLE t1 ref uid uid 5 test.t3.fid 2 End temporary
6872-
1 SIMPLE t2 ALL PRIMARY NULL NULL NULL 9 Using where; Using join buffer (Block Nested Loop)
6879+
1 SIMPLE t1 ref uid uid 5 test.t3.fid 2 Using where
6880+
1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.fid 1 End temporary
68736881
select name from t2, t1
68746882
where t1.uid in (select t4.uid from t4, t3 where t3.uid=1 and t4.uid=t3.fid)
68756883
and t2.uid=t1.fid;
@@ -7641,8 +7649,9 @@ WHERE col_varchar_key IN (SELECT col_varchar_nokey
76417649
FROM t2)
76427650
ORDER BY col_datetime_key LIMIT 4;
76437651
id select_type table type possible_keys key key_len ref rows Extra
7644-
1 SIMPLE t2 ALL NULL NULL NULL NULL 6 Using temporary; Using filesort; Start temporary
7645-
1 SIMPLE t1 ref col_varchar_key col_varchar_key 3 test.t2.col_varchar_nokey 1 End temporary
7652+
1 SIMPLE t1 ALL col_varchar_key NULL NULL NULL 20 Using where; Using filesort
7653+
1 SIMPLE <subquery2> eq_ref <auto_key> <auto_key> 3 test.t1.col_varchar_key 1 NULL
7654+
2 MATERIALIZED t2 ALL NULL NULL NULL NULL 6 NULL
76467655
SELECT col_varchar_key
76477656
FROM t1
76487657
WHERE col_varchar_key IN (SELECT col_varchar_nokey
@@ -7714,9 +7723,10 @@ AND grandparent1.col_varchar_key IS NOT NULL
77147723
);
77157724
id select_type table type possible_keys key key_len ref rows Extra
77167725
1 PRIMARY t1 ALL NULL NULL NULL NULL 2 Using where
7717-
2 SUBQUERY parent1 ALL NULL NULL NULL NULL 20 Start temporary
7718-
2 SUBQUERY parent2 eq_ref PRIMARY PRIMARY 4 test.parent1.pk 1 Using index
7719-
2 SUBQUERY grandparent1 ref col_varchar_key col_varchar_key 3 test.parent1.col_varchar_nokey 1 Using index condition; End temporary
7726+
2 SUBQUERY <subquery3> ALL NULL NULL NULL NULL NULL NULL
7727+
2 SUBQUERY grandparent1 ref col_varchar_key col_varchar_key 3 <subquery3>.p1 1 Using index condition
7728+
3 MATERIALIZED parent1 ALL NULL NULL NULL NULL 20 NULL
7729+
3 MATERIALIZED parent2 eq_ref PRIMARY PRIMARY 4 test.parent1.pk 1 Using index
77207730
SELECT *
77217731
FROM t1
77227732
WHERE g1 NOT IN
@@ -10582,6 +10592,68 @@ p2, t1 p3 WHERE p0.id=p2.id6 AND p2.id7=p3.id));
1058210592
ID
1058310593
126
1058410594
DROP TABLE t1,t2;
10595+
#
10596+
# Bug#18194196: OPTIMIZER EXECUTES STATEMENT INPERFORMANT
10597+
#
10598+
CREATE TABLE t1 (uid INTEGER, fid INTEGER, INDEX(uid));
10599+
INSERT INTO t1 VALUES
10600+
(1,1), (1,2), (1,3), (1,4),
10601+
(2,5), (2,6), (2,7), (2,8),
10602+
(3,1), (3,2), (3,9);
10603+
CREATE TABLE t2 (uid INT PRIMARY KEY, name VARCHAR(128), INDEX(name));
10604+
INSERT INTO t2 VALUES
10605+
(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
10606+
(6, "F"), (7, "G"), (8, "H"), (9, "I");
10607+
CREATE TABLE t3 (uid INT, fid INT, INDEX(uid));
10608+
INSERT INTO t3 VALUES
10609+
(1,1), (1,2), (1,3),(1,4),
10610+
(2,5), (2,6), (2,7), (2,8),
10611+
(3,1), (3,2), (3,9);
10612+
CREATE TABLE t4 (uid INT PRIMARY KEY, name VARCHAR(128), INDEX(name));
10613+
INSERT INTO t4 VALUES
10614+
(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"),
10615+
(6, "F"), (7, "G"), (8, "H"), (9, "I");
10616+
ANALYZE TABLE t1,t2,t3,t4;
10617+
Table Op Msg_type Msg_text
10618+
test.t1 analyze status OK
10619+
test.t2 analyze status OK
10620+
test.t3 analyze status OK
10621+
test.t4 analyze status OK
10622+
EXPLAIN SELECT name FROM t2, t1
10623+
WHERE t1.uid IN (SELECT t4.uid FROM t4, t3 WHERE t3.uid=1 AND t4.uid=t3.fid)
10624+
AND t2.uid=t1.fid;
10625+
id select_type table type possible_keys key key_len ref rows Extra
10626+
1 SIMPLE t3 ref uid uid 5 const 4 Using where; Start temporary
10627+
1 SIMPLE t4 eq_ref PRIMARY PRIMARY 4 test.t3.fid 1 Using index
10628+
1 SIMPLE t1 ALL uid NULL NULL NULL 11 Using where; End temporary; Using join buffer (Block Nested Loop)
10629+
1 SIMPLE t2 eq_ref PRIMARY PRIMARY 4 test.t1.fid 1 NULL
10630+
FLUSH STATUS;
10631+
SELECT name FROM t2, t1
10632+
WHERE t1.uid IN (SELECT t4.uid FROM t4, t3 WHERE t3.uid=1 AND t4.uid=t3.fid)
10633+
AND t2.uid=t1.fid;
10634+
name
10635+
A
10636+
B
10637+
C
10638+
D
10639+
E
10640+
F
10641+
G
10642+
H
10643+
A
10644+
B
10645+
I
10646+
SHOW STATUS LIKE '%handler_read%';
10647+
Variable_name Value
10648+
Handler_read_first 0
10649+
Handler_read_key 16
10650+
Handler_read_last 0
10651+
Handler_read_next 4
10652+
Handler_read_prev 0
10653+
Handler_read_rnd 0
10654+
Handler_read_rnd_next 12
10655+
DROP TABLE t1,t2,t3,t4;
10656+
# End of test for Bug#18194196
1058510657
set @@optimizer_switch=@old_opt_switch;
1058610658
# End of 5.6 tests
1058710659
set optimizer_switch=default;

0 commit comments

Comments
 (0)