@morrySnow
Contributor

@morrySnow morrySnow commented Dec 31, 2025

What problem does this PR solve?

Related PR: #36784

Problem Summary:

This pull request refactors the logic in the visitPhysicalHashJoin method to improve the handling and enforcement of bucket shuffle join strategies in the Nereids planner. It introduces new checks and flags to more accurately determine when bucket shuffle downgrades are needed, and updates the plan output to reflect the use of bucketShuffle joins in various TPC-DS queries.

Key changes include:

Planner logic improvements

  • Added shouldCheckLeftBucketDownGrade and shouldCheckrightBucketDownGrade flags to control when to check for bucket shuffle downgrades, leading to more precise enforcement of shuffle strategies.
  • Refactored the conditional logic to use these new flags in place of the previous direct downgrade checks, and improved the handling of cases where shuffle key orders do not match.

These changes enhance the accuracy and efficiency of join planning in distributed query execution.
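
A minimal sketch of the flag-gated check described above, not the actual Nereids code. `JoinSide`, `keepsNaturalBuckets`, `isBucketShuffleDownGrade`, and the `minBucketNum` threshold are hypothetical names and rules introduced only for illustration; the locals `checkLeft` and `checkRight` play the role of the two new flags. The assumption is that a side is checked for downgrade only when it is not the shuffled side.

```java
public class BucketDownGradeSketch {
    /** Hypothetical summary of one join child; not a Doris class. */
    static final class JoinSide {
        final boolean keepsNaturalBuckets; // true if this side is NOT re-shuffled
        final int bucketNum;               // bucket count of the underlying table

        JoinSide(boolean keepsNaturalBuckets, int bucketNum) {
            this.keepsNaturalBuckets = keepsNaturalBuckets;
            this.bucketNum = bucketNum;
        }
    }

    /** Assumed downgrade rule: the side has fewer buckets than a threshold. */
    static boolean isBucketShuffleDownGrade(JoinSide side, int minBucketNum) {
        return side.bucketNum < minBucketNum;
    }

    /** Runs the downgrade check only on the side that keeps its bucket layout. */
    static boolean needsDownGrade(JoinSide left, JoinSide right, int minBucketNum) {
        boolean checkLeft = left.keepsNaturalBuckets;
        boolean checkRight = right.keepsNaturalBuckets;
        return (checkLeft && isBucketShuffleDownGrade(left, minBucketNum))
                || (checkRight && isBucketShuffleDownGrade(right, minBucketNum));
    }

    public static void main(String[] args) {
        JoinSide bucketed = new JoinSide(true, 4);
        JoinSide shuffled = new JoinSide(false, 4);
        System.out.println(needsDownGrade(bucketed, shuffled, 8)); // true
        System.out.println(needsDownGrade(shuffled, shuffled, 8)); // false
        System.out.println(needsDownGrade(bucketed, shuffled, 2)); // false
    }
}
```

Gating the check per side means a low bucket count on the shuffled side no longer forces a downgrade, which is the behavior the PR title describes.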

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@morrySnow morrySnow marked this pull request as draft December 31, 2025 04:26
@Thearas
Contributor

Thearas commented Dec 31, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35730 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3fcb7b1c4cf5fa4a7feceb459902511a926860c2, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	18208	4269	4034	4034
q2	3149	467	357	357
q3	10175	1275	759	759
q4	10248	793	324	324
q5	7756	2154	2021	2021
q6	190	164	134	134
q7	949	788	651	651
q8	9419	1484	1262	1262
q9	6556	4759	4825	4759
q10	6777	1778	1397	1397
q11	495	311	295	295
q12	715	719	567	567
q13	17821	3977	3224	3224
q14	290	287	272	272
q15	591	509	510	509
q16	668	676	638	638
q17	685	720	629	629
q18	7301	7861	7833	7833
q19	1260	1090	656	656
q20	424	401	256	256
q21	4546	4177	4146	4146
q22	1177	1007	1067	1007
Total cold run time: 109400 ms
Total hot run time: 35730 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4283	4195	4376	4195
q2	349	433	321	321
q3	2440	3008	2373	2373
q4	1357	1922	1445	1445
q5	4447	4476	4346	4346
q6	207	173	126	126
q7	1958	1888	1783	1783
q8	2539	2307	2338	2307
q9	6723	6772	6749	6749
q10	2322	2486	2084	2084
q11	542	486	454	454
q12	701	718	573	573
q13	3481	3948	3269	3269
q14	273	288	254	254
q15	517	504	478	478
q16	607	659	609	609
q17	1068	1212	1219	1212
q18	7565	7159	7289	7159
q19	877	850	856	850
q20	1881	1950	1824	1824
q21	4466	4202	4061	4061
q22	1070	1024	982	982
Total cold run time: 49673 ms
Total hot run time: 47454 ms

@doris-robot

TPC-DS: Total hot run time: 174649 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3fcb7b1c4cf5fa4a7feceb459902511a926860c2, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4829	593	457	457
query6	339	224	212	212
query7	4214	467	278	278
query8	344	246	264	246
query9	8766	2620	2653	2620
query10	532	358	317	317
query11	15148	15047	14856	14856
query12	179	112	114	112
query13	1256	494	392	392
query14	6479	2944	2699	2699
query14_1	2581	2562	2596	2562
query15	203	193	174	174
query16	982	476	442	442
query17	1069	694	573	573
query18	2591	432	348	348
query19	225	216	191	191
query20	127	118	115	115
query21	211	138	126	126
query22	4016	3980	3858	3858
query23	15865	15875	15418	15418
query23_1	15401	15695	15503	15503
query24	7431	1594	1204	1204
query24_1	1194	1206	1195	1195
query25	552	458	409	409
query26	1228	268	153	153
query27	2756	452	294	294
query28	4521	2202	2195	2195
query29	785	542	471	471
query30	309	237	210	210
query31	838	633	554	554
query32	78	69	64	64
query33	516	354	270	270
query34	886	896	531	531
query35	736	806	699	699
query36	861	858	802	802
query37	127	92	75	75
query38	2733	2678	2713	2678
query39	776	767	746	746
query39_1	714	713	701	701
query40	211	130	110	110
query41	68	65	64	64
query42	104	102	104	102
query43	439	475	417	417
query44	1320	788	751	751
query45	191	185	177	177
query46	851	963	617	617
query47	1447	1489	1393	1393
query48	310	320	243	243
query49	597	413	319	319
query50	645	280	209	209
query51	3821	3761	3786	3761
query52	106	105	94	94
query53	296	334	268	268
query54	271	248	233	233
query55	76	75	69	69
query56	272	286	282	282
query57	1013	1010	885	885
query58	266	255	242	242
query59	2081	2234	2175	2175
query60	312	316	318	316
query61	156	155	152	152
query62	407	369	324	324
query63	299	263	264	263
query64	4927	1275	949	949
query65	3801	3712	3666	3666
query66	1408	419	319	319
query67	15041	15462	15537	15462
query68	8284	990	724	724
query69	493	336	298	298
query70	1007	977	905	905
query71	373	300	272	272
query72	5952	4675	4784	4675
query73	681	575	314	314
query74	8714	8767	8681	8681
query75	2879	2849	2509	2509
query76	3876	1045	666	666
query77	540	374	276	276
query78	9800	9755	9168	9168
query79	1288	920	605	605
query80	663	586	484	484
query81	506	264	230	230
query82	233	140	108	108
query83	258	254	239	239
query84	263	118	102	102
query85	877	481	454	454
query86	387	325	324	324
query87	2826	2871	2753	2753
query88	3174	2254	2239	2239
query89	398	345	319	319
query90	2166	154	140	140
query91	180	164	140	140
query92	84	69	59	59
query93	1869	894	559	559
query94	573	314	285	285
query95	568	318	364	318
query96	591	456	207	207
query97	2306	2367	2268	2268
query98	228	197	199	197
query99	590	594	510	510
Total cold run time: 254533 ms
Total hot run time: 174649 ms

@doris-robot

ClickBench: Total hot run time: 27.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3fcb7b1c4cf5fa4a7feceb459902511a926860c2, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.09	0.09
query4	1.61	0.11	0.10
query5	0.27	0.27	0.26
query6	1.15	0.66	0.64
query7	0.03	0.03	0.03
query8	0.06	0.04	0.05
query9	0.56	0.50	0.50
query10	0.57	0.55	0.55
query11	0.16	0.10	0.11
query12	0.16	0.12	0.13
query13	0.61	0.60	0.61
query14	0.99	0.96	0.98
query15	0.80	0.79	0.80
query16	0.39	0.39	0.42
query17	1.06	1.07	1.05
query18	0.22	0.21	0.21
query19	1.87	1.76	1.74
query20	0.01	0.01	0.01
query21	15.43	0.26	0.14
query22	4.80	0.04	0.04
query23	15.88	0.28	0.10
query24	1.41	0.57	0.62
query25	0.10	0.05	0.08
query26	0.14	0.13	0.13
query27	0.05	0.05	0.06
query28	3.63	1.05	0.88
query29	12.63	3.94	3.15
query30	0.28	0.14	0.12
query31	2.82	0.62	0.39
query32	3.23	0.56	0.46
query33	3.01	3.01	3.02
query34	16.57	5.13	4.45
query35	4.51	4.44	4.46
query36	0.68	0.50	0.49
query37	0.11	0.07	0.06
query38	0.07	0.04	0.04
query39	0.05	0.03	0.03
query40	0.18	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 96.68 s
Total hot run time: 27.1 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.56% (16/45) 🎉
Increment coverage report
Complete coverage report

@morrySnow morrySnow changed the title from "[dnm](draft) test right dsg" to "[opt](shuffle-strategy) bucket shuffle down grade only check not shuffle side" Jan 4, 2026
@morrySnow morrySnow changed the title from "[opt](shuffle-strategy) bucket shuffle down grade only check not shuffle side" to "[opt](ditributed-plan) bucket shuffle down grade only check not shuffle side" Jan 4, 2026
@morrySnow morrySnow marked this pull request as ready for review January 4, 2026 07:56
@morrySnow morrySnow force-pushed the test_bcg branch 2 times, most recently from 2e31cce to eb05d6b January 5, 2026 09:31
@morrySnow
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31194 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6cc98e86392e427a349289135281911d2225de14, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17584	4180	4029	4029
q2	2023	365	232	232
q3	10162	1246	699	699
q4	10226	918	322	322
q5	7516	2108	1828	1828
q6	197	176	142	142
q7	932	796	659	659
q8	9261	1342	1188	1188
q9	4780	4592	4507	4507
q10	6771	1803	1407	1407
q11	533	299	277	277
q12	694	717	609	609
q13	17792	3799	3075	3075
q14	298	290	274	274
q15	580	506	512	506
q16	704	680	614	614
q17	660	786	512	512
q18	6656	6267	6197	6197
q19	1084	956	584	584
q20	411	359	256	256
q21	2968	2372	2307	2307
q22	1039	992	970	970
Total cold run time: 102871 ms
Total hot run time: 31194 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4106	4042	4040	4040
q2	329	387	332	332
q3	2093	2578	2192	2192
q4	1287	1745	1320	1320
q5	4069	4008	4067	4008
q6	210	170	133	133
q7	1854	1846	1663	1663
q8	2645	2536	2401	2401
q9	7355	7306	7081	7081
q10	2549	2676	2297	2297
q11	576	480	468	468
q12	678	757	603	603
q13	3612	4089	3358	3358
q14	291	302	283	283
q15	548	622	517	517
q16	654	675	623	623
q17	1165	1287	1297	1287
q18	7985	7786	7726	7726
q19	872	864	857	857
q20	2002	2069	1934	1934
q21	4806	4491	4152	4152
q22	1093	1029	1004	1004
Total cold run time: 50779 ms
Total hot run time: 48279 ms

@doris-robot

TPC-DS: Total hot run time: 172753 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6cc98e86392e427a349289135281911d2225de14, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4368	582	433	433
query6	334	224	205	205
query7	4223	464	272	272
query8	334	261	241	241
query9	8766	2651	2618	2618
query10	511	375	315	315
query11	15426	14959	14846	14846
query12	175	116	115	115
query13	1260	481	389	389
query14	6242	2993	2745	2745
query14_1	2659	2644	2617	2617
query15	202	195	177	177
query16	997	477	463	463
query17	1108	699	586	586
query18	2477	445	337	337
query19	224	224	199	199
query20	122	119	116	116
query21	212	140	121	121
query22	3982	3814	3973	3814
query23	15931	15496	15353	15353
query23_1	15375	15389	15415	15389
query24	7414	1558	1180	1180
query24_1	1197	1197	1188	1188
query25	569	480	428	428
query26	1256	266	172	172
query27	2754	454	294	294
query28	4563	2150	2146	2146
query29	782	556	474	474
query30	310	248	214	214
query31	825	640	565	565
query32	75	71	70	70
query33	549	351	299	299
query34	900	860	538	538
query35	748	822	706	706
query36	833	919	816	816
query37	130	90	79	79
query38	2769	2718	2702	2702
query39	797	765	728	728
query39_1	701	714	733	714
query40	218	126	114	114
query41	72	61	64	61
query42	105	103	99	99
query43	426	442	432	432
query44	1309	721	726	721
query45	185	183	173	173
query46	833	932	602	602
query47	1428	1503	1320	1320
query48	325	332	242	242
query49	607	410	327	327
query50	643	266	206	206
query51	3763	3839	3764	3764
query52	100	105	92	92
query53	286	323	267	267
query54	277	250	251	250
query55	80	75	77	75
query56	283	286	291	286
query57	981	996	939	939
query58	273	246	261	246
query59	2062	2165	2079	2079
query60	310	315	301	301
query61	156	153	154	153
query62	433	364	313	313
query63	295	265	268	265
query64	4862	1336	975	975
query65	3779	3683	3655	3655
query66	1436	414	301	301
query67	15084	15357	15373	15357
query68	7517	969	709	709
query69	505	345	297	297
query70	985	956	970	956
query71	374	312	271	271
query72	6104	3406	3206	3206
query73	765	725	303	303
query74	8803	8804	8614	8614
query75	2818	2812	2448	2448
query76	3902	1045	635	635
query77	516	367	278	278
query78	9655	9689	9123	9123
query79	1359	908	634	634
query80	615	565	488	488
query81	508	261	228	228
query82	208	145	111	111
query83	264	264	240	240
query84	253	120	102	102
query85	881	501	448	448
query86	394	294	296	294
query87	2899	2839	2742	2742
query88	3767	2224	2230	2224
query89	385	352	324	324
query90	2197	157	159	157
query91	173	163	142	142
query92	82	69	63	63
query93	1689	910	534	534
query94	584	312	309	309
query95	567	327	302	302
query96	590	486	202	202
query97	2303	2373	2291	2291
query98	223	202	197	197
query99	585	597	527	527
Total cold run time: 254529 ms
Total hot run time: 172753 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.56% (16/45) 🎉
Increment coverage report
Complete coverage report

@doris-robot

ClickBench: Total hot run time: 26.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6cc98e86392e427a349289135281911d2225de14, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.09	0.09
query4	1.61	0.12	0.11
query5	0.28	0.26	0.25
query6	1.14	0.66	0.65
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.57	0.50	0.49
query10	0.55	0.54	0.56
query11	0.14	0.10	0.09
query12	0.15	0.11	0.11
query13	0.60	0.58	0.58
query14	0.96	0.96	0.95
query15	0.79	0.78	0.77
query16	0.43	0.40	0.42
query17	1.06	1.05	1.04
query18	0.23	0.21	0.21
query19	1.95	1.87	1.84
query20	0.02	0.01	0.01
query21	15.44	0.27	0.14
query22	5.18	0.06	0.05
query23	15.78	0.30	0.10
query24	2.31	0.66	0.31
query25	0.07	0.08	0.08
query26	0.14	0.14	0.14
query27	0.07	0.06	0.07
query28	4.03	1.05	0.88
query29	12.61	3.86	3.18
query30	0.27	0.13	0.12
query31	2.80	0.65	0.40
query32	3.24	0.56	0.46
query33	3.02	2.99	3.00
query34	16.63	5.10	4.46
query35	4.49	4.53	4.46
query36	0.68	0.50	0.50
query37	0.11	0.07	0.06
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.14	0.13
query41	0.08	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.29 s
Total hot run time: 26.97 s
