Skip to content

Conversation

@GalLalouche
Copy link
Contributor

@GalLalouche GalLalouche commented Dec 11, 2025

This PR fixes a slowness introduced in #132757 that was first encountered on our nightly benchmarks. After much digging, and with @nik9000's invaluable help, the culprit was found to be the usage of what is apparently a very slow functional Stream in what is apparently a very hot path.

Screenshot from 2025-12-11 20-17-49

@GalLalouche GalLalouche added >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL labels Dec 11, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @GalLalouche, I've created a changelog YAML for you.

@julian-elastic
Copy link
Contributor

julian-elastic commented Dec 11, 2025

Can you please explain how and why the fix works?

@idegtiarenko
Copy link
Contributor

Do you have a corresponding after measurement to compare the gains?

@GalLalouche
Copy link
Contributor Author

GalLalouche commented Dec 12, 2025

Can you please explain how and why the fix works?

@julian-elastic Like the comment says: the Stream code was apparently way too slow for this hot path. I admit I was very surprised! I know streams are supposed to be slower than for loops, but not that slow. It seems this function is called for every row read (to know if it should create a new page or not), which is why performance is critical here.

@GalLalouche
Copy link
Contributor Author

GalLalouche commented Dec 12, 2025

Do you have a corresponding after measurement to compare the gains?

@idegtiarenko Unfortunately, I was unable to run the rally tracks locally, as it took forever and then crashed. But running the relevant queries on my laptop was fairly consistent: this fix remove almost all of the slowness introduced by #132757. I say almost, since there are some things which are not as trivial to optimize, caused by moving to a shared context, e.g., the fact that we are iterating over Maps instead of arrays.

@idegtiarenko
Copy link
Contributor

Sounds good, lets see the change after the next nightly benchmark run.
I believe that the change is a good idea to try.

return this.buildersAndLoaders.stream().flatMap(e -> e.values().stream()).mapToLong(bl -> bl.builder.estimatedBytes()).sum();
long sum = 0;
for (var builderAndLoader : this.buildersAndLoaders) {
for (BlockBuilderAndLoader blockBuilderAndLoader : builderAndLoader.values()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: sometimes it is also nice to avoid iterator creation in outer and especially inner loop.
Not sure if it is possible here, since inner is iterating over map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the inner it is indeed impossible, but I've changed the outer to use a plain for.

@GalLalouche
Copy link
Contributor Author

Sounds good, lets see the change after the next nightly benchmark run.
I believe that the change is a good idea to try.

At least for the rally benchmarks, this should reduce the time by almost 50%! So it should be very noticeable.

@GalLalouche GalLalouche enabled auto-merge (squash) December 13, 2025 16:49
@GalLalouche GalLalouche merged commit e4ef300 into elastic:main Dec 13, 2025
35 checks passed
parkertimmins pushed a commit to parkertimmins/elasticsearch that referenced this pull request Dec 17, 2025
…stic#139397)

This PR fixes a slowness introduced in elastic#132757 that was first encountered on our nightly benchmarks. After much digging, and with @nik9000's invaluable help, the culprit was found to be the usage of what is apparently a very slow functional Stream in what is apparently a very hot path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants