Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heuristic for pattern matching untagged variants. #7128

Merged
merged 5 commits into from
Oct 26, 2024

Conversation

cristianoc
Copy link
Collaborator

Fixes #7121

@cristianoc cristianoc force-pushed the uncurried-variant-heuristic branch from bdc0d05 to c825694 Compare October 25, 2024 05:48
}

function ff(x) {
if (x === "G" || x === "F" || x === "E" || x === "D" || x === "C" || x === "B" || x === "A") {
Copy link
Collaborator Author

@cristianoc cristianoc Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cknitt see this case that is now worse

This is why I propose a heuristic based on how many cases with, and without payloads exist.
To determine which code to generate.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cometkim do you have a suggestion on how to set the threshold to maximise things like perf/code size etc?
There are these 2 ways of generating code. One is clearly better with a lot of cases without payload (ff here). The other is better when there are very few cases without payload and many cases with payload (e.g. json). The question is how to decide in the middle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cknitt see this case that is now worse

This is why I propose a heuristic based on how many cases with, and without payloads exist. To determine which code to generate.

Yes, sounds good to me, a heuristic-based approach will definitely improve the current situation without making things worse like in the example here. 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this improves the predictability because the output is easier to understand. Untagged variants are not generally expected to be the best performance.

However, from a perf perspective, typeof checks are always slightly better than full literal checks, especially if it allows you to skip other branches.

I value how much more understandable it is rather than the actual code size. If the number of circuits is the same or similar, there is no real difference. It's just that typeof checks tend to be harder to read and understand when they get complex.

There are these 2 ways of generating code. One is clearly better with a lot of cases without payload (ff here). The other is better when there are very few cases without payload and many cases with payload (e.g. json). The question is how to decide in the middle.

Not sure if possible, I'd guess it would be nice to use the current on the default strategy but do the opposite if the check matches the _ pattern.

@cristianoc cristianoc requested a review from cknitt October 25, 2024 05:49
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Syntax Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.05.

Benchmark suite Current: b12f189 Previous: e1b7fb7 Ratio
Parse RedBlackTree.res - time/run 1.3672911466666666 ms 1.2123143266666667 ms 1.13

This comment was automatically generated by workflow using github-action-benchmark.

@cristianoc
Copy link
Collaborator Author

@cknitt I have chosen size of generated code as heuristic, ready for final review.

Btw make test is broken as currently one might need to do touch tests/tests/src/* to make it pick up changes. The change time of those files is not a reliable indicator as changes in the compiler will lead to producing different code even if the sources have not changed.
Also, it would be nice to have a makefile option to just refresh the generated code without having to run all tests too. As that's frequent during development (as make lib used to do implicitly).

@@ -386,11 +386,11 @@ function check$1(s) {
return;
}
let match = s[0];
if (match === true) {
if ((match === null || match === true || match === false || match === undefined) && match === true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cristianoc Thanks a lot for looking into this so quickly!
Most cases look better now. But this one still feels a bit weird as the lhs of the && is obviously unnecessary.
Is it possible to improve it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and it's not related to this PR.
You're asking for a back end code simplification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I think this PR is ready to go and we can track the above case in a different issue. 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with_literal_cases:

(e == Null) || (e ==true) || (e == false) || (e == Undefined)

without_literal_cases:

!(e instanceof Array) && ((e == Null) || (typeof e != Object)) && (typeof e != Int) && (typeof e != String)

So it is true that it generates the smaller expression (the former).
The second one must be simplified in the back-end, and not the first one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added extra simplification directly to this PR.

@@ -386,11 +386,11 @@ function check$1(s) {
return;
}
let match = s[0];
if (match === true) {
if ((match === null || match === true || match === false || match === undefined) && match === true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I think this PR is ready to go and we can track the above case in a different issue. 👍

@cristianoc
Copy link
Collaborator Author

@cknitt ready for a final review.

Copy link
Member

@cknitt cknitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing! Thanks again! 🙇

@cristianoc cristianoc merged commit 60b99ea into master Oct 26, 2024
20 checks passed
@cristianoc cristianoc deleted the uncurried-variant-heuristic branch October 26, 2024 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inefficient code emitted for pattern match on unboxed variants
4 participants