ZJIT: Use Vec instead of HashMap for profiling #13809
Merged
ZJIT: Pre-size all profile entries for number of operands
commit c00566de23a1aeed7a74c7675d961b1371511d54
I didn't quite understand how the pre-sizing is helping here. You're still giving `vec![]` as the initial value to each vector and then resizing each one.

If we resize the vector only when we insert to its index for the first time, it's probably not slower, is it? That way we can skip resizing the vector for unused instructions, so not doing the pre-sizing could be faster and save memory, IIUC.
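Roughly the shape I mean, with placeholder names and types rather than the real ZJIT profiling structures:

```rust
// Placeholder layout: one Vec of per-operand slots per instruction index.
struct IseqProfile {
    operand_types: Vec<Vec<u8>>, // u8 stands in for whatever a profiled Type is
}

impl IseqProfile {
    fn profile_insn(&mut self, insn_idx: usize, num_operands: usize) {
        let types = &mut self.operand_types[insn_idx];
        // Grow only on the first profile of this instruction; later profiles see
        // a non-empty Vec, and instructions that never get profiled never allocate.
        if types.is_empty() {
            types.resize(num_operands, 0);
        }
        // ... merge the observed operand types into `types` here ...
    }
}
```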
This pre-sizing happens once per ISEQ per VM. The previous version checked on the hot path, once per profile per opcode per ISEQ, whether the vec was the right size. It was an additional 100-200ms of savings, I think.
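A rough sketch of what the pre-sizing amounts to, with invented names; the point is that the per-entry allocations all happen here, once per ISEQ, instead of on the profiling hot path:

```rust
// Hypothetical constructor: size every entry for its instruction's operand count
// up front, once per ISEQ, so the hot profiling path never checks or grows a Vec.
fn new_profile(operand_counts: &[usize]) -> Vec<Vec<u8>> {
    operand_counts
        .iter()
        .map(|&n| vec![0u8; n]) // one allocation per instruction, done at setup time
        .collect()
}
```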
I get that the current version is faster, but I'm only suggesting the same speedup could be achieved without pre-sizing.

It seems like the slowness in the previous version came (not from "checking" it but) from the fact that the vector was optional and we ended up using `clone()` every time to upsert `Some()`. But now that it's a non-optional vector and you can always get `&mut` of it, you should be able to check if the vector length is 0 and resize it on the first profile, without paying an allocation overhead on future profiles for the same opcode. Checking the length of a vector should be a negligible overhead compared to cloning (or resizing) a vector.

So I didn't get why it was changed to pre-sizing, but since it's already an improvement, I'm fine with merging this as is for now.
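For comparison, roughly the clone-heavy pattern I mean from the previous version, with invented names (the actual old code may have looked different):

```rust
use std::collections::HashMap;

// Old shape (roughly): looking up the entry yields an Option, so the update path
// ended up cloning the Vec and reinserting it on every single profile.
fn old_upsert(profile: &mut HashMap<usize, Vec<u8>>, insn_idx: usize, num_operands: usize) {
    let mut types = profile.get(&insn_idx).cloned().unwrap_or_default();
    types.resize(num_operands, 0);
    profile.insert(insn_idx, types); // clone + reinsert every time
}

// Suggested shape: a plain Vec always hands out &mut, so a cheap length check is
// the only cost after the first profile of an instruction.
fn lazy_upsert(profile: &mut Vec<Vec<u8>>, insn_idx: usize, num_operands: usize) {
    let types = &mut profile[insn_idx];
    if types.is_empty() {
        types.resize(num_operands, 0);
    }
}
```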
I'll remove the second commit for now
Hmm, reverting it seems to slow it down by 5% or so. I guess doing multiple allocations at once helps? Since it does give a speedup in benchmarks, I'm actually fine with not reverting it.
I'll probably instead change profiling to not use `Type` but do a `ClassDistribution`-type thing of `(VALUE class, shape_id_t shape)` tuples. Is there a clean way of representing that in one small data structure?
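Something along these lines is what I'm imagining; the name `ClassDistribution`, the fixed capacity, and the field layout are all just a sketch:

```rust
const DISTRIBUTION_SIZE: usize = 4; // assumed small fixed capacity

// Sketch of a fixed-size distribution of observed (class, shape) pairs with a
// count per pair and an overflow counter; VALUE and shape_id_t are stood in for
// by plain integers here.
#[derive(Default)]
struct ClassDistribution {
    entries: [(u64, u32, u32); DISTRIBUTION_SIZE], // (class, shape, count)
    len: usize,
    other: u32, // observations that didn't fit in `entries`
}

impl ClassDistribution {
    fn observe(&mut self, class: u64, shape: u32) {
        for entry in &mut self.entries[..self.len] {
            if entry.0 == class && entry.1 == shape {
                entry.2 += 1;
                return;
            }
        }
        if self.len < DISTRIBUTION_SIZE {
            self.entries[self.len] = (class, shape, 1);
            self.len += 1;
        } else {
            self.other += 1;
        }
    }
}
```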
There's no such thing in CRuby itself. YJIT built its own variable-length encoding of profiling data in `Context::encode()`/`Context::decode()`. We probably need something different from YJIT's, so I guess we'll build one for ZJIT too.

Btw, shapes are effectively used only for ivars today, so it's probably inefficient to use a tuple that contains a shape for every instruction.
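One conceptual way to avoid carrying a shape for every instruction would be to record it only for ivar accesses, e.g. (purely illustrative, not existing YJIT/ZJIT code):

```rust
// Illustrative only: most instructions would record just a class, and only ivar
// accesses would carry a shape_id alongside it.
enum ProfiledOperand {
    Class { class: u64 },                  // stand-in for VALUE
    IvarAccess { class: u64, shape: u32 }, // stand-in for VALUE + shape_id_t
}
```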