Implement JIT-to-JIT calls #109
Love the ambition, excited to see how it performs!
I finished up the implementation using
Just arrived at the airport hotel in Narita. Getting settled and flying to Matsuyama tomorrow :)
Wow, go Kokubun! Awesome that we can indeed beat YJIT on this microbenchmark with fast JIT-to-JIT calls! For performance I think we should avoid manipulating the CFP in JIT-to-JIT calls, to be maximally efficient. We can recover callees and unwind the stack using C stack pointers instead. We can still merge this PR without that change though.
// Recursively compile callee ISEQs
while let Some((branch, iseq)) = branch_iseqs.pop() {
    // Disable profiling. This will be the last use of the profiling information for the ISEQ.
    unsafe { rb_zjit_profile_disable(iseq); }

    // Compile the ISEQ
    if let Some((callee_ptr, callee_branch_iseqs)) = gen_iseq(cb, iseq) {
        let callee_addr = callee_ptr.raw_ptr(cb);
        branch.regenerate(cb, |asm| {
            asm.ccall(callee_addr, vec![]);
        });
        branch_iseqs.extend(callee_branch_iseqs);
    }
}
We want to avoid this long term. @tekknolagi had suggested using a trampoline to implement calls. The advantage is that this permits each ISEQ to only get compiled when its call threshold is hit.
Still open to merging this PR, but we should move towards using a trampoline for ISEQs that were not yet compiled.
Yeah sure, we can leave a trampoline to lazily compile callee ISEQs.
I'd like to land the changes in this PR first and work on it in a separate PR though.
Sure, we can land the changes in this PR first 👍
I really appreciate the work that you put into this :)
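The lazy compilation idea from this thread can be modeled roughly as follows. This is only a sketch under stated assumptions, not ZJIT code: `IseqId`, `CallTarget`, and `LazyCompiler` are hypothetical names, code addresses are faked with integers, and "compiling" just hands out the next fake address. It shows the one property mentioned above, that a callee is compiled only once its own call threshold is hit, and that calls go through the trampoline until then.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an ISEQ handle (not a ZJIT type).
type IseqId = u32;

/// What a call to a given ISEQ currently resolves to.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CallTarget {
    /// Not compiled yet: the call still goes through the shared trampoline.
    Trampoline,
    /// Compiled: the call can jump straight to this (fake) code address.
    Compiled(usize),
}

/// Toy model of threshold-driven lazy compilation.
struct LazyCompiler {
    call_threshold: u32,
    call_counts: HashMap<IseqId, u32>,
    compiled: HashMap<IseqId, usize>,
    next_addr: usize,
}

impl LazyCompiler {
    fn new(call_threshold: u32) -> Self {
        LazyCompiler {
            call_threshold,
            call_counts: HashMap::new(),
            compiled: HashMap::new(),
            next_addr: 0x1000,
        }
    }

    /// Models one call to `iseq`. Already-compiled callees are returned
    /// directly; otherwise the call is counted, and the callee is "compiled"
    /// only when its count reaches the threshold.
    fn call(&mut self, iseq: IseqId) -> CallTarget {
        if let Some(&addr) = self.compiled.get(&iseq) {
            return CallTarget::Compiled(addr);
        }

        let count = self.call_counts.entry(iseq).or_insert(0);
        *count += 1;

        if *count >= self.call_threshold {
            // Assumption: compilation is modeled by allocating a fake address.
            let addr = self.next_addr;
            self.next_addr += 0x100;
            self.compiled.insert(iseq, addr);
            CallTarget::Compiled(addr)
        } else {
            CallTarget::Trampoline
        }
    }
}
```

With a threshold of 2, the first `call(7)` returns `CallTarget::Trampoline` and the second returns a `CallTarget::Compiled` address, while an unrelated ISEQ such as `8` still starts at the trampoline. In a real JIT the transition would also patch the call site so later calls skip the trampoline entirely; that step is left out here.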
maximecb left a comment
I want to give others a chance to review the PR as well. Impressed you got it working this fast.
Yeah I agree. We should probably make
I've been leaning towards that too. Logically, it seems like self should be a method argument. We can also have logic (later on) to simply not pass function arguments that are not used by the callee when we know the identity of the callee (call direct), so arguments that are not used can be "free" in many cases.
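As a rough illustration of the "unused arguments are free" idea, here is a minimal sketch, assuming `self` is treated as argument slot 0 and that the compiler knows which parameter slots the callee reads. `CalleeInfo` and `prune_args` are hypothetical names for this example and are not part of ZJIT.

```rust
/// Hypothetical summary of a known callee: `used_params[i]` is true when the
/// callee body reads parameter slot `i` (slot 0 standing in for `self`).
struct CalleeInfo {
    used_params: Vec<bool>,
}

/// For a direct call to a known callee, keep only the arguments whose
/// parameter slot is actually used. Unused arguments are never passed.
fn prune_args<T: Clone>(args: &[T], callee: &CalleeInfo) -> Vec<T> {
    args.iter()
        .zip(callee.used_params.iter())
        .filter(|(_, used)| **used)
        .map(|(arg, _)| arg.clone())
        .collect()
}
```

For a callee that reads only slots 1 and 3, `prune_args(&["self", "a", "b", "c"], &callee)` yields `["a", "c"]`. This only works when the callee's identity is known at the call site, and the callee would have to be compiled against the same pruned argument layout.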
* Implement JIT-to-JIT calls
* Use a closer dummy address for Arm64
* Revert an obsoleted change
* Revert a few more obsoleted changes
* Fix outdated comments
* Explain PosMarkers for CCall
* s/JIT code/machine code/
* Get rid of ParallelMov
This PR implements JIT-to-JIT calls using call/ret instructions.
Benchmark
x86_64
arm64
Generated code