-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Vulkan Portability code path on MacOS with MVK extensions #4996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
kd-11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to vulkan are ok, with the exception of minor formatting issues. Add a space between the // and the text in comments to avoid clang-format from messing up the changes. There are existing cases of the old format without leading space, but it was agreed that going forward leading spaces were preferred.
As for LLVM changes, they seem strange to me but neko will ultimately have to review that.
rpcs3/Emu/RSX/VK/VKHelpers.h
Outdated
|
|
||
| extensions.push_back(VK_KHR_SURFACE_EXTENSION_NAME); | ||
| extensions.push_back(VK_EXT_DEBUG_REPORT_EXTENSION_NAME); | ||
| if (support.is_supported(VK_EXT_DEBUG_REPORT_EXTENSION_NAME)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
brace on newline
rpcs3/Emu/CPU/CPUTranslator.h
Outdated
| } | ||
| }; | ||
|
|
||
| #ifdef __APPLE__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this block, but I can't see why unsigned long needed specification since all the sized types are already handled above. Also why return 8-bit representation for 64-bit long? Or is it a compiler quirk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not completely sure either. Apparently, clang treats unsigned long as a distinct type for the template specialization.
Also why return 8-bit representation for 64-bit long? Or is it a compiler quirk?
oops, nice catch! 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not necessary, the issue is incompatibility between u64 and uptr on MacOS (unsigned long long vs unsigned long, same thing but compiler doesn't deal with it properly). Change uptr to uint64_t in types.h and Mac should build fine.
2328: [mtl] expose anisotropy and DXT format features r=grovesNL a=kvark These are required for RPCS3/rpcs3#4996 PR checklist: - [ ] `make` succeeds (on *nix) - [ ] `make reftests` succeeds - [ ] tested examples with the following backends: - [ ] `rustfmt` run on changed code Co-authored-by: Dzmitry Malyshau <[email protected]>
|
I believe the current issues are addressed. |
|
@kvark Any plans to eventually finish working on the NSView display code? I don't have access to relevant hardware to help with that and it would be a shame to have it stick around as dead code. |
| target_link_libraries(rpcs3 "${RPCS3_SRC_DIR}/../3rdparty/discord-rpc/lib/libdiscord-rpc-linux.a") | ||
| elseif(APPLE) | ||
| target_link_libraries(rpcs3 "${RPCS3_SRC_DIR}/../3rdparty/discord-rpc/lib/libdiscord-rpc-mac.a") | ||
| target_link_libraries(rpcs3 "${RPCS3_SRC_DIR}/../3rdparty/discord-rpc/lib/libdiscord-rpc-mac.a" objc "-framework Foundation" "-framework CoreServices") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd appreciate it if these changes were added to the other -framework link_libraries on line 421
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are needed specifically for discord integration, hence I put it this way. I don't mind changing it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, alright it's fine to leave them there then.
|
Swizzle emulation is fixed by #5024. I tested on windows and it worked fine for both ingame render and the UI overlay. Just need to finish the NSView drawing and support should be complete, I'm marking this set as WIP so we can finish this in one go. Are occlusion queries a motlenVK limitation or just gfx-rs? Because some games will never work properly without this and simulation is very difficult. |
AFAIK, the missing code is for an emulated swapchain (confusingly called "native" swapchain...) - to be used when the real Vulkan swapchain is unavailable. It's not supposed to be used most of the time, so I don't think the lack of this path should be blocking the PR.
Amazing, thank you!
They are a limitation of gfx-rs at the moment. I've just finished implementing them in gfx-rs/gfx#2334, passing all the relevant Vulkan CTS, so we just need to land all the things to be ready. |
|
I don't know if Qt does proper hw-swapchain-compatible windows, so you're saying it does actually work? iirc the MVK swapchain extension was not working last time someone tried to get it working. If it does actually work then simply adding the swizzle emulation and this PR should have MVK working. |
When I was testing it last time, swapchain creation went fine on gfx-portability.
IIRC, MVK expects the |
|
@kd-11 I tried your branch with my latest changes in queries and got stuck in the next problem: RPCS3 tries to create a 384MB buffer that has |
|
@kvark For now, yes. Working around this would involve a separate upload buffer and a second buffer copy. As far as I could see the host-visible nature does not degrade performance when used for upload in this fashion so it was optimized around that. How much can you allocate once for host access on metal? Those are just storage heaps, you can make them smaller for testing, with possible out-of-memory situations that will degrade performance. You can set to 128M for example and simple applications will not crash. |
|
Why would a second buffer copy be needed?
The Metal limitation here is not the size (sorry, I should have clarified that). We don't support texel buffer views from shared memory.
… On Aug 23, 2018, at 08:47, kd-11 ***@***.***> wrote:
@kvark For now, yes. Working around this would involve a separate upload buffer and a second buffer copy. As far as I could see the host-visible nature does not degrade performance when used for upload in this fashion so it was optimized around that. How much can you allocate once for host access on metal? Those are just storage heaps, you can make them smaller for testing, with possible out-of-memory situations that will degrade performance. You can set to 128M for example and simple applications will not crash.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
How else do we transfer a large bytestream (even 384M is not large enough sometimes) for random access in the vertex shader? Isn't the only other solution to have an upload buffer and then copy to the actual stream using a GPU-side transfer? By second buffer copy I meant there are 2 copies, one from CPU->buffer1 then buffer1->buffer2. |
|
I'll push a workaround for the texel buffer incompatibility on the textures PR in a few minutes and we can see if it now works. 🤞 |
Oh, I read this as another buffer in memory. If you mean "buffer copy" in terms of operation (e.g. 2 copies), then sure, I understand. Although, CPU could be working with the HOST_VISIBLE memory directly (instead of doing "CPU->buffer1" copy), in which case there is a single copy. Do the contents change a lot between frames? |
|
@kd-11 FYI, our query support is almost landed, pending gfx-rs/portability#134
This is incredible! Thank you for addressing our blockers so promptly! ❤️ |
|
It would be wasteful of vram as the memory content is only guaranteed to remain intact for the immediate near future. This is largely due to design constraints where the cpu is often used to offload vertex calculations on ps3, so you end up reuploading almost the entire graphics memory every frame as it has changed. |
|
Thats wierd, the triangle is supposed to be centered. The colors are correct though so at least we know all the pipeline stages are working. With that I could just merge this PR as is then. How does it run on MVK alone (without gfx-rs) ? |
|
I forgot to mention - the part that is seemingly not applying is the matrix multiplication at the end of the vertex shader. Basically goes something like: The non-inline nature is largely since the code generator is shared with other languages such as HLSL. I'm wondering if for some reason on gfx-rs (and maybe MVK?) the |
Oops. Well, sounds like a low hanging fruit to fix anyway.
Someone still needs to look at the LLVM code (who should we ping?). That path doesn't work, and I'm running RPCS3 in interpretation mode.
Sorry, I'm not interested to debug MVK 😛
That's a great observation, it's going to easy the investigation, thank you! FWIW, we still use the same shader translator as MVK (SPIRV-Cross). I'll check the produced code. |
|
For all things LLVM, one pings his High Kotliness, i. e. @Nekotekina Neko, can you review the LLVM stuff int the PR |
|
@kd-11 I resolved the presentation issue (mentioned earlier) in gfx-rs/gfx#2345
I do see a matrix being applied, but it's a fairly identity-ish matrix. And the buffer it's read from is in Shared memory, so there isn't much to screw up in terms of synchronization. I'll keep looking... |
|
Hi all,
thanks.. |
We'll have them on our official website https://rpcs3.net
I don't really see any good reason why you should be running rpcs3 on wine. It just seems to me like an unnecessary option for the one person who wants to run rpcs3 on wine on MacOS instead of just running the native version. |
|
@hcorion thanks for pointing https://rpcs3.net .. I almost forgot it.. |
|
@kd-11 Alright, I nailed down the issues with the triangle sample (in gfx-rs/gfx#2348), and we are in a much better shape now: The only known issue so far is the crash on exit, happening within |
|
Looking very good! Nice work. Interesting how many issues were caught just trying to get RPCS3 to run, but such is the norm when real-world testing comes into play. The |
|
@kd-11 I'm fine with going through a bunch of blockers if it's all running smooth and dandy in the end :) Your help was crucial here, makes me respect RPCS3 developer team quite a bit. For those curious, I attached the pre-built binaries to gfx-rs/portability#144 |
|
The plan I believe is to distribute the gfx-rs/portability library with RPCS3, if licenses and such work together. |
rpcs3/Emu/RSX/VK/VKHelpers.h
Outdated
| VkResult present(u32 index) override | ||
| { | ||
| auto& src = swapchain_images[index]; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw an assert here, as this is theoretically unreachable since 'init' will fail with 0x0 dimensions
|
All major technical blockers have been dealt with, I have made some minor comments on how to fix the remaining one. With these changes this PR will be mergable. |
|
@kd-11 I updated the code according to your review notes and rebased. There is still an issue with exiting RPCS3 where it crashes (inside |
rpcs3/Emu/RSX/VK/VKHelpers.h
Outdated
|
|
||
| VkResult present(u32 index) override | ||
| { | ||
| assert(!"native MacOS swapchain is not implemented yet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use fmt::throw_exception(string) for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
There have been some reports about crashes when exiting on other systems as well, so it could be that your changes have nothing to do with it. But you can still continue to debug, it might be a seperate issue or you might find a simple bug and fix it for all versions. |
|
@MSuih I did attempt an investigation, but haven't found anything. Are the other reports registered somewhere so that I could at least compare the call stacks? |
|
Don't think so, I've just seen the occasional complaint on Discord. |
|
If its related I'll look into it, unfortunately I do not have a way to recreate this issue at the moment on my end. Merging this set as is, any other issues can be fixed in follow-ups. |



A step towards #4372
This PR enables us to start running with gfx-portability or MoltenVK. It's not enough to make the emulator functional just yet - much depends on maturing of the portability libraries. It also doesn't provide a "native' swapchain path yet (not a priority until we get something working first).
I believe there is value in having the code path open, for curious developers to experiment and advance the state of Vulkan Portability.
Technical issues/blockers I'm aware of (to be extended):
Too many samplers are requested by a pipeline layout. Metal only supports 16 samplers per stage on all devices, while RPCS3 requested 20.Texture swizzling is (almost) not supported.Occlusion queries are missing.Buffers inHOST_VISIBLEmemory withTEXEL_BUFFERusage are requested but not supportedBuffer views size limits are too small and not checked forSurface needs to handle being out of dateMatrix isn't applied properly in vertex shaders.~user_interfaceon exit.r? @kd-11