-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add subgroups feature #4963
base: main
Are you sure you want to change the base?
Conversation
* API * new feature: subgroups * new properties for adapter info: * minSubgroupSize * maxSubgroupSize * WGSL * new built-in values * subgroup_invocation_id * subgroup_size * subgroup and quad built-in functions * new uniformity diagnostic subgroup_uniformity
Previews, as seen when this build job started (95fceec): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main question is whether we want a new GPUAdapterProperties object or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Added a bunch of comments but they are relatively minor.
operating in non-uniform control flow and device compilers often aggressively | ||
optimize such code. | ||
The result is that the subgroup may contain a different set of active | ||
invocations than the shader author expects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe link in to the new uniformity analysis again, telling authors that there is more portability if they abide by them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is noted below, but IDK that shader authors would go read ### Subgroup Operations ### {#subgroup-ops}
itself instead of the intro + definition of builtins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The surrounding text was rewritten, please take another look.
|
||
A [=trigger/subgroup_uniformity=] [=diagnostic=] is [=triggered=] if | ||
`delta` is not a [=uniform value=]. | ||
An [=indeterminate value=] is returned if `subgroup_invocation_id + delta` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand how one would use this and what's the benefit compared to a subgroupShuffle
. It look like a uniform delta would be useful to do a shift, but then the highest active invocation would get an indeterminate value (what we'd want would be a rotation). Should we instead expose some form of shuffleRotate that is easy to use correctly when in subgroup-uniform control flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is likely much cheaper on some implementations than a shuffle. I guess it depends on use case. I don't know all the best use cases for this, but some sort of iterative compaction wouldn't care about indeterminate values.
There is a rotate primitive in MSL and SPIR-V, though support in SPIR-V seems rather sparse (VK_KHR_shader_subgroup_rotate is ~8.3% supported). It could be polyfilled, but it would likely also be more expensive than a pure rotate on some hardware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No further comments on the API side
* move minSubgroupSize and maxSubgroupSize into GPUAdapterInfo * provide default values * Fix link to wgsl spec * Fix uniformity rules in wgsl related to parameters they are required to be uniform
@@ -1888,6 +1889,8 @@ interface GPUAdapterInfo { | |||
readonly attribute DOMString architecture; | |||
readonly attribute DOMString device; | |||
readonly attribute DOMString description; | |||
readonly attribute unsigned long minSubgroupSize; | |||
readonly attribute unsigned long maxSubgroupSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we can come up with better names, but even with this is in GPUAdapterInfo, I find it really confusing that these sound like limits on what the application can do, rather than bounds on what the implementation can do.
subgroupSizeLowerBound
subgroupSizeUpperBound
??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally don't find it problematic, but I'm open to suggestions. The bounds don't seem that different to me. What about minSupportedSubgroupSize
and maxSupportedSubgroupSize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maxSupportedSubgroupSize
still doesn't solve the problem. It still sounds like the programmer tries something and the system accepts or denies. Also "supported" is used in the name of GPUSupportedLimits
to bound the things the programmer can do.
Second, in future the causality will be flipped around.
We're likely to add a future feature to let the programmer control the subgroup size. There are good reasons to do so, it's been requested (#3950), and more recent D3D and Vulkan can support it. It's not in this round because Metal can't. See the TODO at https://github.com/gpuweb/gpuweb/blob/main/proposals/subgroups.md#gpu-feature
I think the current names are good. (Also, it helps that Vulkan uses the same names. No reason to make a new name for the same concep.t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... interesting. I wonder if we would want to add real (configurable, enforced) limits when that happens, but I'm guessing that would not actually work, if I'm correct in my understanding that the subgroup size ranges of different devices are disjoint.
Is it possible that the limits on a user-specified subgroup size would ever be different from the bounds on what automatic subgroup size an implementation will produce? If that's possible then we would need to expose both (regardless of whether the new one is a limit or an info property), and they shouldn't have the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Vulkan feature is accessed at pipeline creation and the D3D feature is a shader attribute.
Vulkan has 2 pipeline creation flags:
- one allows varying subgroup sizes
- one requires full pipelines
Vulkan also has a pipeline creation structure to specify a specific subgroup size. It is not compatible with the varying subgroup size flag, but the full pipeline bit is optional.
D3D can specify a required subgroup size via an attribute in SM6.6.
The usual case is that the shader author has specialized versions of the shader for specific subgroup sizes.
So this additional functionality isn't really limits. It would be more like a shader/pipeline value to specify a specific subgroup size. That value could be validated against the adapter info being added here (otherwise you'd expect failed compiles).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WGSL side looking very good. Thanks!
Co-authored-by: Kai Ninomiya <[email protected]>
* Mention subgroup_size is uniform in compute shaders in the uniformity analysis description of built-in values * Updates to the subgroups section * Fix typos and links
* add notes about uniform control flow scope * add prefix to inclusive and exclusive subgroup op descriptions
WGSL 2024-11-19 Minutes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with wording nit.
Thanks!
Missing subgroup_branching diagnostic and uniformity updates.