Minutes 2022 09 28

GPU Web 2022-09-28

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
“Tacit Resolution” Queue
Override constant should not provided by pipeline as NaN or Inf #3435
createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296
mapAsync(WRITE) zero-fills the relevant region of the buffer #2926
Frame advancement of canvases that aren't visible #3295
(stretch) Require effective storage buffer binding size be a multiple of 4 #3477
(stretch) Passive Fingerprinting Surface #3101
Agenda for next meeting

Attendance

Apple
- Myles C. Maxfield
Google
- Brandon Jones
- Corentin Wallez
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
Microsoft
- Rafael Cintron
Mozilla
- Kelsey Gilbert

Administrivia

CW: Postpone APAC meeting for China National Day.
- Oct 13-14 now.

CTS Update

Gyuyoung continuing work on validation tests
Continued finding more validation rules that weren't tested
Working on more operation tests
Ryan found issues with pack16x2 undefined behavior on some backends

“Tacit Resolution” Queue

Throw an exception when using a wrong-sized array for the array overload of GPUExtent/Origin/Color. (Add extent/origin/color "reification" step #1838)
- BJ: brief background - doesn't seem like it'd be a problem to take in an array of any size larger than the amount, but in Chrome's bindings, you may copy a much much larger array as it's passed into the native side. Perf cliff we don't want people to run into.
- Either - write a lot of custom bindings code, or make it a validation error and have people do the right thing all the time.
- BJ: we'd potentially say this is a valid use case, but provide a performance warning - seemed more straightforward to make it a validation error. Can always loosen it in the future.
- KG: less mandatory validation better in general, but this is fine.
- CW: now waiting for a PR.
Don’t change object label behavior, the label getter returns the last label from that JS wrapper even if there are other JS wrappers for the same object. The way the browser actually uses the label is completely implementation-defined. (Object labels require client-shared state #3263) We can update the non-normative note to give an example of how labels might be used.
getBindGroupLayout already returns a new JS wrapper each time. With #3263, it no longer cares whether it returns the same underlying object or a new one. (Does getBindGroupLayout return the same underlying object or a new one? #3454) We can reland #3341 so the spec doesn’t have to have a concept of “same underlying object” yet.

Override constant should not provided by pipeline as NaN or Inf #3435

KG: Sort of like run-time clamp Web IDL extra attribute?
MM: unlike garbage-in-garbage-out, this is more throwing an exception. No specific preference.
CW: what about doing clamping behavior?
KG: there's no Web IDL for it - it's a JS number (double). i32, u32, f32, f16. Should we clamp the f64 to those value ranges? Or throw TypeError?
MM: for uploading vertex buffer we don't look at all uploaded data. If we did this we'd be inconsistent with other data import mechanisms to WebGPU.
KG: that'd lean towards unrestricted floats. Doing this not in Web IDL bindings code, but ourselves. Could warn.
MM: similar discussion in WGSL. Math expression that's nonsensical, should that be an error?
CW: Kelsey's point - constexpr stuff - pipeline overridable consts less dependent on runtime bugs.
KG: less dependent than compile time, more dependent than runtime. Issues like, division by 0 regarding size of a value.
MM: WGSL direction - for constexprs that are nonsensical - there's a point on the CPU where we inspect the values, so do the check there. Similar situation here, we have a place we have to look at the value.
KG: there'll be content that does this some of the time. may make browsers run more inconsistently. If we do the strict thing - developers will come back to us. And they can clamp themselves. I'm satisfied with making this strict.
KN: we expect this conversion to be done in the GPU process. So this'd be a validation error, not type error. Also - we have to check the values no matter what we choose. Static NaNs in overrides - platform compiler could invoke undefined behavior. Have to sanitize the inputs; question is how.
KG: good point.
MM: starting conservative sounds good.
CW: consensus on validation error?
KG / MM: yes.
KN: given that - one other place in the API with values coming in - clear value, converted to WGSL value. Currently spec'd using conversion to unrestricted. For consistency - do both? No strong feeling. We know it's a color. In this situation we don't know what the app will do with it.
MM: consistency's good here. App wanting to clear buffers to NaN - what do you do next? How do you do a blend?
KG: if someone comes back to us later asking for this - we can consider then.
KN: clear color / depth aren't like this. Won't need to sanitize them. Pass in garbage - it'll be fine. Can't pass in NaN due to IDL rules, but can pass in out-of-range values. Those will get clamped.
KG: would argue we should warn. I prefer making it an error.
KN: OK, can change both.
CW: OK, have consensus on this too.

createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296

CW: not sure where we left off last time. Everything's similar & polyfillable - I think we should keep what we have. No pipeline introspection we need to do in the error case.
MM: fact that we're using Promises but never rejecting them if they don't complete successfully is a problem that we should fix, and that we have the opportunity to fix it now.
KG: is this a problem? Main utility is that you can await on it. Fine pattern, I use it in my own code.
MM: interested where developer wants to know compilation was successful.
KG: what if you awaited for it to be compiled, and then checked a field on it?
MM: possible. Reason I bring this up - we have a mechanism, in the Promise. Histories of Promises rejecting when things break. We would not be following that pattern.
KR: Developers are going to have error handling code, if they want it, for synchronous creation. The value of this is that you find out when it’s ready. I think we should leave this alone so the error handling between the promise and non-promise versions are the same.
GT: depending on your POV - Fetch doesn't reject Promises. Have to look at the error code.
MM:
KN: depends on what you mean by failure.
KG: 404 means "successfully received 404" - different from network error.
KG: think this needs concrete proposal.
CW: if mod to spec is to say, instead of resolve with ErrorPipeline or null, we reject - sounds fine. I don't have an opinion about that. No change is better IMO, but this is a fine level of change.
KN: I forgot what we last discussed. I agree with both Myles' and Ken's concerns. They're in conflict with each other. Myles has an example of error handling code - kind of nasty where we don't reject the Promise. But I don't like having separate error handling code for Promises and non-Promises. Personally - I think it's better to reject the Promise, because people can write different error handling code for the two cases. The code you'd write to handle errors in async case would get weird, and people would have a hard time doing it correctly.
KN: I proposed eliminating createPipelineAsync, and adding a "ready" attribute. Difference between ready true and false - reportReady(true), impossible to use the pipeline. Propose reshaping API on top of existing semantics. No 'join' operation where we stop the queue and wait for pipeline creation to finish. There's a FR for this, but not easy to implement. Not proven it's necessary. Idea here- if not ready, you get an exception. No complex handling code in WebGPU impl except at very front, in bindings code. I think that'd solve both problems. If people don't want to do that then rejecting the Promise sounds good to me.
KG: I really like just having createPipeline always async with a stall if you use it before ready.
CW: proposal would work but I'm concerned about major changes like this at this point. We can do it, but it'll take weeks of impl.
KR: Don’t think we should move to a polling-based API based on experience with KHR_parallel_shader_compile. First thing people asked for was a Promise based API.
MM/BJ: Not polling based. pipeline.ready would be a Promise<void>
KG: ideally I'd want "await compilation result info" or similar. Think it falls nicely out of Kai's proposal, but happy
KN: my proposal fits nicely into that in the future.
MM: would compilation failures cause the pipeline.ready Promise to reject?
KN: my proposal'd be no. To make it more symmetric with non-async version.
BJ: I have concerns about changing the API shape at this stage. Willing to do it if we think it'd address critical problems. Kai said - if you use this variant of createPipeline, and pass the pipeline to e.g. setPipeline before ready has resolved - you'd get an error?
KN: you'd get JS exception. Track state on JS side.
BJ: unfortunate if we're taking initiative to redo this API shape (not convinced we should do this) - concern about createPipelineAsync was no way to join with sync version of the call. Feels to me - if this thing'll synchronously give me back a pipeline handle, and tell me when it's ready to go - but having it fail if you pass it into one of the other APIs - it's a real issue we've heard from other developers. Tragedy to me if we refactor the API, give you back pipeline handle - and then don't do what devs have asked.
KN: could do this after V1, but wouldn't need to do it now. Exception now doesn't mean an exception forever.
KG: I'm more comfortable with validation error than exception.
KN: value of exception - it's more straightforward to implement on top of what we already have. In Chrome we can do it - we can inject validation errors when we want.
KR: I feel that this design is fragile - get an object that may or may not work based on the timing with which you use it. You might do the wrong thing and get a correct answer on some implementations.
KG: Doesn’t have to be racy. We can know whether you awaited the promise.
KR: But don’t know on the JS side whether you’ve passed the point where it’s ready.
MM: imagine pipeline.ready.then(() => ()). Then a few frames later the app assumes Promise is resolved.
KN: my proposal - you can't use the thing at all until JS figures out it's ready. Couldn't race with GPU process, but could race in JS.
KR: I suggest tabling Kai's proposal and sticking with the current API shape.
Discussing rejecting Promise as the consensus.
KG: not ready to reject all proposals.
KN: what do you think the consensus point is, roughly?
KG: would prefer to write it down later today.
KR: we were close to consensus - either reject Promise from createPipelineAsync, or don't reject it. Can we not get back to that simpler state and just make a decision?
More discussion.
KN: is your objection to the proposal to reject - just that you think there's another proposal that'll work better?
KG: think it's not me being the main person saying no to this - think there are couple of people who have other opinions.
KN: I think Ken's absolutely right - you could use the pipeline in a flaky way.
KG: Specifically I can agree directionally [that it’s at-all flakey] but I don’t agree in magnitude: I don’t think it’s a deal-breaker [vs other benefits of this proposal].
CW: can't make more progress on this today. Let's discuss offline.
MM: Ken's opinion on flakiness was convincing to me.
KN: me too.
KG: OK, maybe it's me.

Require effective storage buffer binding size be a multiple of 4 #3477

CW: on D3D12, runtime has validation - when you impl storage buffers using byte-addressed buffers (needed for WebGPU), buffer size has to be multiple of 4 (because of using R32_TYPELESS). Add validation so this is implementable on D3D12.
MM: sounds legit.
CW: only issue - f16, have to write 8 instead of 6.
MM: counterproposal is to round up silently?
CW: yes. Array length in WGSL would be computed incorrectly in HLSL. Array length passed as additional thing in root signature. Dynamic array of f16 - HLSL would see length of 4 instead of 3.
MM: not lot of sympathy - Metal has to pass these lengths out-of-band too - but a little sympathy.
CW: can we add the validation of a multiple of 4?
MM: particularly interested in Microsoft's opinions.
RC: no informed opinion about this issue, but will attempt to get one from others.
CW: let's revisit next week.
MM: if no strong opinions - fine to make the change.
CW: sounds good.

mapAsync(WRITE) zero-fills the relevant region of the buffer #2926

MM: was implementing buffer mapping. Should the zero-filling of ARrayBuffers exposed to JS happen in mapAsync, or getMappedRange? Did add'l investigation - the spec doesnt' say to do zero-filling at all. Says data in GPU buffer should be visible in the ArrayBuffer, When mapping for write, can read data in the GPU buffer. Short moment of horror. 2 ways to impl - copy data from GPU buffer to ArrayBuffer (e.g. on discrete GPU) - harmful. App's told us they want to write. Copying back - unnecessary. Just in case they accidentally read some data.
MM: other impl strategy - don't copy the data, have shadow buffer on CPU. Copy of buffer on the GPU. Maintain with another validation rule. If buffer map writable, can only get data into it via map_write. No storage buffer.
MM: second impl strategy fro mapWrite - take shadow buffer, return pointers to it. JS does reading/writing - then copy to real GPU buffer.
MM: downside - you have another buffer, same size as GPU buffer, same lifetime. Unfortunate.
MM: third option - what I thought behavior would be - ArrayBuffers exposed to JS would be zero-filled. Map for writing, look at it - returns 0. Someone has to set the zeros. Either GPU does blit, or GPU has to run memset.
MM: interesting design space. Wrote long post. Makes judgment claim - doing zero-fill on GPU with parallel DMA, cost would be smaller than 2-3x memory use for each buffer.
KG: thanks for writing this up. Useful. Worried about - having zero-fill be fast is imp't, but zero-filling is memory bandwidth cost. Direction my concerns come from.
BJ: MM and I discussed at Editors' meeting. Feels like path where you're returning the content of the buffer itself - though we can narrowly make assumptions (write-only, will never change aside from JS) - allows for an optimization path. Uses more memory in favor of doing readbacks. Keeping shadow buffers around to avoid readbacks. Maybe good solution for some devices, but not for all. Unfortunate to have API design funneling you into only one optimization path. If you have policy of zeroing out buffers - feels more flexible in terms of optimization opportunities. Changes general optimization path to favor memory usage over write bandwidth.
CW: out of time. Thank you Myles for the summary. You have great understanding of all these things having looked at them so much now. +1 on what Brandon said. Caveat - main issue is, triply mapped buffers may be hard to get working in all cases. Optimizing for that case is scary. Haven't tried making it work. Microsoft's warned us about difficulties. There are tradeoffs. Continue discussion next week.
MM: for context - we're expecting every mapWrite buffer on Apple Silicon to be triply mapped. Also on Intel. And maybe on other systems.

Frame advancement of canvases that aren't visible #3295

Passive Fingerprinting Surface #3101

Agenda for next meeting

Require effective storage buffer binding size be a multiple of 4 #3477

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2022 09 28

GPU Web 2022-09-28

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Override constant should not provided by pipeline as NaN or Inf #3435

createPipelineAsync() doesn't reject its promise, even if the created pipeline is invalid #3296

Require effective storage buffer binding size be a multiple of 4 #3477

mapAsync(WRITE) zero-fills the relevant region of the buffer #2926

Frame advancement of canvases that aren't visible #3295

Passive Fingerprinting Surface #3101

Agenda for next meeting

Clone this wiki locally