GPU Web 2020 06 08
Chair: Corentin
Scribe: Ken / Austin
Location: Google Meet
Tentative agenda
- Clarify t2t copy same texture #823
- Development-only API features #814
- Copying of depth24plus formats #652 #789
- Texture view format reinterpretation #744
- Add minimumBufferSize to BGL entry #678 (if we have numbers)
- PR burndown
- Agenda for next meeting
Attendance
- Apple
- Dean Jackson
- Myles C. Maxfield
- Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- Idan Raiter
- Kai Ninomiya
- Ken Russell
- Microsoft
- Damyan Pepper
- Rafael Cintron
- Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
- Matijs Toonen
- Mehmet Oguz Derin
- Timo de Kort
- CW: Please add items to the agenda for the virtual F2F.
- Development of API features, how we do extensions, etc. Agenda will be linked in the minutes.
- MM: The F2F is scheduled for Monday and Tuesday, June 22nd and 23rd. Those are also the first days of WWDC. I’m not actually suggesting moving it, but I wanted to let people know. We can come back to this in a week to decide.
- CW: Okay, we can revisit.
- KR: If we scheduled for the week after with the US holiday on Thurs and Fri, could we compress the schedule a bit? WebGPU Mon/Tue, WebGL Wed?
- CW: Sure that could work, and if we need more time we can schedule the week after.
Clarify t2t copy same texture #823
- CW: Jiawei wrote the validation rules for t2t copies and said “the subresources written to must be disjoint”. There was ambiguity in what that meant: must the actual subresources be disjoint, or just the set of texels written to? Disjoint texels is what Vulkan allows, but D3D12 does not allow copying from one subresource to the same subresource. Do we say you can copy within the same texture as long as the regions are disjoint, or only if the baseArrayLayer is different?
- MM: Why don’t we take the union of all the restrictions?
- CW: Taking the union of the restrictions: because you have to implement copying of multiple array layers with a loop in D3D12, the from/to restriction only applies when the copy origin’s z is the same for both texture copy views.
- MM: can we put that in the spec?
- KR: The use case I’ve heard of is mipmap generation. I think the D3D12 restriction is fine: that you must copy from one distinct layer to another.
- CW: The question Jiawei has is: what if you copy from the left side of array layers 0 and 1 to the right side of array layers 1 and 2? The D3D12 implementation, given that you have to copy with a loop, doesn’t have the problem.
- MM: If 0 goes to 1 and 1 goes to 2, then 0 gets broadcast to both.
- CW: left side of 0 -> right side of 1, left side of 1 -> right side of 2.
- AE: the restriction makes sense to me. Different origin Z to start.
- DM: I think we should restrict it further: the subresources shouldn’t intersect at all, not just the base subresource. I’m concerned about resource states and image layouts. If the same subresource both receives data and is copied from, you end up using suboptimal resource layouts.
- CW: in Vulkan you don’t do a loop, and the driver does the optimal thing.
- DM: But you have to make sure it’s not in both the TRANSFER_SRC_OPTIMAL and TRANSFER_DST_OPTIMAL image layouts at once.
- AE: If the user wanted this, would you want them to implement the loop themselves and transition things from transferSrc to transferDst, etc.?
- CW: maybe allow staggered copies, but not with the same base layer? I don’t see a case where they’d want to copy with the same base offset into the same texture.
- DM: it’s a weird use case and I don’t think we want to allow that.
- CW: Can we resolve that the sets of subresources have to be disjoint?
- RC: does this copying let people resize things?
- CW: no.
- MM: doesn’t that mean you can’t use this function to create mipmaps?
- CW: it’s a copy texture to texture, so 1 texel = 1 texel.
- MM: so you can’t use this to generate mipmaps.
- KR: Guess I was confused about what this supports. Overall, I don’t think we need to worry about copying from the right side to the left side of the same texture level.
- MM: Guess we can add the restriction, and if we don’t need it we can remove it. Sounds like there isn’t a compelling reason to not have the restriction.
- CW: Resolved: let’s have the bigger restriction; the copied subresources must be fully disjoint.
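For concreteness, here is the kind of copy the resolution rejects. This is a minimal sketch assuming the copyTextureToTexture shape in the spec at the time; the dictionary member names are illustrative, not authoritative:

```js
// Same-texture copy where the source subresources (layers 0..1) and the
// destination subresources (layers 1..2) overlap in layer 1. Under the
// resolution above this is a validation error, even though the texel
// regions themselves (left half vs. right half) are disjoint.
const encoder = device.createCommandEncoder();
encoder.copyTextureToTexture(
  { texture: tex, mipLevel: 0, origin: { x: 0,   y: 0, z: 0 } },  // source
  { texture: tex, mipLevel: 0, origin: { x: 128, y: 0, z: 1 } },  // destination
  { width: 128, height: 256, depth: 2 },  // depth = 2 array layers
);
```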
Add minimumBufferSize to BGL entry #678 (we have numbers!)
- IR: We implemented minBufferSize as optional. If the user doesn’t supply it, we check buffer sizes at draw time; if they do, we check them when the bind group is created.
- IR: We do see a small increase in validation times. With around 16 buffers to check, validating the commands takes ~25% longer if the user doesn’t give us a min size, and it stays around 25% overhead with more buffers. But the validation time is small compared to the overall CPU side: ~30 nanoseconds worst case. So always checking at draw time does add overhead, but it’s small because validation overhead is small overall.
- MM: is that written down anywhere?
- IR: I have a spreadsheet.
- MM: could you link to it from the minutes?
- CW: I will make a copy of it and put it in a publicly visible Drive folder. It's here! (didn’t need to make a copy)
- MM: Measuring the % time increase is valuable, but a more valuable metric is the number of draw calls per frame that are possible with options A and B. If both are high enough, we can choose either option.
- IR: sounds okay - Austin?
- AE: yes, we can post the numbers, and figure out whether we can issue e.g. 20K draw calls per frame.
- IR: we’ll follow up.
- MM: I think all we need is to divide 16 ms by the number of draw calls we want to issue. (E.g., a 20K-draw budget leaves 16 ms / 20,000 = 800 ns of CPU time per draw.)
- CW: the numbers for a reasonable amount of buffers are pretty small overhead compared to the existing validation overhead. Seems fine then.
- JG: do these numbers include the option to specify minBufferSize explicitly?
- CW: Yes. The 25% increase: the blue line on the charts is with minBufferSize specified in the BGL, with no draw-time checks; the red line is without specifying it.
- JG: it’s satisfactory for me, given that it’s an option, and that it’s a valuable optimization opportunity for engines. If the numbers were really bad, what would we do?
- CW: we have to do the validation somehow anyhow.
- JG: Yes. The question is: if we give you the option to do less validation, is that as fast as requiring the validation all the time, or as specifying minBufferSize all the time? If the fast path is specifying minBufferSize and there is a slow path without it, then how slow is the slow path?
- MM: third option: instead of validation at BG creation time or draw call time, do it in the shader.
- KR: Doesn't that imply doing programmable vertex pulling?
- KN: This isn’t for vertices but for bindings, and only for backends where we can do freeform accesses into the buffer. It requires a transformation: converting fetches in the shader into fetches into the buffer, and then into the data the shader wants. Structured accesses become unstructured ones, so we can clamp the accesses. Or do you mean a prelude at the top of the shader that checks buffer sizes and exits early?
- JG: it seems in the possibility space, but not as attractive as not requiring minBufferSize, and giving them the tool to further optimize.
- KN: I think this does tell us something useful - requiring minBufferSize is probably not worth it, but it’s still a useful tool.
- JG: I agree.
- DM: The numbers you show say that the per-draw-call validation is almost free: 1/70th of the validation time. But one cost is overlooked: the surprises users may get with per-draw-call validation. They get a new buffer from somewhere and it breaks. minBufferSize prevents that.
- KN: minBufferSize would only move that from command buffer to bind group creation.
- DM: Bind group creation is like creating a resource; it’s not done often. Often it happens at initialization/preparation time.
- MM: The first method gives 32,000 draw calls per frame; the second, 31,000. That’s validation time + CPU time. Not sure that’s the right calculation.
- IR: It might not be valid to compute it that way; the variance in validation time was greater than the draw-call validation time itself.
- MM: if I were to look at these numbers I’d say there’s not much difference.
- KR: what if validation cost goes down in the future?
- AE: what if perf of Vulkan translation gets better? That’s the bulk of the overhead right now.
- MM: Seems to me there is some performance benefit, but not enough of an increase to justify forcing it on developers. Letting the author specify it should be optional but still possible.
- CW: +1 on that.
- AE: me too.
- CW: goes with what DM was saying as well.
- CW: If minBufferSize is optional but present, then getBindGroupLayout should probably give you a minBufferSize. If you use getBindGroupLayout, you’re likely to target a single pipeline with it. It also educates people about the feature.
- DM: what do you mean “it needs to give you that”? It gives you that as a number, or a BGL that has it as a property?
- CW: if you make a BG with a BGL, you’re likely to use it with a single pipeline. So you should give enough data. (...more...)
- AE: About getBGL in certain situations: after you’ve compiled the pipeline without specifying minBufferSize in the BGL, we then know the required size. If you call getBGL, do you get a minBufferSize back?
- CW: I was mistaken - pipeline layout would only be affected if you default the pipeline layout.
- KN: Makes sense. Should we only allow a minBufferSize equal to the size the shader requires? Or do we also want to allow other values: ones between zero and that, or ones above it?
- MM: might as well allow values above it. If they want to be more strict, go ahead.
- CW: can be useful for sharing uniform buffer between shaders.
- KN: what about numbers between 0 and that required by the shader?
- MM: That seems like a bad idea; pipeline creation should fail. 0 should probably also fail: make it an optional int rather than treating 0 as unspecified.
- KN: I’d usually agree but not sure it matters in this case.
- MM: it’s a dictionary field. Could be if you omit the value from the dictionary.
- DM: We want to include at least one element of the unbounded array, so 0 wouldn’t make sense for it.
- MM: mentioning for philosophical rather than practical concerns.
- Resolved: minBufferSize is optional, and 0 is not a special value. If specified, pipeline creation requires minBufferSize >= what the shader uses.
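As a rough sketch of the resolution, a BGL entry would carry the optional size hint like this. This assumes the 2020-era flat entry shape and the field name from #678; both are illustrative rather than final:

```js
// Sketch only: minBufferSize as an optional bind group layout entry field.
const bgl = device.createBindGroupLayout({
  entries: [{
    binding: 0,
    visibility: GPUShaderStage.COMPUTE,
    type: "uniform-buffer",
    // Optional. If present, the buffer size is validated once at
    // createBindGroup() time; if absent, it is validated per draw/dispatch.
    // Pipeline creation fails if this is smaller than what the shader uses.
    minBufferSize: 64,
  }],
});
```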
Development-only API features #814
- JG: There’s been some discussion of extensions that some of us don’t feel confident we can expose to all websites in general, because they may surface fingerprinting or privacy concerns. We know we can’t expose security problems; the acceptable degree of privacy exposure is a more flexible value.
- Pipeline statistics, full high-precision timestamps.
- JG: It sounds like we want some extensions to be development-only (in dev tools). These may never show up on the public web, so readers of the spec should be warned that they may or may not exist and may only be usable in dev tools. To that end, where do we put them, and what do we do with them? If not every UA is interested, does that mean a separate doc? If you never see them in a default WebGPU implementation, do we pull them out of the spec?
- JG: Assuming some of us don’t want to expose pipeline statistics, would a UA not wanting to implement it mean we put it somewhere that is not the main spec?
- RC: Same spec, but annotated somehow?
- JG: I like that idea
- KN: They’re extensions anyway, so they’re always potentially not present. We can add text to say to not expect it to be available on user machines.
- RC: How are synchronous validation errors a privacy concern?
- KN: They’re not. It’s a potential development feature: when there’s a validation error, catch and throw. We’ll probably also have something like “breakpoint on validation error”.
- MM: The TAG has a principle of not exposing whether or not dev tools are open. If we had an API like this, it shouldn’t interact with the other world.
- KN: This issue is about things that do interact with the page’s world. Of course we can always do things that are only in dev tools.
- JG: By “in devtools”, we mean you have to go into devtools and explicitly click a box to say you want it. I think what the TAG is concerned about is if you open devtools and the page detects it.
- MM: Guess it depends on if that checkbox is sticky or not.
- KN: What we’re talking about is something that already exists. For example, there’s a disableCache checkbox. Or, if you set a breakpoint, that timing could be detected.
- MM: Both are not actually observable. It is possible for things to be slow.
- KN: People interested in detecting if devtools are open probably care about 99.9% accuracy, not 100% accuracy.
- MM: I feel more strongly about the last percent than you do.
- JG: Some cats are out of the bag. There are ways to tell if someone is setting breakpoints in your library. Some of the things we’re discussing here are not as bad as that.
- MM: How do you detect that?
- JG: You could introduce out-of-band fetches and check whether things are happening at the right rate. The false-positive rate is probably tolerable: if it took 50 seconds to get between two points, it’s probably fine to flag that.
- KN: that’s what I was thinking.
- BJ: There are a lot of different combinations you can choose in DevTools that surface this information. Chrome and Safari have device emulation: if I see something that says it’s an 8-year-old Android phone, but WebGL says you have a GTX 1080, I can guess it’s DevTools mucking with things. A niche situation, but the point is there are dozens of these you can look for which tell you that DevTools is open. I don’t think you can completely mask this.
- MM: alternative is no new API, but features in the DevTools that surface this information.
- JG: I want to clarify: are we having a meta-discussion about whether this is a blocker for further discussion, or about whether individual UAs should take one stance or another? If we’re trying to convince each other about whether UAs should offer up new extensions when something’s enabled in DevTools, that seems to be a UA concern.
- MM: The answer to that is fuzzy. At least our browser will strive to implement the whole spec, or at least as much of it as we can. We don’t want to have rules about what checkboxes we have in our devtools.
- JG: I don’t want to have spec text that requires those either.
- CW: The concern is that some dev-only extensions could expose more fingerprinting surface, like pipeline statistics queries.
- JG: This might be a good question then. Myles, you said you’d strive to implement the whole spec. One thing that might happen: your UA’s privacy & security teams say not to offer one extension, at least not by default. Does that mean you wouldn’t want that extension in the core spec?
- MM: yes.
- JG: OK, good to know.
- MM: I’m not saying the feature shouldn’t exist. Just that putting something in the spec that makes fingerprinting easier gets a thumbs down.
- KN: The question is more: if for any reason you choose not to implement some extension, would you want us to keep it out of the core spec?
- MM: The difference is the reason. If the reason is fingerprinting, we wouldn’t want it; if it’s because of a missing Metal feature, for example, we’d plan to talk with the Metal team.
- JG: I feel like we’ve made some progress and we can use the remaining time for something else.
PR burndown
- Add validation for GPUDevice.createTexture() #799
- Skipped for this week
- Add BC formats. #812
- Not mergeable because of Travis
- CW: There might be some extensions that can’t be in the core spec for various reasons, but we want most of the extensions to be there.
- MM: just has to be very clear that it’s optional.
- KN: yes, that’s clear.
- CW: Kai suggested we could put things in hideable divs and spans.
- MM: that would be sufficient.
- KN: I’ll look at that someday; it’s probably not super urgent. Fundamentally they’re optional, and devs have to write code to turn these on (see the opt-in sketch below), so it won’t happen by accident. I’ll figure out how to implement this in the spec document at some point.
- JG: The comment here is good but could be better if it said “is available and enabled”. Reiterating that in these cases may help with Myles’ concern and might help in general.
- KN: I agree.
- MM: sounds good.
- CW: I’ll rebase, fix the error.
- MM: if we’re going to press the merge button can we file an issue about clarifying the spec?
- CW: We’re going to have a list of the formats and what they do. Would additional text there, saying these are optional and enabled with this extension, be OK? Or are we blocking this on the hideable divs?
- MM: a separate prose section below the IDL describing what’s optional would be fine.
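For reference, optional features like the BC formats are opt-in at device creation and never turn on by accident. A minimal sketch using the 2020-era extensions field and the extension name from the BC formats work; both are illustrative:

```js
const adapter = await navigator.gpu.requestAdapter();
// Optional extensions must be requested explicitly; a UA that does not
// support one can simply refuse it, per the spec.
const device = await adapter.requestDevice({
  extensions: ["texture-compression-bc"],
});
```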
- Query API: Timestamp Query #771
- CW: There has been lots of discussion about fingerprinting. It shouldn’t matter for the shape of the API.
- MM: for fingerprinting we’d apply the same rules as for CPU timers, just in a compute shader.
- CW: And I think you need a compute shader in most cases anyway.
- MM: at least one of the backends needs a compute shader to convert from ticks to nanoseconds.
- CW: any concern with this PR? Adds writeTimestamp to various encoders.
- JG: think sounds fine.
- RC: fine to me.
- CW: let’s merge this and iterate more.
- MM: looks like this is an optional extension. I thought the group wanted some way to measure time that wasn’t an extension?
- CW: yes, you can get the time it took to run a GPU command buffer.
- MM: That’s work we agree on but that isn’t in the IDL?
- CW: Hao is supposed to take care of proposing something.
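For context, a sketch of the API shape from PR #771: bracket some GPU work with timestamps, then resolve the query set into a buffer for readback. Names follow the PR and are not final; the tick-to-nanosecond conversion mentioned above happens inside the implementation:

```js
const querySet = device.createQuerySet({ type: "timestamp", count: 2 });
const resolveBuffer = device.createBuffer({
  size: 2 * 8,  // two 64-bit timestamp values
  usage: GPUBufferUsage.QUERY_RESOLVE | GPUBufferUsage.COPY_SRC,
});

const encoder = device.createCommandEncoder();
encoder.writeTimestamp(querySet, 0);
// ... encode the GPU work to be measured ...
encoder.writeTimestamp(querySet, 1);
encoder.resolveQuerySet(querySet, 0, 2, resolveBuffer, 0);
device.queue.submit([encoder.finish()]);
```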
- Documenting render pass creation and descriptors #831
- JG: already merged.
- Allow BufferSource in writeBuffer/writeTexture #832
- CW: BufferSource is ArrayBuffer or ArrayBufferView (typed arrays and DataView). There’s a question about whether buffer offsets are in elements or bytes.
- MM: If you write a byte size that is not a multiple of the element size, do you get an error?
- KN: yes.
- MM: Then elements seem better, if that’s both prior art and safer.
- KN: Not exactly prior art. WebGL only allows ArrayBufferViews, which means there is always an element size; BufferSource allows ArrayBuffers as well. We could say the element size for ArrayBuffers is one byte. However, always writing in bytes makes the units match the bufferOffset, and it matches the units of GPU texture-to-buffer and buffer-to-buffer copies, which are also in bytes.
- MM: I understand why buffer-to-buffer is in bytes, but why texture-to-buffer?
- KN / CW: You have the bytes per row and start offset.
- DM: If we don’t allow BufferSource here, can the user always get the ArrayBuffer from the typed array and pass it to us?
- KN: It’s always possible to create an ArrayBufferView from an ArrayBuffer or vice versa, but there are a couple of complications. If you have an ABV and want an AB, you need to pull the buffer and byte offset out of the ABV and then add your own offset to it. In the reverse direction, you create garbage if you don’t already have an ABV.
- JG: think it’s tolerable boilerplate to say it should always be an ArrayBuffer.
- KN: definitely doable, but don’t think there’s a strong reason to do it. Think we should just allow it.
- JG: except for the element stride stuff.
- KN: yes.
- JG: let’s keep discussing this.
- CW: yes, on the issue.
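The boilerplate being weighed above is plain JavaScript. A minimal sketch, assuming the all-bytes interpretation of writeBuffer’s offsets discussed here:

```js
// ArrayBufferView -> ArrayBuffer: pull out the underlying buffer and byte
// offset, then add your own offset on top of byteOffset as needed.
const view = new Float32Array(64);
const { buffer, byteOffset, byteLength } = view;

// ArrayBuffer -> ArrayBufferView: requires allocating a wrapper view
// (the per-call garbage mentioned above).
const bytes = new Uint8Array(buffer, byteOffset, byteLength);

// If writeBuffer accepts BufferSource with byte units throughout, both
// spellings end up equivalent:
device.queue.writeBuffer(gpuBuffer, 0, view);
device.queue.writeBuffer(gpuBuffer, 0, buffer, byteOffset, byteLength);
```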
- Fix handling of storageTextureFormat in createBindGroup #838
- Just a spec bug (review if you want)
- Disallow use of irrelevant GPUBindGroupLayoutEntry members #839
- Merged already (review if you want)
- Remove argument in endOcclusionQuery #840
- Merged already (review if you want)
Agenda for next meeting
- Copying and sampling of depth formats
- Texture format reinterpretation
- PR burndown
- KN: There’s a fair number of topics under Needs-Discussion. Not sure if any need to float to the top for next week.
- CW: I’ll look at Needs-Discussion. Editors, if you want to bubble some up, please add them to the agenda.