Minutes 2018 05 23

GPU Web 2018-05-23

Chair: Corentin

Scribe: Dean

Location: Google Hangout

Minutes from last meeting

TL;DR

Validation rules for implicit barriers
- Sub-resources
  - Track sub-resources (per-slice, per-mip)of textures independently.
    - Need to talk about resource aspects at some point (depth vs. stencil vs. color)
  - Do not track sub-ranges for buffers.
  - Discussion about how texture views are expose (separate object or “child textures” like Metal)
- Write-write hazards
  - Strong desire to prevent write-write hazards in the general case.
  - Difficult point is about UAVs because draw-calls can be reordered but splitting render-passes kills tiling optimizations.

Tentative agenda

More on validation rules for implicit barriers.
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Myles C. Maxfield
Google
- Corentin Wallez
- Victor Miura
Microsoft
- Chas Boyd
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
- Markus Siglreithmaier
Yandex
- Kirill Dmitrenko
Elviss Strazdiņš
Joshua Groves

Admin updates

CW: Reviews of the CLA are still in progress

CW: Hoping that in a couple of weeks everyone is agreed.

Validation rules for implicit barriers

CW: Last time we agreed that we need at least read/write memory hazards validated out. There was the hazard with UAV and sub-resources (mip level) that still needed discussion. Has there been any thoughts?
DM: The latter seems easy. Track them like Vulkan and D3D12. Do range tracking. Instead of working with a single resource, you work with a set.
CW: OK. We thought something similar. As an impl detail, we don’t need to remember things for most textures, because they are just used for sampling. It seems it would be fast enough.
DM: That aligns well with the generic advice we were given from AMD. Work with a small number of resources that can change state.
CW: Do we want to track ranges of buffers separately?
MOZILLA: No.
JG: There isn’t a big advantage to using a single mega-buffer. Just use smaller ones. We could also come back to this later.
CW: Yeah, we could possibly create sub-buffers from buffers, or sub-ranges, later.
MM: The other point to make is that when WebGPU says to make a buffer, we don’t have to always pass it on to the driver. We could reuse an existing buffer, or implement our own suballocator.
MM: In other words, if we require the user to make lots of small buffers, we can handle it ourselves.
CW: At least for impl on Vulkan and D3D, we’ll have to work on our allocator.
RC: I agree.
DM: This would mean that you can’t use half a buffer for reading, and the other half for writing.
CW: Correct. If developers complain, we can go back to it.
KD: I’m worried about performance here if the WebGPU impl isn’t allocating at the time the user asks for it. Streaming content would want allocations to happen instantly.
CW: Good point. Memory allocation will be expensive, but shouldn’t be too bad.
DM: We discussed this during the F2F, and the conclusion was that we’re not providing API for re-allocation. It doesn’t seem straightforward to go any further right now.
CW: We can easily measure this when we have implementations. For Chrome it will never be a problem, but that’s not typical use. We should measure the distribution for allocation times on other implementations.
CW: A user could also create a buffer on a worker.
JG: I don’t think that will help.
JG: If we don’t want to allow sub-view ranges, and if people want a few buffers around, and want to create and destroy them, then we might need to provide API to manage it rather than relying on re-use.
CW: Not quite what I meant. I just suggested that applications could hide that cost by performing it off-thread.
CW: Sub-resources for textures -> YES
CW: Sub-resources for buffers -> NO
MM: What are the rules? Are we adding a new API object?
CW: Per slice and per mip-level. No new API object.
DM: It is still exposed to the user when they have to create a bind-group.
CW: They are exposed inside the creation of ImageViews. You specifiy a base slice, range and mip level.
MM: So we are having a new object ImageView?
CW: Yes. Maybe we’ll have a BufferView in the future, but not for MVP.
MM: If I remember, our threading discussion involved buffer views.
JG: I think that was one of the theories. Immutable views could be sharable.
CW: That’s my recollection too.
MM: So we are having a new API object, it just isn’t tracked for hazards?
MM: What types are in the API?
CW: WebGPUTexture that is the full texture, all slices and mip-levels. When you want to use that texture inside a shader, you create a view. The hazard detection is done per-view.
MM: Is it the texture or the view that is hazard object?
CW: It is the texture itself but the texture view encodes which range of the texture is read from / written to.
DM: In Metal, there isn’t a view, there is a parent texture that is conceptually a view.
CW: Can it be recursive?
MM: Yes.
JG: I think in Metal, a MTLTexture is really a view. You only hold on to views.
CW: We should decide whether we want to explain it this way too.
DM: The other place image resources are specified is copy operations - image to image and buffer to image.
CW: That’s specified when we set up the render targets.
CW: I can’t remember how Metal treats this.
DG: We might have to track this. I can remember instances where we wanted read-only depth but mutable stencil.
MM: Doesn’t seem impossible.
MM: I don’t know how this will be implemented. If it involves a custom shader with swizzling, then that would be bad.
JG: The idea is as little hidden work as possible.
CW: Do we want write/write hazards?
JG: Also no.
JG: I don’t see why you’d remove read/write hazards, but leave write/write.
CW: What if you want to copy from one into multiple?
MM: Mulitiple passes
JG: Or barriers.
JG: You either choose one writer or many readers at the same time.
DM: There are use cases where write/write is important. Accumulation buffers.
MM: We covered this last week. It can be solved by adding another attachment or splitting it up into multiple passes.
CW: I think DM is saying we might want to do something different for UAVs.
DM: I don’t. I want to treat write/write hazards to be allowed.
JG: I think that term is misleading.
CW: I think an example is order-independent transparency.
JG: I get the idea.
[missed a bit]
JG: I just mean that I wouldn’t call this write/write. If you are doing an associative operation, then it isn’t contentious, and should be ok.
MM: Our model is that hazard detection is that it happens at submission time. And that fits Jeff’s model.
CW: My feeling that preventing w/w is ok in most cases. But since starting and stopping render passes is expensive, we shouldn’t insert barriers automatically in render passes but allow the UAV footgun instead.
MM: You example above requires reading from the buffer also. That’s not a w/w hazard, it’s a r/w hazard (that is on atomics so ok)
CW: My feeling is that if they go into UAVs they are getting a footgun.
JG: I would prefer to not do it and just warn in a debug mode. Don’t make the API harder to use just to eliminate half the possible errors.
JG: I think this requires more thought.
MM: Proposals - 1. Disallow 2. Allow everything 3. Add machinery that no-one has thought of yet, to work out what hazards are not real
CW: My proposal is 1 but allowed for UAVs
DM: My proposal is 2 but… (missed this)
DM: (describes why 3 won’t work)
MM: All these lists are in one big buffer. We can’t know if there are any collisions.
CW: You’re asking that the impl doesn’t need to do analysis to work out if there are problems.
CW: I think someone invented an AppendBuffer, but it didn’t work.
DM: They are not truly associative.
CW: Try to give it more thought. Write some more formal definitions of our proposals.
DM: If you exclude UAVs, then your own test case wouldn’t work efficiently anyway.
CW: Right, we should just think about this case.
RC: You’re saying that if you put a bunch of write commands, we don’t check. But as soon as you do a read, then the machinery kicks in.
CW: Inside a render pass I’d like to use the same r/w UAV between draw calls
MM: In a world where D3D can re-order draw calls, and we’re not inserting barriers inside passes, then how would this work?
JG: We’d have to work out what barriers to insert.
CW: You can use UAVs as much as you want inside a path - no hazard detection. And you’re on your own.
CW: These APIs have extremely predictable parts. But fine-grained parallelizism is controlled by the developer.
MM: If we accept this, then we either have to say draw calls inside a path can be re-ordered (which makes it very hard for the author), or the D3D backend will have to insert barriers.
DM: Myles, they can’t be re-ordered.
RC: If two vertex shaders have UAV access, order is not guaranteed. But for pixel access it is.
DM: Outside the UAV it is strictly ordered.
RC: Correct.
MM: WebGPU has to swallow that and make it part of the API, or D3D has to enforce the order.
CW: Even Metal has some parts that do not define an order
CW: Without allowing this, write access to UAVs is basically useless.
MM: Unless you do multiple passes
CW: That’s expensive
CW: (describes tiling GPU and UAV access).
CW: Splitting them up breaks tiling optimizations. I don’t think any API has a way around it.
MM: OK, so we have to go with “we swallow it” and spec it as part of the API
RC: You mean allow unchecked writes. Developer must order them by render pass.
RC: If we can determine that two shaders are writing to the same thing, we insert barriers.
MM: The WebGPU runtime can work out if a resource is read or written in a draw call.
MM: What do you do when there is a conflict?
MM: 1 Insert a barrier
MM: 2. Return an error and make the developer fix it
MM: We proposed 1 right from the start, but people disagreed.
RC: But I think we changed our mind at the F2F.
CW: Inserting a barrier would involve flushing the tiler. We really want one render pass to correspond to one actual hardware pass. It seems like we can’t do this.
RC: What does Metal do? Does it look at the shader and put in a barrier.
MM: Nope. Same as D3D. Anything goes. You put in barriers by making new passes.
MM: Which means to do this, Metal backends would be making more passes.
CW: Outside render passes, Metal does everything it can for you w.r.t barriers. Inside a pass, it does nothing
CW: It even does more and adds barriers inside compute passes, just not in render passes.
MM: The design for the types of problems that hit this is …
DM: I’m not convinced
MM: Right. This isn’t an automatic conversion.
CW: Let’s continue that next week.

ISV/IHV Questions Feedback

CW: Let’s do this next week too.
JG: Just wanted to give people a change to provide feedback.

Agenda for next meeting

Sync rules
Formal definitions of proposal
Gather more feedback (e.g. CW to talk to mrdoob)
Discussion about the testing/programming language

Minutes 2018 05 23

GPU Web 2018-05-23

Minutes from last meeting

TL;DR

Tentative agenda

Attendance

Admin updates

Validation rules for implicit barriers

ISV/IHV Questions Feedback

Agenda for next meeting

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!