Minutes 2018 05 23
Corentin Wallez edited this page Jan 4, 2022
·
1 revision
Chair: Corentin
Scribe: Dean
Location: Google Hangout
- Validation rules for implicit barriers
- Sub-resources
- Track sub-resources (per-slice, per-mip) of textures independently.
- Need to talk about resource aspects at some point (depth vs. stencil vs. color)
- Do not track sub-ranges for buffers.
- Discussion about how texture views are exposed (separate object or “child textures” like Metal)
- Write-write hazards
- Strong desire to prevent write-write hazards in the general case.
- Difficult point is about UAVs because draw-calls can be reordered but splitting render-passes kills tiling optimizations.
- More on validation rules for implicit barriers.
- Agenda for next meeting
- Apple
- Dean Jackson
- Myles C. Maxfield
- Google
- Corentin Wallez
- Victor Miura
- Microsoft
- Chas Boyd
- Rafael Cintron
- Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
- Markus Siglreithmaier
- Yandex
- Kirill Dmitrenko
- Elviss Strazdiņš
- Joshua Groves
CW: Reviews of the CLA are still in progress
CW: Hoping that in a couple of weeks everyone is agreed.
- CW: Last time we agreed that we need at least read/write memory hazards validated out. There was the hazard with UAV and sub-resources (mip level) that still needed discussion. Have there been any thoughts?
- DM: The latter seems easy. Track them like Vulkan and D3D12. Do range tracking. Instead of working with a single resource, you work with a set.
- CW: OK. We thought something similar. As an impl detail, we don’t need to remember things for most textures, because they are just used for sampling. It seems it would be fast enough.
- DM: That aligns well with the generic advice we were given from AMD. Work with a small number of resources that can change state.
- CW: Do we want to track ranges of buffers separately?
- MOZILLA: No.
- JG: There isn’t a big advantage to using a single mega-buffer. Just use smaller ones. We could also come back to this later.
- CW: Yeah, we could possibly create sub-buffers from buffers, or sub-ranges, later.
- MM: The other point to make is that when WebGPU says to make a buffer, we don’t have to always pass it on to the driver. We could reuse an existing buffer, or implement our own suballocator.
- MM: In other words, if we require the user to make lots of small buffers, we can handle it ourselves.
- CW: At least for impl on Vulkan and D3D, we’ll have to work on our allocator.
- RC: I agree.
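A minimal sketch of the suballocation idea MM describes, where the implementation carves small buffers out of one larger backing buffer instead of asking the driver each time. Everything here is an assumption for illustration: the names `SubAllocator` and `Allocation`, the first-fit strategy, and the default 256-byte alignment are hypothetical and do not come from the discussion or from any WebGPU API.

```typescript
// Hypothetical sketch: a first-fit suballocator over one large backing buffer.
// None of these names are part of WebGPU; the 256-byte alignment is an assumption.
interface Allocation {
  offset: number; // byte offset inside the backing buffer
  size: number;   // size in bytes
}

class SubAllocator {
  // Free ranges, kept sorted by offset and non-overlapping.
  private free: Allocation[];

  constructor(capacity: number) {
    this.free = [{ offset: 0, size: capacity }];
  }

  // Returns the first free range that fits, or null if none is large enough.
  allocate(size: number, alignment: number = 256): Allocation | null {
    for (let i = 0; i < this.free.length; i++) {
      const range = this.free[i];
      const aligned = Math.ceil(range.offset / alignment) * alignment;
      const padding = aligned - range.offset;
      if (range.size < padding + size) continue;

      // Keep the alignment padding and the tail of the range as free space.
      const pieces: Allocation[] = [];
      if (padding > 0) pieces.push({ offset: range.offset, size: padding });
      const rest = range.size - padding - size;
      if (rest > 0) pieces.push({ offset: aligned + size, size: rest });
      this.free.splice(i, 1, ...pieces);
      return { offset: aligned, size };
    }
    return null;
  }

  // Returns a range to the free list and merges adjacent free ranges.
  release(allocation: Allocation): void {
    this.free.push({ offset: allocation.offset, size: allocation.size });
    this.free.sort((x, y) => x.offset - y.offset);
    const merged: Allocation[] = [];
    for (const r of this.free) {
      const last = merged[merged.length - 1];
      if (last && last.offset + last.size === r.offset) last.size += r.size;
      else merged.push({ offset: r.offset, size: r.size });
    }
    this.free = merged;
  }

  freeBytes(): number {
    return this.free.reduce((sum, r) => sum + r.size, 0);
  }
}
```

With a scheme like this, creating many small buffers would not have to translate into many driver allocations, which is the point MM raises. The cost KD asks about would still appear whenever the backing buffer itself has to grow.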
- DM: This would mean that you can’t use half a buffer for reading, and the other half for writing.
- CW: Correct. If developers complain, we can go back to it.
- KD: I’m worried about performance here if the WebGPU impl isn’t allocating at the time the user asks for it. Streaming content would want allocations to happen instantly.
- CW: Good point. Memory allocation will be expensive, but shouldn’t be too bad.
- DM: We discussed this during the F2F, and the conclusion was that we’re not providing API for re-allocation. It doesn’t seem straightforward to go any further right now.
- CW: We can easily measure this when we have implementations. For Chrome it will never be a problem, but that’s not typical use. We should measure the distribution for allocation times on other implementations.
- CW: A user could also create a buffer on a worker.
- JG: I don’t think that will help.
- JG: If we don’t want to allow sub-view ranges, and if people want a few buffers around, and want to create and destroy them, then we might need to provide API to manage it rather than relying on re-use.
- CW: Not quite what I meant. I just suggested that applications could hide that cost by performing it off-thread.
- CW: Sub-resources for textures -> YES
- CW: Sub-resources for buffers -> NO
- MM: What are the rules? Are we adding a new API object?
- CW: Per slice and per mip-level. No new API object.
- DM: It is still exposed to the user when they have to create a bind-group.
- CW: They are exposed inside the creation of ImageViews. You specify a base slice, range and mip level.
- MM: So we are having a new object ImageView?
- CW: Yes. Maybe we’ll have a BufferView in the future, but not for MVP.
- MM: If I remember, our threading discussion involved buffer views.
- JG: I think that was one of the theories. Immutable views could be sharable.
- CW: That’s my recollection too.
- MM: So we are having a new API object, it just isn’t tracked for hazards?
- MM: What types are in the API?
- CW: WebGPUTexture that is the full texture, all slices and mip-levels. When you want to use that texture inside a shader, you create a view. The hazard detection is done per-view.
- MM: Is it the texture or the view that is the hazard object?
- CW: It is the texture itself but the texture view encodes which range of the texture is read from / written to.
- DM: In Metal, there isn’t a view, there is a parent texture that is conceptually a view.
- CW: Can it be recursive?
- MM: Yes.
- JG: I think in Metal, a MTLTexture is really a view. You only hold on to views.
- CW: We should decide whether we want to explain it this way too.
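The model CW describes, where the texture is the hazard object and a view only names a range of it (base slice, slice range, mip level), could be sketched as below. This is an illustration under assumptions, not an agreed design: `TextureView`, `TextureHazardTracker` and the notion of a single "scope" that is reset between passes or submissions are all hypothetical names. Only read/write conflicts are rejected, since write/write is still open in these minutes.

```typescript
// Hypothetical sketch of per-sub-resource read/write hazard tracking.
// Names are illustrative only and are not part of the WebGPU API.
type Access = "read" | "write";

interface TextureView {
  textureId: number;  // identifies the full texture (the hazard object)
  baseSlice: number;  // first array slice covered by the view
  sliceCount: number; // number of slices covered
  mipLevel: number;   // single mip level covered
}

class TextureHazardTracker {
  // Maps "texture:slice:mip" to the access recorded in the current scope.
  private usage = new Map<string, Access>();

  // Records the use and returns true, or returns false and records nothing
  // if any covered sub-resource was already used with the other access kind.
  use(view: TextureView, access: Access): boolean {
    const keys: string[] = [];
    for (let s = view.baseSlice; s < view.baseSlice + view.sliceCount; s++) {
      keys.push(`${view.textureId}:${s}:${view.mipLevel}`);
    }
    for (const key of keys) {
      const previous = this.usage.get(key);
      if (previous !== undefined && previous !== access) return false;
    }
    for (const key of keys) this.usage.set(key, access);
    return true;
  }

  // Starts a new scope, for example at a pass or submission boundary.
  reset(): void {
    this.usage.clear();
  }
}
```

Two views of the same texture conflict only where their slices and mip level overlap, so sampling mip 0 while writing mip 1 passes validation. Buffers would need a single entry each, since the group agreed not to track buffer sub-ranges.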
- DM: The other place image resources are specified is copy operations - image to image and buffer to image.
- CW: That’s specified when we set up the render targets.
- CW: I can’t remember how Metal treats this.
- DG: We might have to track this. I can remember instances where we wanted read-only depth but mutable stencil.
- MM: Doesn’t seem impossible.
- MM: I don’t know how this will be implemented. If it involves a custom shader with swizzling, then that would be bad.
- JG: The idea is as little hidden work as possible.
- CW: Do we want write/write hazards?
- JG: Also no.
- JG: I don’t see why you’d remove read/write hazards, but leave write/write.
- CW: What if you want to copy from one into multiple?
- MM: Multiple passes
- JG: Or barriers.
- JG: You either choose one writer or many readers at the same time.
- DM: There are use cases where write/write is important. Accumulation buffers.
- MM: We covered this last week. It can be solved by adding another attachment or splitting it up into multiple passes.
- CW: I think DM is saying we might want to do something different for UAVs.
- DM: I don’t. I want write/write hazards to be allowed.
- JG: I think that term is misleading.
- CW: I think an example is order-independent transparency.
- JG: I get the idea.
- [missed a bit]
- JG: I just mean that I wouldn’t call this write/write. If you are doing an associative operation, then it isn’t contentious, and should be ok.
- MM: Our model is that hazard detection happens at submission time. And that fits Jeff’s model.
- CW: My feeling is that preventing w/w is ok in most cases. But since starting and stopping render passes is expensive, we shouldn’t insert barriers automatically in render passes but allow the UAV footgun instead.
- MM: Your example above requires reading from the buffer also. That’s not a w/w hazard, it’s a r/w hazard (that is on atomics so ok)
- CW: My feeling is that if they go into UAVs they are getting a footgun.
- JG: I would prefer to not do it and just warn in a debug mode. Don’t make the API harder to use just to eliminate half the possible errors.
- JG: I think this requires more thought.
- MM: Proposals - 1. Disallow 2. Allow everything 3. Add machinery that no-one has thought of yet, to work out what hazards are not real
- CW: My proposal is 1 but allowed for UAVs
- DM: My proposal is 2 but… (missed this)
- DM: (describes why 3 won’t work)
- MM: All these lists are in one big buffer. We can’t know if there are any collisions.
- CW: You’re asking that the impl doesn’t need to do analysis to work out if there are problems.
- CW: I think someone invented an AppendBuffer, but it didn’t work.
- DM: They are not truly associative.
- CW: Try to give it more thought. Write some more formal definitions of our proposals.
- DM: If you exclude UAVs, then your own test case wouldn’t work efficiently anyway.
- CW: Right, we should just think about this case.
- RC: You’re saying that if you put a bunch of write commands, we don’t check. But as soon as you do a read, then the machinery kicks in.
- CW: Inside a render pass I’d like to use the same r/w UAV between draw calls
- MM: In a world where D3D can re-order draw calls, and we’re not inserting barriers inside passes, then how would this work?
- JG: We’d have to work out what barriers to insert.
- CW: You can use UAVs as much as you want inside a pass - no hazard detection. And you’re on your own.
- CW: These APIs have extremely predictable parts. But fine-grained parallelism is controlled by the developer.
- MM: If we accept this, then we either have to say draw calls inside a pass can be re-ordered (which makes it very hard for the author), or the D3D backend will have to insert barriers.
- DM: Myles, they can’t be re-ordered.
- RC: If two vertex shaders have UAV access, order is not guaranteed. But for pixel access it is.
- DM: Outside the UAV it is strictly ordered.
- RC: Correct.
- MM: WebGPU has to swallow that and make it part of the API, or D3D has to enforce the order.
- CW: Even Metal has some parts that do not define an order
- CW: Without allowing this, write access to UAVs is basically useless.
- MM: Unless you do multiple passes
- CW: That’s expensive
- CW: (describes tiling GPU and UAV access).
- CW: Splitting them up breaks tiling optimizations. I don’t think any API has a way around it.
- MM: OK, so we have to go with “we swallow it” and spec it as part of the API
- RC: You mean allow unchecked writes. Developer must order them by render pass.
- RC: If we can determine that two shaders are writing to the same thing, we insert barriers.
- MM: The WebGPU runtime can work out if a resource is read or written in a draw call.
- MM: What do you do when there is a conflict?
- MM: 1 Insert a barrier
- MM: 2. Return an error and make the developer fix it
- MM: We proposed 1 right from the start, but people disagreed.
- RC: But I think we changed our mind at the F2F.
- CW: Inserting a barrier would involve flushing the tiler. We really want one render pass to correspond to one actual hardware pass. It seems like we can’t do this.
- RC: What does Metal do? Does it look at the shader and put in a barrier?
- MM: Nope. Same as D3D. Anything goes. You put in barriers by making new passes.
- MM: Which means to do this, Metal backends would be making more passes.
- CW: Outside render passes, Metal does everything it can for you w.r.t. barriers. Inside a pass, it does nothing.
- CW: It even does more and adds barriers inside compute passes, just not in render passes.
- MM: The design for the types of problems that hit this is …
- DM: I’m not convinced
- MM: Right. This isn’t an automatic conversion.
- CW: Let’s continue that next week.
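The write/write options on the table (MM's proposals 1 and 2, plus CW's variant of proposal 1 that exempts UAVs) can be written down as a small policy check. This is only a sketch to make the difference between the options concrete. The type and function names are hypothetical, "storage" is used here as a stand-in for a UAV-style binding, and no decision in this direction was recorded.

```typescript
// Hypothetical sketch of the write/write policies under discussion.
// None of these names come from the WebGPU API.
type WriteWritePolicy =
  | "disallow"               // proposal 1: reject all write/write overlap
  | "allow"                  // proposal 2: allow everything
  | "disallowExceptStorage"; // CW's variant: proposal 1, but allowed for UAVs

interface ResourceUse {
  resourceId: number;
  writes: boolean;    // true if this use can write the resource
  isStorage: boolean; // true for a UAV-style (storage) binding
}

// Returns true if uses `a` and `b` may coexist in one pass under `policy`.
// Only the write/write case is decided here; read/write validation is separate.
function writeWriteAllowed(
  policy: WriteWritePolicy,
  a: ResourceUse,
  b: ResourceUse
): boolean {
  const isWriteWrite = a.resourceId === b.resourceId && a.writes && b.writes;
  if (!isWriteWrite) return true;
  switch (policy) {
    case "disallow":
      return false;
    case "allow":
      return true;
    case "disallowExceptStorage":
      // Both writers must be storage bindings; the developer owns the ordering.
      return a.isStorage && b.isStorage;
  }
}
```

Under `"disallowExceptStorage"`, two draw calls writing the same UAV in one render pass pass validation without any barrier being inserted, which keeps the pass intact for tilers but leaves the ordering to the developer.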
- CW: Let’s do this next week too.
- JG: Just wanted to give people a chance to provide feedback.
- Sync rules
- Formal definitions of proposal
- Gather more feedback (e.g. CW to talk to mrdoob)
- Discussion about the testing/programming language