This demonstrates using DirectStorage in a bulk-load scenario by loading multiple models and measuring the time and CPU usage while loading.
This demo is based on MiniEngine - making use of its glTF 2.0 support.
Install Visual Studio 2019 or higher.
Open the following Visual Studio solution and build it:
Samples\BulkLoadDemo\BulkLoadDemo.sln
BulkLoadDemo [-dir <directory>] [-model <filename>] [-gpu-decompression {0|1}] [-debug {0|1}]
The demo can operate in one of three modes:
- Default: by default, any .marc files in the same directory as the executable are loaded. Multiple instances of these are loaded to ensure that there is enough work to do.
- Single Model: the -model command-line argument can be used to specify a single .marc file to load. This mode is mostly useful for debugging a single model.
- Directory: the -dir command-line argument can specify a directory. All .marc files within this directory are candidates for loading.
GPU decompression is enabled by default, but it can be explicitly enabled or disabled using the -gpu-decompression argument.
The D3D12 debug layer can be explicitly enabled or disabled using the -debug argument.
Close the window, or press Escape, to exit the demo.
MiniEngine uses .mini files to serialize data from a .gltf file. This demo uses Mini Archive (.marc) files, which contain the serialized data as well as the textures required for a .gltf file. .marc files can be generated using the MiniArchive tool.
MiniArchive [-gdeflate|-zlib] [-stagingbuffersize=X] [-bc] source.gltf dest.marc
Assets can be compressed using GDeflate or Zlib. Since individual DirectStorage requests cannot use more than the staging buffer size, MiniArchive needs to know when it must break a single request into multiple requests. The -stagingbuffersize argument controls this. The default is 256 MiB (which is what BulkLoadDemo sets the staging buffer size to).
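As a rough illustration of the splitting described above, the sketch below breaks a byte range into consecutive pieces that each fit within a given staging buffer size. The `Chunk` type and `SplitIntoChunks` function are hypothetical names invented for this example; they are not taken from MiniArchive, and how MiniArchive actually measures and splits its requests may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one piece of a larger read (not a DirectStorage type).
struct Chunk
{
    uint64_t offset; // byte offset of this piece within the asset
    uint64_t size;   // byte count, never larger than the staging buffer size
};

// Splits totalSize bytes into consecutive chunks of at most stagingBufferSize bytes.
inline std::vector<Chunk> SplitIntoChunks(uint64_t totalSize, uint64_t stagingBufferSize)
{
    assert(stagingBufferSize > 0);

    std::vector<Chunk> chunks;
    for (uint64_t offset = 0; offset < totalSize; offset += stagingBufferSize)
    {
        // The last chunk holds whatever remains, which may be smaller than the limit.
        chunks.push_back({offset, std::min(stagingBufferSize, totalSize - offset)});
    }
    return chunks;
}
```

With this sketch, an asset that is slightly larger than the staging buffer becomes two chunks: one full-sized and one holding the remainder. Each chunk would then be written (and later requested) separately.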
Passing -bc will cause the textures to be converted to one of the BCn formats.
Also included is a PowerShell script, convert.ps1. This is handy for converting all gltf files under a particular directory. It assumes that the Release build of MiniArchive.exe has been built. Usage:
./convert.ps1 <srcDir> <destDir> <additional arguments...>
The additional arguments are passed directly to MiniArchive. The filenames are generated from the GLTF name, with additional disambiguation added if there are multiple GLTF files with the same name in different directories. Example usage:
./convert.ps1 c:\gltfs c:\marcs -gdeflate -bc
This will find all of the gltf files under c:\gltfs and convert them to .marc files under c:\marcs. The files will be compressed using GDeflate and the textures will be BCn compressed.
Some sample assets can be found at https://github.com/KhronosGroup/glTF-Sample-Models/tree/master/2.0. Note, however, that some of these samples use features that are not supported by BulkLoadDemo and can sometimes result in Device Removals. There are many collections of free GLTF files available across the internet.
This demo can be seen as an exercise in converting an existing code base to one that makes good use of DirectStorage.
MiniEngine already had support for reading glTF 2.0 files, and for loading/saving a binary serialized version of it. Details of this file format can be found in Model/ModelLoader.h. This provided a good starting point when designing the marc file format.
See BulkLoadDemo/MarcFileFormat.h for the details of the file format.
Some considerations for the design:
- Design to avoid dependent reads: DirectStorage works best when there are many outstanding IO operations. For this reason, the file format is designed so that as much information as possible is already available. For example, struct Header contains everything needed to load the UnstructuredGpuData, CpuMetaData and CpuData.
- Include textures in the same file as the rest of the model: as a design point, all of the assets for a single glTF file are stored inside a single file. This makes the files easier to manage and avoids needing to access the file system to locate textures. Texture names are only stored for debugging purposes; textures are instead referenced by index. See struct TextureMetadata and struct Material.
- Separate out data destined for system memory and data destined for different types of GPU memory. This allows us to use the appropriate DirectStorage request destination types for loading each bit of data. See struct Header and the various regions within it.
- Arrange textures in the file according to GetCopyableFootprints, making them suitable for loading using a DSTORAGE_REQUEST_DESTINATION_MULTIPLE_SUBRESOURCES request. Although this means including padding in the file, the padding compresses really well. See WriteTexture in MiniArchive/main.cpp.
- Separate out fixed-size vs variable-size data, minimizing the amount of fixed-size data. Fixed-size data needs to be loaded uncompressed and needs to contain at least enough information to know how to load the compressed data. This is why struct Header is separate from the other structs.
- Centralize metadata about textures, arranged so that information required for memory / descriptor allocation can be stored without needing to load the entire model. See struct CpuMetadataHeader for this; note that TextureDescs is arranged in a format suitable for passing directly to ID3D12Device4::GetResourceAllocationInfo1.
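To make the first and fifth considerations more concrete, here is a minimal sketch of a small, fixed-size, uncompressed header that describes every variable-size region. The field names, the `Region` type and the validation helpers are all assumptions made for illustration; the real layout is defined in BulkLoadDemo/MarcFileFormat.h and will differ.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical layout for illustration only; see MarcFileFormat.h for the real one.
struct Region
{
    uint64_t offset;           // where the region starts in the file
    uint32_t compressedSize;   // bytes stored on disk
    uint32_t uncompressedSize; // bytes after decompression
};

struct Header
{
    uint32_t magic;   // hypothetical file identifier
    uint32_t version; // hypothetical format version
    Region unstructuredGpuData;
    Region cpuMetadata;
    Region cpuData;
};

// The header is read as raw bytes, so it must not need construction or fixups.
static_assert(std::is_trivially_copyable<Header>::value, "Header must be plain data");

// Checks that a region lies entirely within the file, without overflowing.
inline bool IsRegionInsideFile(const Region& r, uint64_t fileSize)
{
    return r.offset <= fileSize && r.compressedSize <= fileSize - r.offset;
}

// Once this passes, every region can be requested without any further reads.
inline bool ValidateHeader(const Header& h, uint32_t expectedMagic,
                           uint32_t expectedVersion, uint64_t fileSize)
{
    return h.magic == expectedMagic
        && h.version == expectedVersion
        && IsRegionInsideFile(h.unstructuredGpuData, fileSize)
        && IsRegionInsideFile(h.cpuMetadata, fileSize)
        && IsRegionInsideFile(h.cpuData, fileSize);
}
```

The point of the sketch is that a single small read yields the offset and sizes of every other region, so no region's location depends on first decompressing another region.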
While some amount of thought upfront helped with the initial design, various changes were made as the loading code was written. For example, separating TextureDescs from the rest of the TextureMetadata happened while writing the loading code.
MiniArchive's implementation is relatively straightforward. It works by first asking MiniEngine's Model code to generate ModelData for a glTF asset. It then collects this data, and the textures, to write out the final archive file.
There are three main classes involved in this demo. These classes divide the responsibility like this:
- BulkLoadDemo - this has the main render/update loop in it, as well as the overall logic for the demo. On startup, BulkLoadDemo determines which files are going to take part in the demo and adds them to MarcFileManager. Then, every 10 seconds, it unloads the current set of files, picks the next set to load and loads them.
- MarcFileManager - this manages all the MarcFiles, keeping track of the overall state of loading a set of files - such as whether or not all the files are ready to load, or if the current set of files has been loaded. This object also owns the single large ID3D12Heap that all the Model's resources are placed in.
- MarcFile - this is a single MarcFile. This stores the metadata for the file along with the Model instance if the file is loaded.
On startup, BulkLoadDemo tells MarcFileManager about files that might be loaded by calling MarcFileManager::Add. This creates a new MarcFile instance and calls MarcFile::StartMetadataLoad on it, which then issues a single request to load the header. A Win32 event is used to detect when this load has completed, and a threadpool wait is used to run a callback when this happens.
Note: there's potential for improving this. As it is now, MarcFile is pretty standalone. However, if we know that it will always be loaded as part of a set of other MarcFiles, then we could have a single event to indicate that all metadata has been loaded, instead of having one per file.
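One way to picture the idea in the note is a shared countdown that fires a single notification when the last file in a set finishes. The sketch below uses a portable std::atomic counter in place of a Win32 event, purely for illustration; `CompletionCounter` is a hypothetical helper and is not part of the demo.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper: runs one callback after `count` loads have reported completion.
class CompletionCounter
{
public:
    CompletionCounter(std::size_t count, std::function<void()> onAllDone)
        : m_remaining(count), m_onAllDone(std::move(onAllDone))
    {
        assert(count > 0);
    }

    // Called once per finished load; may be called from different threads.
    void NotifyOne()
    {
        // fetch_sub returns the previous value, so 1 means this was the last load.
        if (m_remaining.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            m_onAllDone();
        }
    }

private:
    std::atomic<std::size_t> m_remaining;
    std::function<void()> m_onAllDone;
};
```

In this arrangement each file would call NotifyOne when its metadata is ready, and only the last caller triggers the "all metadata loaded" work.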
Once the header has completed loading it can be validated and then the CPU metadata region can be loaded. This region is variable size, and compressed, so we need the data in the header in order to load it.
When the CPU metadata has finished loading it is fixed up (offsets are converted to pointers) and some device-specific calculations are performed (e.g. getting the resource allocation information for the textures and creating a CPU-visible descriptor heap upfront).
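The offset-to-pointer fixup can be sketched roughly as follows. This is a variant that resolves a stored offset into a separate view instead of patching the data in place, and it adds bounds and alignment checks. `DiskArray`, `ArrayView` and `FixupArray` are hypothetical names for this example; the demo's own fixup code is not shown in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical on-disk reference to an array stored elsewhere in the same region.
struct DiskArray
{
    uint64_t offset; // byte offset from the start of the loaded region
    uint64_t count;  // number of elements
};

// The in-memory form after fixup: a real pointer plus the element count.
template <typename T>
struct ArrayView
{
    T* data;
    uint64_t count;
};

// Converts an offset into a pointer into the loaded region.
// Returns false if the array would be misaligned or fall outside the region.
template <typename T>
bool FixupArray(unsigned char* base, std::size_t regionSize,
                const DiskArray& onDisk, ArrayView<T>& out)
{
    static_assert(std::is_trivially_copyable<T>::value, "T must be plain data");

    if (onDisk.offset > regionSize)
        return false;
    if (onDisk.count > (regionSize - onDisk.offset) / sizeof(T))
        return false;

    unsigned char* address = base + onDisk.offset;
    if (reinterpret_cast<std::uintptr_t>(address) % alignof(T) != 0)
        return false;

    out.data = reinterpret_cast<T*>(address);
    out.count = onDisk.count;
    return true;
}
```

Validating offsets at fixup time means a truncated or corrupt file is rejected before any resolved pointer is dereferenced.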
MarcFileManager provides an ID for each file that is added to it; BulkLoadDemo stores a vector of these IDs. When it is time to load a new set, this vector is shuffled and passed to MarcFileManager::SetNextSet.
SetNextSet then calls TryStartLoad for each file. This will determine if there's enough room in the heap for all the model's GPU data. Regions in the heap and the descriptor heap are then allocated, and MarcFile::StartContentLoad begins the content loading process.
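The "is there enough room in the heap" check could be implemented in many ways. The sketch below assumes a simple linear (bump) allocator that is reset whenever the current set of files is unloaded. This is an assumption made for illustration: `LinearHeapAllocator` is a hypothetical class, and the allocation strategy MarcFileManager actually uses is not described here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical linear allocator that hands out offsets within a fixed-size heap.
class LinearHeapAllocator
{
public:
    explicit LinearHeapAllocator(uint64_t capacity) : m_capacity(capacity) {}

    // On success, writes the aligned offset and returns true.
    // On failure, returns false and leaves the allocator unchanged.
    bool TryAllocate(uint64_t size, uint64_t alignment, uint64_t& outOffset)
    {
        // Alignment must be a non-zero power of two for the mask below to work.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

        const uint64_t aligned = (m_next + alignment - 1) & ~(alignment - 1);
        if (aligned > m_capacity || size > m_capacity - aligned)
            return false;

        outOffset = aligned;
        m_next = aligned + size;
        return true;
    }

    // Frees everything at once, e.g. after the current set has been unloaded.
    void Reset() { m_next = 0; }

    uint64_t Used() const { return m_next; }

private:
    uint64_t m_capacity;
    uint64_t m_next = 0;
};
```

A failed TryAllocate corresponds to the case where a file does not fit in the remaining heap space and so cannot be part of the current set.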
Content load can immediately issue all the requests required to load the CPU data, unstructured GPU data and textures. This essentially becomes two batches (one for the system memory queue and another for the GPU queue).
Once both the CPU and GPU data have finished loading, a final round of fixups can be applied.
Before unloading a model we need to be sure that the model's resources are no longer in use by the GPU. Once we can be certain that the GPU isn't / won't be referencing these resources we can release them.
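A common way to defer release until the GPU is done is to tag each pending release with a fence value and only run it once the GPU has reached that value. The sketch below models this with plain integers so that it stays portable; in a D3D12 application the completed value would typically come from ID3D12Fence::GetCompletedValue. `DeferredReleaseQueue` is a hypothetical class, and the demo's actual mechanism may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical queue of releases that must wait for the GPU to pass a fence value.
class DeferredReleaseQueue
{
    struct Entry
    {
        uint64_t fenceValue;           // GPU must reach this value first
        std::function<void()> release; // frees the resources
    };

public:
    // fenceValue is the value that will be signaled after the last GPU work
    // that may reference the resources. Values must be enqueued in order.
    void Enqueue(uint64_t fenceValue, std::function<void()> release)
    {
        assert(m_pending.empty() || fenceValue >= m_pending.back().fenceValue);
        m_pending.push_back({fenceValue, std::move(release)});
    }

    // Runs every release whose fence value has been reached.
    // Returns the number of releases that ran.
    std::size_t Process(uint64_t completedFenceValue)
    {
        std::size_t released = 0;
        while (!m_pending.empty() && m_pending.front().fenceValue <= completedFenceValue)
        {
            m_pending.front().release();
            m_pending.pop_front();
            ++released;
        }
        return released;
    }

private:
    std::deque<Entry> m_pending;
};
```

Calling Process once per frame with the latest completed fence value keeps the release work off the critical path and avoids blocking the CPU on the GPU.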
DirectStorage does not natively support ZLib. Instead, this demo uses the custom decompression feature to integrate ZLib decompression. See BulkLoadDemo/DStorageLoader.cpp for details on how custom decompression is implemented in this demo.
Below is an annotated screenshot of a PIX timing capture taken of BulkLoadDemo starting up, loading a number of GDeflate compressed assets.