Provide a mechanism to persist cache metadata, to improve read latency on recovery.

# Background
SlateDB is adding caching as described in #15 and #9. These caches would reduce read latency. Applications that have a large state and a smaller set of hot keys would benefit from a cache. However, the cache would be empty on recovery, and read latency would remain high until the cache is warm.

# High level proposal
Persisting metadata about the cache and proactively filling the cache on recovery could improve read latency.
SlateDB has immutable SSTs, and each SST has multiple blocks. Upon recovery, the set of SSTs that are part of the DB would be loaded into `db_state`, as described in the [manifest design doc](https://github.com/slatedb/slatedb/blob/main/docs/0001-manifest.md). `(SST Id, Block)` would likely be one of the indexes for the cache. If we persist `[AppId, [CachedBlock(SST Id, Block)]]` as metadata, new writers can optionally use the metadata to asynchronously fill the cache. `AppId` in this case is an opaque ID provided by the user. The same DB could have multiple readers, each with a different `AppId`.
We could extend the [manifest file structure](https://github.com/slatedb/slatedb/blob/main/docs/0001-manifest.md#file-structure) with optional metadata, and store pointers to this metadata in the manifest (rather than the metadata itself) to keep the manifest file manageable. This could belong to `Snapshot`.
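
To make the shape of the proposed metadata concrete, here is a minimal sketch of `[AppId, [CachedBlock(SST Id, Block)]]` and of how a recovering process might derive a prefetch list from it. All names here (`SstId`, `CachedBlock`, `CacheMetadata`, `record`, `warm_up_plan`) are hypothetical and are not existing SlateDB types. Skipping blocks whose SST is no longer part of the DB is an assumption of this sketch, not something the proposal specifies.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical SST identifier; the real SlateDB type may differ.
type SstId = u64;

/// One cached block, identified by its SST and its block index in that SST.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CachedBlock {
    sst_id: SstId,
    block: u32,
}

/// Persisted cache metadata: opaque, user-provided AppId -> cached blocks.
#[derive(Debug, Default)]
struct CacheMetadata {
    by_app: HashMap<String, Vec<CachedBlock>>,
}

impl CacheMetadata {
    /// Record that `app_id` holds `block` in its cache (duplicates ignored).
    fn record(&mut self, app_id: &str, block: CachedBlock) {
        let blocks = self.by_app.entry(app_id.to_string()).or_default();
        if !blocks.contains(&block) {
            blocks.push(block);
        }
    }

    /// Blocks to prefetch on recovery for `app_id`.
    /// Assumption: entries pointing at SSTs that are no longer part of the
    /// DB state are skipped, since there is nothing left to load for them.
    fn warm_up_plan(&self, app_id: &str, live_ssts: &HashSet<SstId>) -> Vec<CachedBlock> {
        self.by_app
            .get(app_id)
            .map(|blocks| {
                blocks
                    .iter()
                    .copied()
                    .filter(|b| live_ssts.contains(&b.sst_id))
                    .collect::<Vec<_>>()
            })
            .unwrap_or_default()
    }
}
```

On recovery, a process would look up its own `AppId`, compute the plan against the SSTs loaded into `db_state`, and fetch those blocks in the background. An unknown `AppId` yields an empty plan, so the warm-up stays optional.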

This issue should cover:
1. Design, including the perf parameters (DB size, key count, hot key count, read latency with and without a warm cache on recovery) and back-of-the-envelope calculations.
2. Implementation.
3. Updates to `db_bench` for the new benchmark.
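
The back-of-the-envelope calculations mentioned in item 1 could start from a helper like the one below. This is only a sketch: the struct, its field names, and the cost model (metadata size and prefetch time both linear in the number of hot blocks) are assumptions made for illustration. It contains no measured values; every input must come from the design work or from benchmarks.

```rust
/// Hypothetical inputs for a rough estimate; all values are caller-supplied.
struct WarmUpEstimate {
    /// Number of blocks tracked in the persisted cache metadata.
    hot_blocks: u64,
    /// Size of one block in bytes.
    block_size_bytes: u64,
    /// Serialized size of one `(SST Id, Block)` metadata entry in bytes.
    metadata_entry_bytes: u64,
    /// Sustained fetch throughput from object storage, in bytes per second.
    fetch_bytes_per_sec: u64,
}

impl WarmUpEstimate {
    /// Approximate size of the persisted metadata for one AppId.
    fn metadata_bytes(&self) -> u64 {
        self.hot_blocks * self.metadata_entry_bytes
    }

    /// Total bytes that must be fetched to refill the cache.
    fn prefetch_bytes(&self) -> u64 {
        self.hot_blocks * self.block_size_bytes
    }

    /// Approximate time to refill the cache, ignoring per-request latency.
    fn warm_up_secs(&self) -> f64 {
        self.prefetch_bytes() as f64 / self.fetch_bytes_per_sec as f64
    }
}
```

Comparing `metadata_bytes` against the manifest size indicates whether keeping only pointers in the manifest is worthwhile, and `warm_up_secs` gives a lower bound on how long reads may still miss the cache after recovery.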




