
Create an in-memory block cache #15

Closed
criccomini opened this issue Apr 23, 2024 · 13 comments · Fixed by #137
@criccomini
Collaborator

Frequently accessed SST blocks should be cached in memory. This should dramatically improve read performance. Let's start with a simple LRU cache. See mini-lsm for an example.
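
For illustration, here is a rough sketch of the read path this implies. The names below (`BlockCache`, `SsTableId`, `fetch_block_from_object_store`) are placeholders for this sketch, not SlateDB's actual API:

```rust
use std::sync::Arc;

// Illustrative types; the real SlateDB equivalents may differ.
type SsTableId = u64;
type BlockId = u64;

struct Block {
    data: Vec<u8>,
}

// Any LRU-style map keyed by (sst id, block id) works here.
trait BlockCache {
    fn get(&self, key: &(SsTableId, BlockId)) -> Option<Arc<Block>>;
    fn insert(&self, key: (SsTableId, BlockId), block: Arc<Block>);
}

// On a read, consult the cache first and only fall back to object storage on a miss.
fn read_block(cache: &dyn BlockCache, sst: SsTableId, block: BlockId) -> Arc<Block> {
    if let Some(hit) = cache.get(&(sst, block)) {
        return hit; // served from memory, no object-store round trip
    }
    let fetched = Arc::new(fetch_block_from_object_store(sst, block));
    cache.insert((sst, block), fetched.clone());
    fetched
}

// Placeholder for the (slow) object-store read.
fn fetch_block_from_object_store(_sst: SsTableId, _block: BlockId) -> Block {
    Block { data: Vec::new() }
}
```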

@criccomini criccomini added the enhancement New feature or request label Apr 23, 2024
@gesalous

I have some preliminary ideas regarding issues #9 and #15 that I would like to discuss.
Specifically, is SlateDB intended to target VM instances with local ephemeral NVMe storage that has high IOPS and capacity, or should it also work with VM instances whose ephemeral storage has low IOPS and capacity?

Since I am new to AWS pricing and instance specifications, I would appreciate some insight into the capabilities of ephemeral storage on AWS VM instances.
Could anyone share detailed information on:

  • The IOPS performance,
  • Throughput, and
  • Capacity of these NVMe SSDs?

Thanks

@criccomini
Collaborator Author

My opinion is that slatedb should be built to work well in both scenarios. If NVMe (or local SSD) is available, it should be leveraged to decrease latency (without sacrificing consistency) by caching SSTs locally on disk (#9). Similarly, slatedb should use memory to cache frequently accessed SST blocks.

For SST disk caching, I think we'll want to expose a config to define how much space to allocate to the disk cache. This is similar to how RocksDB-cloud works with their persistent secondary cache (PSC). The in-memory cache (this GH issue) should function similarly. Indeed, this is how RocksDB's own block cache works:

Block cache is where RocksDB caches data in memory for reads. User can pass in a Cache object to a RocksDB instance with a desired capacity (size).

Users determine the size.

This will allow users to configure slatedb according to their environment.

w.r.t. performance numbers, I think it's pretty dependent on the instances you select (EBS, NVMe, SSD, and so on). This is why I think exposing the knobs and allowing users to configure slatedb is a better approach than making assumptions about what the environment is capable of.
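
Just to make the knobs concrete, something like the sketch below is what I have in mind. The struct and field names are made up for illustration; they are not SlateDB's actual options:

```rust
/// Hypothetical cache-related options; real SlateDB options may be named differently.
pub struct CacheOptions {
    /// Maximum bytes of memory for the in-memory block cache (this issue).
    pub block_cache_capacity_bytes: u64,
    /// Maximum bytes of local disk for the SST cache (#9); None disables it.
    pub disk_cache_capacity_bytes: Option<u64>,
    /// Directory for the on-disk SST cache when enabled.
    pub disk_cache_path: Option<std::path::PathBuf>,
}

impl Default for CacheOptions {
    fn default() -> Self {
        Self {
            // Arbitrary defaults for the sketch; users tune these per environment.
            block_cache_capacity_bytes: 64 * 1024 * 1024,
            disk_cache_capacity_bytes: None,
            disk_cache_path: None,
        }
    }
}
```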

@criccomini
Collaborator Author

@gesalous I've tried to clarify things more here: #20

@criccomini
Collaborator Author

@gesalous I've assigned this one to you. Feel free to unassign yourself if you decide not to work on it. :) Thanks!

@gesalous

Should we define the API of the block cache before delving into design details? The zero version could be as simple as an LRU cache with a queue and a hash table. I could proceed with a zero version of the API. What do you think?
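
As a strawman for that zero version, a deliberately naive LRU (O(n) on access) built from a `VecDeque` and a `HashMap` could look like the sketch below; the key/value types are generic placeholders:

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// Naive LRU: a queue tracks recency, a hash map holds the values.
/// Moving a key to the back on access is O(n); fine for a strawman, not production.
struct NaiveLru<K: Eq + Hash + Clone, V> {
    capacity: usize,
    order: VecDeque<K>,
    map: HashMap<K, V>,
}

impl<K: Eq + Hash + Clone, V> NaiveLru<K, V> {
    fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), map: HashMap::new() }
    }

    fn get(&mut self, key: &K) -> Option<&V> {
        if self.map.contains_key(key) {
            // Mark as most recently used.
            if let Some(pos) = self.order.iter().position(|k| k == key) {
                let k = self.order.remove(pos).unwrap();
                self.order.push_back(k);
            }
            self.map.get(key)
        } else {
            None
        }
    }

    fn insert(&mut self, key: K, value: V) {
        if self.map.insert(key.clone(), value).is_none() {
            self.order.push_back(key);
            // Evict the least recently used entry if over capacity.
            if self.order.len() > self.capacity {
                if let Some(evicted) = self.order.pop_front() {
                    self.map.remove(&evicted);
                }
            }
        } else if let Some(pos) = self.order.iter().position(|k| k == &key) {
            let k = self.order.remove(pos).unwrap();
            self.order.push_back(k);
        }
    }
}
```

A real implementation would want O(1) recency updates (an intrusive list or an off-the-shelf crate), but this captures the shape of the API.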

@criccomini
Collaborator Author

criccomini commented Apr 29, 2024

💯 This is what I was thinking. Basically just an LRU map from (sst id, block id) to block. IIRC, mini-lsm uses some third-party crate for this data structure.

EDIT: It's moka.

@criccomini
Collaborator Author

Here's the line in mini-lsm where they declare the BlockCache using moka:

https://github.com/skyzh/mini-lsm/blob/main/mini-lsm/src/lsm_storage.rs#L28
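
Adapting that here would be roughly the sketch below. It assumes we pull in the `moka` crate; the `Block` type and the capacity are placeholders, not SlateDB's actual types or defaults:

```rust
use std::sync::Arc;

use moka::sync::Cache; // e.g. moka = { version = "0.12", features = ["sync"] }

// Placeholder block type; SlateDB's real block representation lives elsewhere.
pub struct Block {
    pub data: Vec<u8>,
}

// Same shape as mini-lsm: an LRU-ish cache keyed by (sst id, block id).
pub type BlockCache = Cache<(usize, usize), Arc<Block>>;

fn main() {
    // Capacity here is an entry count; a byte-based weigher can be configured
    // via Cache::builder() if we want to bound memory usage instead.
    let cache: BlockCache = Cache::new(1_024);

    let block = Arc::new(Block { data: vec![0u8; 4096] });
    cache.insert((1, 0), block);

    if let Some(hit) = cache.get(&(1, 0)) {
        assert_eq!(hit.data.len(), 4096);
    }
}
```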

@criccomini
Collaborator Author

@gesalous checking in: you still interested in this one?

@gesalous

gesalous commented May 8, 2024

Yes, I read the design manifest in detail today. I'll get back with a design sketch.

@criccomini
Collaborator Author

Great, thanks for the update! 😄

@flaneur2020
Contributor

flaneur2020 commented Aug 18, 2024

I'm interested in this issue. Can you assign it to me if it hasn't been scheduled yet?

I'm mostly available on weekends. I'll unassign myself if I become unavailable. :)

@criccomini
Collaborator Author

@flaneur2020 Thanks for checking in! @pragmaticanon has been talking about working on this one. We're discussing it on Discord if you want to join the conversation.

@rodesai pointed out that it might make more sense to start with SST caching on disk (#9), since that one gives us the OS (in-memory) page cache for free. I'm inclined to agree with him. Perhaps you want to take #9 first?

@pragmaticanon
Contributor

Taking up this issue as per the discussion on Discord.
