#lsm-tree #key-value-store #async #async-api #store-key #api-calls

nightly lsm

An implementation of log-structured merge trees in pure Rust

2 unstable releases

0.4.1 Jul 25, 2024
0.4.0 Jul 24, 2024
0.1.0 Mar 20, 2021

#1455 in Database interfaces

MIT license

240KB
5.5K SLoC

Modular, Asynchronous Implementation of a Log-Structured Merge Tree

ci-badge license-badge crates-badge

Note: While this implementation is used by us and has not caused major problems, we do not recommend it yet production environments. Please use the leveldb or rocksdb crate for this purpose.

This implementation does not aim to reimplement LevelDB. The major differences are:

  • Separation of keys and values: Values can be stored seperately to increase compaction speed as outlined in the WiscKey paper
  • Concurrent compaction: Multiple threads can compact at the same time for higher write throughput
  • Async-support: All API calls are exposed as async functions
  • io_uring-support: For async file system access on Linux. Optional and still considered experimental.
  • Bloom filters for faster lookups

Supported Platfomrs and Architectures

Currently, the code is only tested with Linux on x86 machines, but it should run on most systems supported by the Rust compiler.

On-Disk Format

LSM stores data using zerocopy and (when possible) mmap to achieve high performance. The implementation does not account for endianness so on-disk formats are not portable. Replication across machines should be handled at a different layer of the system. However, we may add a converter tool in the future or an endianess feature flag if needed.

Planned Features

  • FLSM: Like PebblesDB LSM-rs will fragment the keyspace to reduce write amplification and increase compaction speed
  • Custom sorting functions
  • More modularity and configuration options

Feature Flags

  • snappy-compression: Use the snappy format to compress data on disk (enabled by default)
  • bloom-filters: Add bloom filters to data blocks for more efficient searching. (enabled by default)
  • async-io: Use tokio_uring for I/O instead of that of the standard library. Note, that this only works recent version of the Linux kernel. (disabled by default)
  • wisckey: Store keys and values separately. This usually results in higher throughput with slightly higher CPU-usage. (disabled by default)

Synchronous API

This crate exposes an async API intended to be used with Tokio or a similar runtime. Alternatively, you can use the lsm-sync crate included in this repo, which internally uses Tokio but expose a synchronous API.

Sort Order

You need to serialize your data in a way that its byte representation maintains the same ordering as the unserialized data. For example, you may want to use big endian encoding so that numerical values are ordered correctly.

Usage

You can create or open a new databse instance as shown below.

use lsm::{Database, Params};

// Set options here, such as the location of the database files
let params = Params {
    db_path,
    ..Default::default()
};

// Instantiate database
let database = Database::new_with_params(SM, params)
    .await
    .expect("Failed to create database instance");

To write to the database use the get call. Note that the crate only supports writing byte vectors. (De-)serialization is supposed to happen at another layer.

let key = String::from("mykey").into_bytes();
let value = String::from("hello world").into_bytes();

database.put(key, value).await.expect("Writing to database failed");

When reading, LSM will return a reference to the data to avoid copying.

let value_ref = database.get(&key).await.expect("Reading failed");

// Returns a slice to the data
let data: &[u8] = value_ref.get_value();

// Assuming the put from above workd, this will print "hello world"
println!("{}", std::str::from_utf(data).unwrap());

Please refer to the tests for more examples to how to use the crate.

Tests

This library ships with several tests. We provide a justfile for convenience:

just test #runs all tests for all configurations
just lint #runs cargo clippy

Notes on io-uring

Currently, the io-uring feature relies on tokio-uring-executor, a simplistic multi-threaded wrapper around tokio-uring. Eventually tokio-uring will support multiple threads natively and this workaround will be removed.

I would also like to add support for more mature io_uring runtimes such as gloomio but only have limited time to work on this crate. Help is very welcome.

Similar Crates

This is an incomplete list of crates that provide similar functionality. Please reach out if you know of others to add.

LSM trees

  • rust-rocksdb: Rust bindings for RocksDB
  • leveldb: Rust bindings for LevelDB
  • wickdb: Rust re-implementation of vanilla LevelDB
  • agatedb: A WiscKey implementation in Rust for TiKV

Other Key-Value Stores

These differ significantly in their approach but also provide a key-value store abstraction

Dependencies

~6–16MB
~181K SLoC