OrioleDB beta7: Benchmarks

7 min read

Alexander Korotkov, Creator of OrioleDB
Pavel Borisov, PostgreSQL contributor

OrioleDB is a storage extension for PostgreSQL which uses PostgreSQL's pluggable storage system. Designed as a drop-in replacement for PostgreSQL's existing Heap storage, OrioleDB aims to overcome scalability bottlenecks and fully utilize modern hardware capabilities. By integrating seamlessly with PostgreSQL, it offers improved performance, efficiency, and scalability without sacrificing the robustness and reliability that PostgreSQL is known for.

Today we’re releasing OrioleDB version beta7. This marks a significant step in delivering a high-performance, next-generation storage engine for Postgres users. OrioleDB is designed to extract the full potential of modern hardware, offering superior performance, efficiency, and scalability.

OrioleDB design choices

OrioleDB is built from the ground up to leverage modern hardware, reduce maintenance needs, and enhance distributed capabilities. The key technical decisions forming the foundation of OrioleDB are:

  1. Elimination of Buffer Mapping and Lock-less Page Reading: In OrioleDB, in-memory pages are directly linked to storage pages, eliminating the need for buffer mapping and its associated bottlenecks. Additionally, in-memory page reading is performed without atomic operations, enabling lock-less access. Together, these design choices significantly elevate vertical scalability for PostgreSQL.
  2. MVCC Based on the UNDO Log Concept: OrioleDB employs a Multi-Version Concurrency Control (MVCC) mechanism based on an undo log. Old versions of tuples are evicted into undo logs—forming undo chains—rather than causing bloat in the main storage system. Page-level undo records allow the system to promptly reclaim space occupied by deleted tuples. Combined with page merging, these mechanisms eliminate bloat in most cases. As a result, dedicated vacuuming of tables is unnecessary, removing a common cause of system performance degradation and database outages.
  3. Copy-on-Write Checkpoints and Row-Level WAL: OrioleDB utilizes copy-on-write checkpoints to provide structurally consistent snapshots of data at all times. This approach is friendly to modern SSDs and enables row-level Write-Ahead Logging (WAL). Row-level WAL is easy to parallelize (already implemented), compact, and suitable for active-active multi-master configurations (planned).

Benchmarking OrioleDB vs PostgreSQL Heap

To illustrate the performance characteristics of OrioleDB we used the TPC-C benchmark, a complex test that simulates realistic transactional workloads and is widely regarded as an industry standard for database benchmarking.

TPC-C Warehouses

TPC-C measures data volume in “warehouses”: 1 warehouse takes about 100 MB. We ran the benchmarks at various sizes to measure the performance of OrioleDB compared to default Postgres heap tables.
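
As a rough sanity check after loading a TPC-C dataset, you can compare its on-disk size against the warehouse count using standard PostgreSQL size functions (the database name tpcc below is just an assumed example):

    -- Rough size check after loading a TPC-C dataset;
    -- "tpcc" is an assumed database name for illustration.
    SELECT pg_size_pretty(pg_database_size('tpcc')) AS dataset_size;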

Benchmark: 100 warehouses

The 100-warehouse run mainly highlights the WAL-insertion bottleneck. OrioleDB was 2.3x faster than heap:

Benchmarked on a 64-core c7g.metal instance with 20 GB of RAM allocated as shared buffers.


Benchmark: 500 warehouses

The 500-warehouse run highlights the shared-memory cache bottleneck. OrioleDB was 5.5x faster than heap:

Benchmarked on a 64-core c7g.metal instance with 20 GB of RAM allocated as shared buffers.


Benchmark: 1000 warehouses

OrioleDB implements index-organized tables, providing better data locality than Postgres's default heap-organized tables. When the data does not fit in the OS page cache, this optimization improves throughput by reducing disk IO. This is visible in the results of TPC-C with 1000 warehouses, which did not fit in the OS page cache on the target hardware.
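
For intuition, "index-organized" means rows are physically stored in primary key order, so a scan constrained by the leading key columns reads a compact range of pages rather than scattered heap pages. A minimal sketch (the table and column names below are hypothetical, not part of the TPC-C schema):

    -- Hypothetical example: rows are clustered by (warehouse_id, order_id),
    -- so this scan touches a contiguous range of pages in an
    -- index-organized OrioleDB table.
    CREATE TABLE orders (
      warehouse_id int,
      order_id     bigint,
      created_at   timestamptz,
      PRIMARY KEY (warehouse_id, order_id)
    ) USING orioledb;

    SELECT count(*) FROM orders WHERE warehouse_id = 42;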

Workloads larger than the OS memory cache require a lot of IO, so for a database like this one (roughly 100 GB with frequent updates across all of the data) a local SSD with low latency and high throughput is typically a good choice. That is what we used in this test.

OrioleDB was 2.7x faster than heap:

Benchmarked on a 64-core c7gd.metal instance with 20 GB of RAM allocated as shared buffers.


Other optimizations

There are additional OrioleDB improvements that are not covered by these benchmarks but are still important:

  • Built-in compression, which reduces the storage footprint by up to 5x in typical cases,
  • UNDO log, which avoids bloat and eliminates the need for VACUUM,
  • Removal of the sub-transaction hazard.

Try it yourself

  • Install OrioleDB

    You can install OrioleDB by following the instructions on GitHub for building from source or using Docker.

    Alternatively, you can run OrioleDB on Supabase. OrioleDB was released in Public Alpha on the Supabase platform for testing purposes and is not suitable for production use cases yet. Read the blog post for more details.

  • Use OrioleDB for Your Tables

    To utilize OrioleDB for your tables, add USING orioledb to your table's Data Definition Language (DDL) statement:

    CREATE TABLE my_table (
      id SERIAL PRIMARY KEY,
      data TEXT
    ) USING orioledb;
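
    To confirm which access method a table uses, you can query the standard PostgreSQL catalogs; for the table above it should report orioledb:

    -- Check which table access method my_table uses.
    SELECT c.relname, am.amname
    FROM pg_class c
    JOIN pg_am am ON am.oid = c.relam
    WHERE c.relname = 'my_table';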

    For more detailed instructions, refer to the OrioleDB Getting Started Guide.

    Alternatively, you can set the default_table_access_method configuration parameter to orioledb (for example, per session with SET, or cluster-wide in postgresql.conf). New tables created without an explicit USING clause will then use OrioleDB:

    SET default_table_access_method = 'orioledb';

    For more information, see the PostgreSQL documentation on default_table_access_method.
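
    With that setting in place, tables created without an explicit USING clause use OrioleDB automatically. A minimal sketch (the events table is just an illustrative name):

    -- Confirm the current default, then create a table that inherits it.
    SHOW default_table_access_method;   -- expected: orioledb

    CREATE TABLE events (
      id bigserial PRIMARY KEY,
      created_at timestamptz DEFAULT now()
    );  -- stored with OrioleDB because of the default above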

  • Perform Testing

    Feel free to test OrioleDB using your own workload or by employing well-known benchmarking tools like go-tpc, HammerDB, or others. These tools can help you assess the performance and scalability of OrioleDB under various scenarios.

Our Vision for OrioleDB

We're not just building a better storage engine; we're envisioning the future of PostgreSQL in the cloud era.

The Go-To PostgreSQL Storage Engine

We envision OrioleDB becoming the default choice for PostgreSQL, replacing Heap. There's precedent for such a shift in the database world. MySQL's adoption of InnoDB and MongoDB's switch to WiredTiger over MMAPv1 are prime examples of storage engines transforming their respective ecosystems.

Decoupled Storage and Compute

We are implementing bottomless storage for OrioleDB with S3 integration. This decoupled architecture allows virtually unlimited storage capacity while enabling scalable compute resources. It paves the way for more flexible and resilient database deployments.

Open-Source Serverless PostgreSQL

We're committed to keeping OrioleDB fully open-source and PostgreSQL-licensed. Our goal is to provide a storage engine that is fast, serverless, self-hostable, and pure Postgres, in contrast to "PostgreSQL-compatible" databases like Amazon Aurora. A modern storage engine without vendor lock-in.

Hybrid Workloads with Columnar Indexes

We plan to introduce a columnar index type, enabling efficient handling of analytical workloads alongside transactional operations. This hybrid approach means you no longer need separate systems for OLTP (online transactional processing) and OLAP (online analytical processing) workloads.

Multi-Master Replication

Our roadmap includes support for multi-master configurations, enhancing availability and fault tolerance. This will allow for read and write operations across multiple nodes, improving performance and resilience.

Current Limitations

While OrioleDB offers significant advantages, it's important to be aware of its current limitations as of the beta7 release.

  • Index Support: Currently, only B-tree indexes are supported, which means features like pgvector's HNSW indexes are not yet available. We are actively working on an Index Access Method bridge to support all index types available for heap storage.
  • Prepared Transactions: Currently, prepared transactions are not supported for transactions involving OrioleDB tables.
  • REINDEX CONCURRENTLY: This command is currently not supported for indexes over OrioleDB tables.

The work on the items above will be completed before the project reaches General Availability (GA) in 2025. A full list of limitations can be found in the docs.

Get Involved

We're excited about the possibilities OrioleDB brings to the PostgreSQL ecosystem and invite you to join us on this journey. Please try OrioleDB and share your experience and your expectations for the new functionality in GitHub issues and discussions.

If you are a PostgreSQL provider and want to support OrioleDB on your platform, reach out: we'd love to have more design partners and to collaborate with other hosting providers.