
EclipseStore TAKES CONTROL OF CONCURRENCY

Markus Kett

In the world of data management, ensuring data integrity and consistency is paramount, especially for mission-critical applications in finance, healthcare, and other sensitive sectors. This is where the concept of ACID comes in.

ACID: The Bedrock of Reliable Data

ACID stands for Atomicity, Consistency, Isolation, and Durability. These properties are essential for any database system that handles transactions. Let’s break down what each means:

  • Atomicity: All operations within a transaction either succeed together or fail together. This ensures that the data is never left in a partially updated state.
  • Consistency: The database enforces data integrity rules. Transactions may not leave the data in an invalid state.
  • Isolation: Transactions execute as if they were isolated from each other. Changes made in one transaction are invisible to other ongoing transactions until the first one is committed.
  • Durability: Once a transaction is committed, the changes are permanently persisted in storage. This ensures that data is not lost even in case of system crashes or power failures.

EclipseStore: Speed Meets Simplicity

EclipseStore is a revolutionary Java persistence framework for building cutting-edge Java in-memory database applications. It offers blazing-fast performance by keeping and managing an object graph in RAM for real-time data access and manipulation. This object graph acts as an in-memory database, with a root object serving as the entry point. The object graph, mostly structured like a tree, can be designed freely by using any Java type. Complex circular references are also supported. Any object reachable from the root object can be persistently stored on disk for durability.

Searching and filtering data on an object graph is a breeze thanks to the Java Streams API. However, for these operations to work flawlessly, data needs to be pre-loaded into RAM before any search or filter can be applied. The introduction of MicroStream’s GigaMap simplified this process by automating the pre-loading of lazy data, making it very efficient and user-friendly.
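To make the Streams-based search concrete, here is a minimal sketch. The Person class, its fields, and the method name are assumptions made up for this illustration; they are not part of EclipseStore. The list simply stands in for a collection that is reachable from the root object and already loaded into RAM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain class for illustration only.
class Person {
    final String firstName;
    final String lastName;
    final int age;

    Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
}

class StreamSearchExample {
    // Returns all persons with the given last name who are at least minAge years old.
    static List<Person> findByLastName(List<Person> persons, String lastName, int minAge) {
        return persons.stream()
            .filter(p -> p.lastName.equals(lastName))
            .filter(p -> p.age >= minAge)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("John", "Doe", 42),
            new Person("Jane", "Doe", 17),
            new Person("Max", "Miller", 30));

        List<Person> adults = findByLastName(persons, "Doe", 18);
        System.out.println(adults.size() + " match, first name: " + adults.get(0).firstName);
    }
}
```

The filter runs entirely on objects in memory, which is why the data has to be loaded before the stream is evaluated.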

The Concurrency MYSTERY: Java’s Power, Developer’s Pain

While EclipseStore excelled in speed and simplicity, a long-standing challenge remained: concurrency control.

Databases Care for Concurrency – Don’t Be Fooled

As database systems are ACID compliant and provide transactions, it’s often assumed that databases handle concurrency seamlessly, while developers have to care for concurrency on the object graph in memory when using EclipseStore. The reality is more nuanced. While databases offer robust mechanisms to ensure data integrity and consistency, they are not a silver bullet. Developers still play a crucial role in ensuring data consistency and integrity, as well as concurrency, especially in complex applications.

BEGIN TRANSACTION;
    INSERT INTO Customers (Name, Lastname) VALUES ('John', 'Doe');
COMMIT TRANSACTION;

Developers have to choose the right isolation level, because it can significantly impact both performance and data consistency. Understanding and leveraging different locking strategies, such as optimistic and pessimistic locking, is essential to avoid deadlocks and optimize performance. Efficiently written queries can minimize contention and improve scalability, even in high-traffic environments. In some cases, application-level synchronization might be necessary to guarantee data integrity, particularly when dealing with complex data structures or custom business logic.
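The difference between the two locking strategies can be illustrated without a database. The following is only a sketch of the optimistic idea in plain Java, with hypothetical class and method names: a writer reads the current state, prepares a new version, and commits only if nobody else committed in the meantime; otherwise it retries. A pessimistic approach would instead take a lock before reading.

```java
import java.util.concurrent.atomic.AtomicReference;

class OptimisticCounter {
    // Immutable snapshot: a value together with the version that wrote it.
    static final class Versioned {
        final long version;
        final int value;

        Versioned(long version, int value) {
            this.version = version;
            this.value = value;
        }
    }

    private final AtomicReference<Versioned> state =
        new AtomicReference<>(new Versioned(0, 0));

    // Optimistic update: read, compute, then commit only if the state is unchanged.
    int add(int delta) {
        while (true) {
            Versioned current = state.get();
            Versioned next = new Versioned(current.version + 1, current.value + delta);
            if (state.compareAndSet(current, next)) {
                return next.value;
            }
            // Another thread committed first; retry with the fresh state.
        }
    }

    int value() {
        return state.get().value;
    }

    long version() {
        return state.get().version;
    }

    // Runs 'threads' workers that each add 1 to a shared counter 'perThread' times.
    static int runDemo(int threads, int perThread) {
        OptimisticCounter counter = new OptimisticCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1_000)); // no update is lost: 4000
    }
}
```

Optimistic schemes tend to work well when conflicts are rare, because no thread ever waits for a lock; under heavy contention the retries can become more expensive than simply locking.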

Only by carefully considering these factors can developers benefit from the database features and build robust and scalable applications. However, it’s important to recognize that database systems, while powerful, are not a magic solution to all concurrency problems. Developers must still actively participate in designing and implementing effective concurrency control strategies.

EclipseStore, initially, relied solely on Java’s concurrency features. This meant developers had to take on the responsibility of implementing concurrency control measures themselves. Managing concurrency effectively can be a complex and error-prone endeavor. However, Java offers a wide range of options for concurrency control, including synchronized blocks, reentrant locks, and more.

Synchronized Blocks

Synchronized blocks are the simplest way to coordinate access to shared resources among multiple threads. They ensure that only one thread at a time can execute a critical section of code, preventing race conditions and data corruption.

How it works:

  1. Lock Acquisition: When a thread enters a synchronized block, it acquires a lock on the object associated with the block.
  2. Exclusive Access: While a thread holds the lock, no other thread can enter the block. This ensures exclusive access to the shared resource.
  3. Lock Release: Once the thread finishes executing the synchronized block, it releases the lock, allowing other threads to acquire it.
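The three steps above can be sketched with a shared counter. The class and method names are made up for this illustration. Without the synchronized block, concurrent increments could be lost; with it, each increment happens under the lock.

```java
class SynchronizedCounterDemo {
    private final Object lock = new Object(); // the object whose lock guards 'count'
    private int count = 0;

    void increment() {
        synchronized (lock) {   // 1. acquire the lock
            count++;            // 2. exclusive access to the shared field
        }                       // 3. the lock is released when the block is left
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts 'threads' workers that each increment the counter 'perThread' times.
    static int run(int threads, int perThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // every increment is counted: 40000
    }
}
```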

Imagine an object graph in your application containing a list of people. Multiple parts of your application (threads) need to access this list at the same time. Now, let’s say you want to add new people to this list from an external source. If one thread is busy adding new people to the list (importing), and another thread tries to read or modify the same list at the same time, it can lead to inconsistencies or errors. This is because the list might be in an intermediate state while the import is happening. To prevent these problems, the thread that is importing needs a way to temporarily stop other threads from accessing the list until the import is finished.

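void importPersons(List<Person> persons) {
      synchronized(persons)
      {
            Person importedPerson = this.importNextPerson();
            while(importedPerson != null) {
                  persons.add(importedPerson);
                  // Fetch the next person; without this the loop would never terminate
                  importedPerson = this.importNextPerson();
            }
      }
}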

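If we introduce a PersonsManager and go through this manager every time we add, remove, or modify a person, the code becomes even simpler. A synchronized method locks on the manager instance itself, so the list is only protected as long as every access really goes through the manager.

synchronized void importPersons(List<Person> persons) {
      Person importedPerson = this.importNextPerson();
      while(importedPerson != null) {
            persons.add(importedPerson);
            // Fetch the next person; without this the loop would never terminate
            importedPerson = this.importNextPerson();
      }
}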

The Problem with Long-Running Synchronized Blocks:

If a synchronized block takes a long time to execute (like 5 minutes for an import), it can significantly impact the performance of other threads that need to access the shared resource. This is because all other threads will be blocked until the synchronized block completes.

A Better Approach: Using a Local List

To avoid this bottleneck, we can introduce a temporary, local list within the importer thread. This local list is exclusive to the importer, so there’s no need for synchronization while building it. The importer can take as long as it needs to populate this local list without affecting other threads.

The Final Merge: Once the import is complete and the local list is fully populated, we can then:

  1. Acquire the lock: Briefly lock the global list to prevent concurrent modifications.
  2. Copy references: Instead of copying the entire dataset (which would be inefficient, especially for large datasets), we can simply copy the references from the local list to the global list. This is a much faster operation.
  3. Release the lock: Once the references are copied, we release the lock on the global list.

void importPersons(List<Person> persons) {
      List<Person> importedPersons = this.importAllPersons();
      
      synchronized(persons) {
            persons.addAll(importedPersons);
      }
}

Defensive Copy:

If you want to search such a list without blocking it for long, you can also turn the tables and use a defensive copy. The lock is held only while the references are copied and is then released again; after that, only the search thread accesses the copy.

void searchInPersons(List<Person> persons) {
      List<Person> defensiveCopy = new ArrayList<>();
      
      synchronized(persons) {
            defensiveCopy.addAll(persons);
      }
      
      for(Person p : defensiveCopy) {
            if(p.height == 1.75f) {
                  System.out.println(p);
            }
      }
}

As you can see, with synchronized blocks, concurrency handling is not rocket science. With a well-thought-out application design, all requirements can be implemented elegantly and with EclipseStore the possibilities of Java can be used without restrictions. However, synchronized blocks also come with various disadvantages:

  1. Performance Overhead: Synchronized blocks can significantly impact performance, especially in high-concurrency scenarios. Acquiring and releasing locks incurs overhead, slowing down execution.
  2. Potential for Deadlocks: If multiple threads acquire locks on different objects in a circular manner, a deadlock can occur, where each thread waits for the other to release the lock, leading to a standstill.
  3. Coarse-Grained Locking: Synchronizing large blocks of code can limit concurrency, as only one thread can access the shared resource at a time. This can lead to reduced performance and scalability.
  4. Complexity: Using synchronized blocks can make code more complex and harder to understand, especially when dealing with multiple synchronized methods and objects.
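The deadlock risk from point 2 is commonly reduced by always acquiring locks in the same global order. The following sketch assumes a hypothetical Account class with a unique id that defines this order; it is one well-known technique, not something specific to EclipseStore.

```java
class Account {
    final int id;          // unique id, used to define a global lock order
    private long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    synchronized void deposit(long amount) { balance += amount; }
    synchronized void withdraw(long amount) { balance -= amount; }
    synchronized long balance() { return balance; }
}

class OrderedTransfer {
    // Always locks the account with the smaller id first, whatever the transfer direction.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.withdraw(amount);
                to.deposit(amount);
            }
        }
    }

    // Two threads transfer in opposite directions; returns the combined balance afterwards.
    static long runDemo(int rounds) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return a.balance() + b.balance();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000)); // money is neither created nor lost: 2000
    }
}
```

If each thread instead locked its own "from" account first, the two threads could end up holding one lock each while waiting for the other, which is exactly the circular wait described above.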

To mitigate these drawbacks, it’s often recommended to use more fine-grained synchronization techniques like Reentrant Locks or concurrent data structures.

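import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock.ReadLock;
import java.util.concurrent.locks.ReentrantReadWriteLock.WriteLock;
import java.util.function.Consumer;

public class Customers {
    // transient: the lock is runtime state and should not be persisted with the object graph
    private final transient ReentrantReadWriteLock reentrantLock = new ReentrantReadWriteLock();
    private final List<Customer> customers = new ArrayList<>();

    public void addCustomer(Customer customer) {
        WriteLock writeLock = this.reentrantLock.writeLock();

        // Exclusive lock: no readers or other writers while the list is modified and stored
        writeLock.lock();
        try {
            customers.add(customer);
            Application.storageManager().store(customers);
        } finally {
            writeLock.unlock();
        }
    }

    public void traverseCustomers(Consumer<Customer> consumer) {
        ReadLock readLock = this.reentrantLock.readLock();
        // Shared lock: several readers may traverse at once, writers have to wait
        readLock.lock();
        try {
            customers.forEach(consumer);
        } finally {
            readLock.unlock();
        }
    }
}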

At this point, we will skip a deeper treatment of reentrant locks and the other concurrency concepts that Java offers, because EclipseStore now offers the solution that many developers have been longing for.

EclipseStore’s New Locking API Simplifies Concurrency

Recognizing the importance of user concerns, the EclipseStore team decided to take a significant step forward. A brand new locking API has been added to the Eclipse Serializer project, which forms the core dependency of EclipseStore. This new API introduces locking functionality, enabling developers to easily manage concurrent access to the object graph. The locking API will be available with the upcoming version 2.1.

Leveraging the Power of Reentrant Locks

The new locking API builds on the robust mechanism of reentrant locks to ensure thread safety and data consistency. Because a reentrant lock can be re-acquired by the thread that already holds it, nested calls into locked code do not block themselves, and routing all access through read and write locks helps developers avoid common concurrency pitfalls such as self-deadlocks and race conditions. This simplifies the process of managing concurrent access to the object graph, making it more accessible to a wider range of developers.
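What "reentrant" means can be shown with the JDK's own ReentrantLock. This is a small sketch with hypothetical class and method names, independent of the EclipseStore API: a method that already holds the lock calls another method that takes the same lock again, and the thread does not block on itself.

```java
import java.util.concurrent.locks.ReentrantLock;

class ReentrancyDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int maxHoldCount = 0;

    void outer() {
        lock.lock();
        try {
            inner(); // re-acquires the lock this thread already holds
        } finally {
            lock.unlock();
        }
    }

    private void inner() {
        lock.lock(); // succeeds immediately: same thread, hold count goes from 1 to 2
        try {
            maxHoldCount = Math.max(maxHoldCount, lock.getHoldCount());
        } finally {
            lock.unlock();
        }
    }

    int maxHoldCount() {
        return maxHoldCount;
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        ReentrancyDemo demo = new ReentrancyDemo();
        demo.outer();
        System.out.println("max hold count: " + demo.maxHoldCount()); // 2
        System.out.println("still locked: " + demo.isLocked());       // false
    }
}
```

Reentrancy only protects a thread from blocking on a lock it already owns. Deadlocks between different threads that take several locks in different orders still have to be avoided by design.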

Simplified Concurrency Control

With the new locking API, developers can now handle concurrency in EclipseStore with a level of simplicity that rivals traditional database systems. The API provides intuitive mechanisms to define read and write operations, ensuring that concurrent access to shared data is managed efficiently. This makes it easier to build reliable and scalable applications using EclipseStore.

From Java Complexity to Database-like Transactions

The introduction of the locking API in Eclipse Serializer represents a major turning point for EclipseStore. Developers can now handle concurrency with a similar approach to writing transactions for database systems. This significantly simplifies development and reduces the risk of concurrency errors.

1. Using LockedExecutor

public class Customers {
    private final transient LockedExecutor executor = LockedExecutor.New();
    private final List<Customer> customers = new ArrayList<>();

    public void addCustomer(Customer c) {
        executor.write(() -> {
            this.customers.add(c);
            Application.storageManager().store(this.customers); // Assuming storageManager is thread-safe
        });
    }

    public void traverseCustomers(Consumer<Customer> consumer) {
        executor.read(() -> this.customers.forEach(consumer));
    }
}

  • LockedExecutor provides read and write locks for critical sections
  • Keep Application.storageManager().store(…) calls in locked section as well

2. Using LockScope

public class Customers extends LockScope {
    private final List<Customer> customers = new ArrayList<>();

    public void addCustomer(Customer c) {
        write(() -> {
            customers.add(c);
            Application.storageManager().store(customers); // Assuming storageManager is thread-safe
        });
    }

    public void traverseCustomers(Consumer<Customer> consumer) {
        read(() -> customers.forEach(consumer)); // Read lock keeps the iteration consistent with concurrent writes
    }
}

  • LockScope holds a LockedExecutor internally and provides delegate methods for reads and writes
  • Similar to the LockedExecutor approach, we can use write for adding customers and storing them.
  • For traverseCustomers, use read so that the iteration sees a consistent list. ArrayList.forEach does not copy the list; it iterates over the live list and can throw a ConcurrentModificationException if another thread modifies it at the same time. The read lock can only be omitted if the list is guaranteed not to change while it is being traversed.
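To get a feeling for what such read and write delegates do, here is a plain-Java sketch built on the JDK's ReentrantReadWriteLock. SimpleLockedExecutor and CustomerNames are hypothetical names invented for this illustration. This is not the Eclipse Serializer implementation, only an assumption about the general pattern; the real API may differ in details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical stand-in that mimics the read/write style shown above.
class SimpleLockedExecutor {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Runs the action under the shared read lock.
    void read(Runnable action) {
        lock.readLock().lock();
        try {
            action.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs the action under the exclusive write lock.
    void write(Runnable action) {
        lock.writeLock().lock();
        try {
            action.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}

class CustomerNames {
    private final SimpleLockedExecutor executor = new SimpleLockedExecutor();
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        // A store call for persistence would also belong inside this write section.
        executor.write(() -> names.add(name));
    }

    void forEach(Consumer<String> consumer) {
        executor.read(() -> names.forEach(consumer));
    }

    // Several writer threads add names concurrently; returns how many names were stored.
    static int runDemo(int writers, int perWriter) {
        CustomerNames customers = new CustomerNames();
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perWriter; j++) {
                    customers.add("customer-" + id + "-" + j);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        int[] count = new int[1];
        customers.forEach(name -> count[0]++);
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 500)); // all additions arrive: 2000
    }
}
```

The appeal of this style is that the lock handling lives in one place, so the try/finally boilerplate cannot be forgotten in individual methods.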

EclipseStore: Unlocking ACID Compliance for Business-Critical Applications

For developers working with high-performance data storage and retrieval, EclipseStore has always been a compelling choice. Its focus on raw speed and efficient data structures made it ideal for applications demanding fast searches and data access. With the introduction of a powerful new locking API, EclipseStore now boasts full ACID compliance. This means data integrity and consistency are guaranteed, placing EclipseStore on par with traditional database systems regarding transaction safety.

EclipseStore is now ready for the big leagues: banking, finance, healthcare, and other sectors where sensitive data needs top-notch security and reliability, alongside lightning-fast performance.

This milestone marks a significant advancement for EclipseStore and its community. By simplifying concurrency control and ensuring ACID compliance, EclipseStore builds trust and opens doors for broader adoption across a wider range of applications. The future looks bright for EclipseStore, empowering developers to build fast, resilient, and reliable data-driven applications.

EclipseStore is an open-source project under the Eclipse Foundation (www.eclipsestore.io). Its core technology, the Eclipse Serializer, is also an open-source project available as a standalone library. The locking API will be released with the upcoming version 2.1. With EclipseStore, you get the best of both worlds: blazing-fast performance, robust ACID compliance, and a supportive open-source community.
