Ayende @ Rahien

Oct 19 2023

.NET Rocks: Data Sharding with Oren Eini

time to read 1 min | 122 words

Tags:

You can listen to me talk to Carl & Richard on RavenDB Sharding here.

What is data sharding, and why do you need it? Carl and Richard talk to Oren Eini about his latest work on RavenDB, including the new data sharding feature. Oren talks about the power of sharding a database across multiple servers to improve performance on massive data sets. While a sharded database is typically in a single data center, it is possible to distribute the shards across multiple locations. The conversation explores the advantages and disadvantages of the different approaches, including that you might not need it today, but it's great to know it's there when you do!

This episode was recorded a while ago, and just went live.

Oct 13 2023

ChallengeFastest node selection metastable error state–answer

time to read 2 min | 290 words

Tweet Share Share 1 comments

Tags:

In the previous post, I showed a very simple request router that would always select the fastest node. That worked for a long while, until it didn’t, and the challenge is figuring out why.

As it turns out, the issue is a simple one of spooky action at a distance. Here is what happens. Assume that we have three servers and 10 clients. Each server is sized to handle 4 clients. So far, so good, the system has the capacity to spare.

The problem is in the manner in which clients will detect which is the fastest node in the cluster. The only thing that is considered is the state of the node at the time of selection. At that time, we may end up with all the nodes selecting one particular node as the fastest.

In other words, we have three servers, two of them have no clients talking to them and one of the servers has all the clients talking to it. That results in that node going down, obviously. The clients would then react appropriately, and select a new node to talk to. All of them would do that, find the fastest node, and… bring it down as well. Rinse & repeat.

The issue can be stated as Time Of Check vs Time Of Use, but also as a race condition, where all individual nodes end up doing a synchronized “wave” operation that kills the system.

How do you prevent this?

You introduce randomness into the system. You don’t test the status once, but re-check on a regular basis so you can respond to shifting load. You should also introduce randomness into the process. So the nodes won’t all do this exactly at the same time and end up in the same position.

Oct 12 2023

ChallengeFastest node selection metastable error state

time to read 1 min | 186 words

Tweet Share Share 12 comments

Tags:

Side note: Current state in Israel right now is bad. I’m writing this blog post as a form of escapism so I can talk about something that makes sense and follow logic and reason. I’ll not comment on the current status otherwise in this area.

Consider the following scenario. We have a bunch of servers and clients. The clients want to send requests for processing to the fastest node that they have available. But the algorithm that was implemented has an issue, can you see what this is?

To simplify things, we are going to assume that the work that is being done for each request is the same, so we don’t need to worry about different request workloads.

The idea is that each client node will find the fastest node (usually meaning the nearest one) and if there is enough load on the server to have it start throwing errors, it will try to find another one. This system has successfully spread the load across all servers, until one day, the entire system went down. And then it stayed down.

Can you figure out what is the issue?

Oct 02 2023

RavenDB version 6.0 is now live

time to read 9 min | 1678 words

Tweet Share Share 2 comments

Tags:

Today marks a very long journey for RavenDB as we release version 6.0 into the wild. This completes a multi-year effort on our part, which I’m really excited to share with you.

In version 6.0 the RavenDB team had a number of goals, with the primary one being, as always, providing a simple and powerful platform for your data and documents.

The full list of changes in the release is here and you can already download the binaries of the new release to try it out.

Below you’ll find more information about some of the highlights for the new release, but the executive summary for the technically inclined is:

Deep integration with Kafka & RabbitMQ - allowing to send and consume messages directly from RavenDB.
Sharding support allows you to run massively large databases easily and cheaply.
Corax is a new indexing engine for RavenDB that can provide a 10x performance improvement for both indexing and queries.

With this new release we are also announcing a pricing update for our commercial customers. You can find more details at the bottom of this post.

Before we get to the gist of the changes, I want to spend a minute talking about this release. One of the reasons that we took so long to craft this release was an uncompromising eye toward ease of use, performance, and quality of experience.

It may sound strange to talk about “quality of experience” when discussing a database engine, we aren’t talking about a weekend trip or a romantic getaway, after all. The reason RavenDB exists is that we found other databases to be complicated and delicate beasts, and we wanted to create a database that would be a joy to work with. Something that you could throw your data and queries to and would just work.

With version 6.0, I’m happy to say that we managed to keep on doing just that while providing you with a host of new features and capabilities. Without further ado, allow me to tell you about the highlights of this release.

Sharding

Sharding is the process of splitting your data among multiple nodes, to allow you to scale beyond the terabyte range. RavenDB actually had this feature, as far back as 2010, when the 1.0 version came out. I’m speaking about this today because we have decided to revamp the entire approach to sharding in the new release. Sharding in RavenDB version 6.0 consists of selecting the sharded option on database creation time, and everything else is handled by the cluster. Splitting the data across the nodes, balancing load and giving you the feeling that you are working on a unified database is all our responsibility.

A chief goal of the new sharding feature in RavenDB was to make sure that you don’t have to be aware that you are running in a sharded environment. In fact, you can even switch the backend to a sharded database behind your application’s back, with no modifications to the application.

Your systems can take advantage of the sharding feature, but they aren’t forced to. This makes it easier when you need to scale because you don’t need an expensive rewrite or stop the world migration to move to the new architecture.

RavenDB, with or without sharding, offers you the same capabilities. Features such as indexing, map/reduce, ETL tasks, or subscriptions all work in the same way regardless of how you choose to store your data. A successful metric for us is that after you switch over to sharding, the only thing that you’ll notice is that you can now scale your systems even more. And, of course, you could do that easily, safely, and quickly.

I recorded a webinar showcasing our new sharding approach that you can watch.

Corax Indexing Engine

Corax indexing engine is now available in version 6.0. We have been working on that since 2014, and as you can imagine, it hasn’t been a simple -to-solve challenge. Corax has a single task, to be able to do everything that we use Lucene for, but do that faster. In fact, to do things much faster.

RavenDB has used Lucene as its indexing engine from the start, and Lucene has been the industry benchmark for search engines for a long time. In the context of RavenDB, we have optimized Lucene heavily for the past 15 years. When building Corax, it wasn’t sufficient to match what Lucene can do but to also significantly outperform it.

With version 6.0, we now offer both Corax and Lucene as indexing engines for RavenDB. You can choose to use either engine (globally or per index). Upgrading from an older version of RavenDB, you’ll keep on using the Lucene engine but can create new indexes with Corax.

Corax outperforms Lucene on a wide range of scenarios by an order of magnitude. This includes both indexing time and query latency. Corax also manages to be far faster than Lucene while utilizing less CPU and memory. One of the ways it does that is by preparing, in advance, the relevant data structures that Lucene needs to compute at runtime.

Corax indexes tend to consume about 10% - 20% more disk space than Lucene, as a trade for offering lowered memory usage, far faster querying performance, and reduced indexing time.

In the same manner as sharding, switching to the new indexing engine will get you performance improvements, with no difference in capabilities or features.

Kafka & RabbitMQ Integration

Kafka & RabbitMQ integration has been extended to version 6.0. RavenDB already supported writing to Kafka and RabbitMQ and we have now extended this capability to allow you to read messages from Kafka and RabbitMQ and turn them into documents.

This capability provides complete support for queuing infrastructure integration. RavenDB can work with existing pipelines and easily expose its data for the rest of the organization as well as consume messages from the queue and create or update documents accordingly.

Such documents are then available for querying, aggregation, etc. This makes it trivial to use a low-code solution to quickly and efficiently plug your system into Kafka and RabbitMQ. In the style of all RavenDB features, we take upon ourselves all the complexity inherent in such a solution. You only need to provide the connection string and the details about what you want to pull, and the full management of pulling or pushing data, handling distributed state, failover, and monitoring is the responsibility of the RavenDB cluster.

Data Archiving

Data archiving is a new capability in RavenDB, aiming to help users who deal with very large amounts of data. While a business wants to retain all its data forever, it typically works primarily with recent data. That is typically the data from the last year or two. As time goes by, old data accumulates, and going through irrelevant data consumes computing and memory resources from the database.

RavenDB version 6.0 offers a way to mark a document with an attribute, marking when RavenDB should move that document to archive mode. Aside from telling RavenDB at what time we should archive a document, there is nothing further that you need to do.

When the time comes to archive the document, RavenDB will change the document mode, which will have a number of effects. The document itself is retained, it is not deleted. If you aren’t already using document compression, the archived document is stored in a compressed manner, to reduce disk usage further.

Subscriptions and indexes will ignore archive documents and unless explicitly requested, archived documents will remain on the disk and consume no memory resources. Archiving reduces the resources consumed by archived documents while keeping them available if you need to look them up.

You can also construct indexing that would operate over archived and regular documents as well if you still want to allow queries over archived data. In this way, you can retain your entire dataset in the main system of record but still operate a significantly smaller chunk of that during normal operations.

Performance, observability, and usability enhancements have also been on the menu for the version 6.0 release. There are far too many individual features and improvements to count them all in this format. I encourage you to check the full release details on the website.

You are probably well aware of the drastic changes that happened in the business environment in the last few years. As part of adjusting to this new reality, we are updating our pricing to allow us to keep delivering top-notch solutions and support to our valued customers.

The new pricing policy is effective starting from January 1st, 2024.

What does this mean to you right now? Absolutely nothing until January 1st, 2024. All subscription renewals in 2023 keep their current price point.

An Early Birds benefit for our existing customers: renew your 2024 subscription while locking your 2023 price point.
* Available for Cloud and on-premises Customers.

In the weeks ahead, we'll provide detailed updates about these changes and ensure a smooth transition. Our goal is to empower you to make optimized choices about your RavenDB subscription. Your satisfaction remains our priority.

Our team is ready to assist you. Please reach out to us at [email protected] for any questions you have.

Finally, I would like to thank the RavenDB team for their hard work in delivering this version.

Our goal for this release was to deliver you a lot more while making sure that you’ll not be exposed to the underlying complexity of the problems that we are solving for you.

I believe that as you try out the new release, you’ll see how successful we are in providing you with an excellent database platform to take your systems to new heights.

And with that, I am left only with encouraging you to try out the new version. And let us know what you think about it.

Oren Eini

Oren Eini

CEO of RavenDB

.NET Rocks: Data Sharding with Oren Eini

ChallengeFastest node selection metastable error state–answer

ChallengeFastest node selection metastable error state

RavenDB version 6.0 is now live

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed