Sharding by prefix



Why use prefixed sharding

Control over data placement: Prefixed sharding offers customized data distribution by allowing you to assign data to specific shards. This facilitates tailored data partitioning aligned with your business logic and enhances the organization of your data management.

Geographical Data Grouping: For applications serving users from different regions, storing data on a shard in a region near the end user is beneficial, as it can improve access speed and reduce latency. Additionally, laws and regulations may mandate that data be stored within specific geographical boundaries. Business requirements might also necessitate keeping data close to the audience to enhance query performance and user experience.

Optimized query performance: Prefixed Sharding eliminates the need to query all shards in the cluster by allowing queries to target specific shards containing the relevant documents. Grouping data on the same shard enhances query performance and reduces latency, particularly for regional operations.

Scalability: As your system grows and you add more servers, prefixed sharding simplifies managing increased data volumes and traffic. You can add servers and shards as needed, allowing for controlled scaling that maintains both performance and reliability.

Overall performance: By grouping related data on the same shard, prefixed sharding optimizes data storage and access patterns, reduces network latency for region-specific operations, and enhances overall system performance in distributed database environments.

Overview

Configure shards:

  • To store documents on a specific shard, define the target shard for each document ID prefix that you add.
    Documents with IDs matching any of the defined prefixes will be routed to the corresponding shard.
    Learn how to add prefixes below.

  • For example, you can define that all documents with an ID starting with users/us/ will be stored in shard #0.
    Consequently, the following sample documents will be stored in shard #0:

    • users/us/1
    • users/us/2
    • users/us/washington
    • users/us/california/company/department

Configure multiple shards:

  • You can assign multiple shards to the same prefix.

  • For example, both shard #1 and shard #2 can be assigned to prefix users/us/.
    In this case, any document with that prefix will be stored in either shard #1 or shard #2.

Prefix rules:

  • The maximum number of prefixes that can be defined is 4096.

  • The prefix string that you define must end with either the / character or with -.
    e.g. users/us/ or users/us-.

  • Prefixes are case-insensitive.
    RavenDB will treat /users/us/ and /users/US/ as equivalent document prefixes.
    Therefore, documents such as "/users/us/1" and "/users/US/2" will be routed according to the same rule.

  • RavenDB prioritizes the most specific prefix over more general ones.
    For example, if you configure /users/us/ to shard #0 and /users/us/florida/ to shard #1, then:

    • Document "/users/us/123" will be stored in shard #0.
    • Document "/users/us/florida/123" will be stored in shard #1, even though it also matches the /users/us/ prefix.

Bucket management

When you define a sharded database:
RavenDB reserves 1,048,576 buckets for the entire database. Each shard is assigned a range of buckets from this set. Any document added to the database is processed through a hashing algorithm, which determines the bucket number where the document will reside. The initial bucket distribution for a sharded database with 3 shards will be:

  • Buckets assigned to shard #0: [0 .. 349,524]
  • Buckets assigned to shard #1: [349,525 .. 699,049]
  • Buckets assigned to shard #2: [699,050 .. 1,048,575]

When you configure prefixes for sharding:
RavenDB will reserve an additional range of 1,048,576 buckets for each prefix you add, on top of the buckets already reserved. So now, if we add prefixes users/us/ for shard #0 and users/asia/ for shard #1, we get:

  • Additional buckets assigned to shard #0 [1048576 .. 2097151] for documents with prefix users/us/
  • Additional buckets assigned to shard #1 [2097152 .. 3,145,727] for documents with prefix users/asia/

When creating a new document with an ID that matches any of the defined prefixes -
the hashing algorithm is applied to the document ID, but the resulting bucket number is limited to the set of buckets reserved for that prefix, thereby routing the document to be stored in the chosen shard.

When creating a new document with an ID that does Not match any predefined prefix -
the resulting hashed bucket number could fall into any of the 3 shards.


The reserved buckets ranges are visible in the database record in the Studio.
Navigate to Settings > Database Record and expand the "Sharding > BucketRanges" property:

"Bucket ranges"

Bucket ranges across the shards

Adding prefixes via Studio

You can define prefixes when creating a sharded database via the Studio.

"Create a new database"

Create a new database

  1. From the database list view, click New database to create a new database.
  2. Enter a name for the new database.
  3. Click Next to proceed to the sharding settings on the next screen.

In this example, we define a sharded database with 3 shards, each having a replication factor of 2:

"Enable sharding"

Enable sharding

  1. Turn on the Enable Sharding toggle.
  2. Set the number of shards you want.
  3. Turn on the Add prefixes for shards toggle.
  4. Set the replication factor and other options as desired, and then click Next to proceed to define the prefixes.

Add the prefixes and specify their destination shards:

"|Define prefixes"

Define prefixes

  1. Enter a prefix. The prefix string must end with either / or -.
  2. Select the target shard. Multiple shards can be selected for the same prefix.
  3. Click Add prefix to add additional prefixes.
  4. Click Quick Create to complete the process using the default settings for the remaining configurations and create the new database. Or, click Next to configure additional settings.

New documents will be stored in the matching shards:

"Document list view"

Documents are stored in the requested shards

  1. Documents with prefix users/us/ are stored in shard #0.
  2. Documents with prefix users/asia/ are stored in shard #1.
  3. Documents with an ID that does Not match any prefix will be stored on any of the 3 shards.
    e.g. document users/uk/london/clients/1 is stored on shard #2 since no matching prefix was defined.

Adding prefixes via Client API

Using the Client API, you can add prefixes when creating the database or after database creation:

Add prefixes when creating a database

// Define the database record:
// ===========================
var databaseRecord = new DatabaseRecord
{
    // Provide a name for the new database
    DatabaseName = "SampleDB",

    // Set the sharding topology configuration
    // Here each shard will have a replication factor of 2 nodes
    Sharding = new ShardingConfiguration
    {
        Shards = new Dictionary<int, DatabaseTopology>
        {
            [0] = new() { Members = new List<string> { "A", "B" } },
            [1] = new() { Members = new List<string> { "A", "C" } },
            [2] = new() { Members = new List<string> { "C", "B" } }
        }
    }
};

// Define prefixes and their target shard/s:
// =========================================
databaseRecord.Sharding.Prefixed =
[
    new PrefixedShardingSetting
    {
        Prefix = "users/us/",
        // Assign a SINGLE shard for the prefix
        Shards = [0]
    },
    new PrefixedShardingSetting
    {
        Prefix = "users/asia/",
        // Can assign MULTIPLE shards for a prefix
        Shards = [1, 2]
    }
];

// Deploy the new database to the server: 
// ======================================
store.Maintenance.Server.Send(new CreateDatabaseOperation(databaseRecord));

// You can verify the sharding configuration that has been created:
// ================================================================
var record = store.Maintenance.Server.Send(new GetDatabaseRecordOperation(store.Database));

var shardingConfiguration = record.Sharding;
var numberOfShards = shardingConfiguration.Shards.Count;     // 3
var numberOfPrefixes = shardingConfiguration.Prefixed.Count; // 2
// Define the database record:
// ===========================
var databaseRecord = new DatabaseRecord
{
    // Provide a name for the new database
    DatabaseName = "SampleDB",

    // Set the sharding topology configuration
    // Here each shard will have a replication factor of 2 nodes
    Sharding = new ShardingConfiguration
    {
        Shards = new Dictionary<int, DatabaseTopology>
        {
            [0] = new() { Members = new List<string> { "A", "B" } },
            [1] = new() { Members = new List<string> { "A", "C" } },
            [2] = new() { Members = new List<string> { "C", "B" } }
        }
    }
};

// Define prefixes and their target shard/s:
// =========================================
databaseRecord.Sharding.Prefixed =
[
    new PrefixedShardingSetting
    {
        Prefix = "users/us/",
        // Assign a SINGLE shard for the prefix
        Shards = [0]
    },
    new PrefixedShardingSetting
    {
        Prefix = "users/asia/",
        // Can assign MULTIPLE shards for a prefix
        Shards = [1, 2]
    }
];

// Deploy the new database to the server: 
// ======================================
await store.Maintenance.Server.SendAsync(new CreateDatabaseOperation(databaseRecord));

// You can verify the sharding configuration that has been created:
// ================================================================
var record = await store.Maintenance.Server.SendAsync(new GetDatabaseRecordOperation(store.Database));

var shardingConfiguration = record.Sharding;
var numberOfShards = shardingConfiguration.Shards.Count;     // 3
var numberOfPrefixes = shardingConfiguration.Prefixed.Count; // 2

Add prefixes after database creation

  • Use AddPrefixedShardingSettingOperation to add a prefix to your sharding configuration after the database has been created.

  • In this case, you can only add prefixes that do not match any existing document IDs in the database.
    An exception will be thrown if a document with an ID that starts with the new prefix exists in the database.

// Define the prefix to add and its target shard/s
var shardingSetting = new PrefixedShardingSetting
{
    Prefix = "users/eu/",
    Shards = [2]
    // Can assign multiple shards, e.g.: Shards = [2, 3]
};

// Define the add operation:
var addPrefixOp = new AddPrefixedShardingSettingOperation(shardingSetting);

// Execute the operation by passing it to Maintenance.Send
store.Maintenance.Send(addPrefixOp);
// Define the prefix to add and its target shard/s
var shardingSetting = new PrefixedShardingSetting
{
    Prefix = "users/eu/",
    Shards = [2]
    // Can assign multiple shards, e.g.: Shards = [2, 3]
};

// Define the add operation:
var addPrefixOp = new AddPrefixedShardingSettingOperation(shardingSetting);

// Execute the operation by passing it to Maintenance.SendAsync
await store.Maintenance.SendAsync(addPrefixOp);

Removing prefixes

  • Use DeletePrefixedShardingSettingOperation to remove a prefix from your sharding configuration.

  • You can only delete prefixes that do not match any existing document IDs in the database.
    An exception will be thrown if a document with an ID that starts with the specified prefix exists in the database.

// Define the delete prefix operation,
// Pass the prefix string
var deletePrefixOp = new DeletePrefixedShardingSettingOperation("users/eu/");

// Execute the operation by passing it to Maintenance.Send
store.Maintenance.Send(deletePrefixOp);
// Define the delete prefix operation,
// Pass the prefix string
var deletePrefixOp = new DeletePrefixedShardingSettingOperation("users/eu/");

// Execute the operation by passing it to Maintenance.SendAsync
await store.Maintenance.SendAsync(deletePrefixOp);

Updating shard configurations for prefixes

  • Use UpdatePrefixedShardingSettingOperation to modify the shards assigned to an existing prefix.

  • Unlike when defining prefixes for the first time,
    the following rules must be observed when updating an existing prefix configuration:

    • When adding a shard to an existing prefix configuration,
      RavenDB does not automatically reallocate buckets to the newly added shard.
      Therefore, after assigning a new shard to a prefix, manual bucket re-sharding is required.
      You must manually move some buckets initially reserved for this prefix from the existing shards to the new shard; otherwise, documents matching the prefix will not be stored on the added shard.

    • When removing a shard from an existing prefix configuration,
      you must first manually move the buckets from the removed shard to the other shards that are assigned to this prefix.

  • In the below example, in addition to shard #2 that was configured in the previous example,
    we are adding shard #0 as a destination for documents with prefix users/eu/.

// Define the shards configuration you wish to update for the specified prefix
var shardingSetting = new PrefixedShardingSetting
{
    Prefix = "users/eu/",
    // Adding shard #0 to the previous prefix configuration
    Shards = [0, 2]
};

// Define the update operation:
var updatePrefixOp = new UpdatePrefixedShardingSettingOperation(shardingSetting);

// Execute the operation by passing it to Maintenance.Send
store.Maintenance.Send(updatePrefixOp);
// Define the shards configuration you wish to update for the specified prefix
var shardingSetting = new PrefixedShardingSetting
{
    Prefix = "users/eu/",
    // Adding shard #0 to the previous prefix configuration
    Shards = [0, 2]
};

// Define the update operation:
var updatePrefixOp = new UpdatePrefixedShardingSettingOperation(shardingSetting);

// Execute the operation by passing it to Maintenance.SendAsync
await store.Maintenance.SendAsync(updatePrefixOp);

Querying selected shards by prefix

  • Storing documents on specific shards allows you to query only those shards directly,
    avoiding unnecessary trips to other shards by the orchestrator.

  • Use method ShardContext together with ByPrefix or ByPrefixes to specify which shard/s to query.
    Only the shard/s assigned the specified prefix/prefixes will be queried.
    Note: An exception will be thrown if the specified prefix is not already defined in the database.

  • The provided prefix does not need to match the prefix of any specific document you are querying;
    it is just used to determine which shard(s) to query.
    (Remember that a single prefix can be assigned to multiple shards.)

  • You can designate which shard/s to query by combining both selecting a shard by prefix
    and selecting a shard by document ID.

  • See the following examples:

Query a selected shard by prefix:

// Query for 'User' documents from shard/s assigned to a specific prefix:
// ======================================================================
var userDocs = session.Query<User>()
     // Call 'ShardContext' to select which shard/s to query
     // RavenDB will query only the shard/s assigned to prefix 'users/'
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
     // The query predicate
    .Where(x => x.Name == "Joe")
    .ToList();

// Variable 'userDocs' will include all documents of type 'User'
// that match the query predicate and reside on the shard/s assigned to prefix 'users/'.

// Query for 'Company' documents from shard/s assigned to a specific prefix:
// =========================================================================
var companyDocs = session.Query<Company>()
     // This example shows that the prefix doesn't need to match the document type queried 
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
    .Where(x => x.Address.Country == "US")
    .ToList();

// Variable 'companyDocs' will include all documents of type 'Company'
// that match the query predicate and reside on the shard/s assigned to prefix 'users/'.

// Query for ALL documents from shard/s assigned to a specific prefix:
// ===================================================================
var allDocs = session.Query<object>() // query with <object>
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
    .ToList();

// Variable 'allDocs' will include ALL documents that reside on
// the shard/s assigned to prefix 'users/'.
// Query for 'User' documents from shard/s assigned to a specific prefix:
// ======================================================================
var userDocs = await asyncSession.Query<User>()
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
    .Where(x => x.Name == "Joe")
    .ToListAsync();

// Query for 'Company' documents from shard/s assigned to a specific prefix:
// =========================================================================
var companyDocs = await asyncSession.Query<Company>()
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
    .Where(x => x.Address.Country == "US")
    .ToListAsync();

// Query for ALL documents from shard/s assigned to a specific prefix:
// ===================================================================
var allDocs = await asyncSession.Query<object>()
    .Customize(x => x.ShardContext(s => s.ByPrefix("users/")))
    .ToListAsync();
// Query for 'User' documents from shard/s assigned to a specific prefix:
// ======================================================================
var userDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => s.ByPrefix("users/"))
    .WhereEquals(x => x.Name, "Joe")
    .ToList();

// Query for 'Company' documents from shard/s assigned to a specific prefix:
// =========================================================================
var companyDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => s.ByPrefix("users/"))
    .WhereEquals(x => x.Address.Country, "US")
    .ToList();

// Query for ALL documents from shard/s assigned to a specific prefix:
// ===================================================================
var allDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => s.ByPrefix("users/"))
    .ToList();
// Query for 'User' documents from shard/s assigned to a specific prefix:
// ======================================================================
from "Users"
where Name == "Joe"
{ "__shardContext" : { "Prefixes": ["users/"] }}

// Query for 'Company' documents from shard/s assigned to a specific prefix:
// =========================================================================
from "Companies"
where Address.Country == "US"
{ "__shardContext" : { "Prefixes": ["users/"] }}

// Query for ALL documents from shard/s assigned to a specific prefix:
// ===================================================================
from @all_docs
where Address.Country == "US"
{ "__shardContext" : { "Prefixes": ["users/"] }}

Query selected shards by prefix:

// Query for 'User' documents from shard/s assigned to the specified prefixes:
// ===========================================================================
var userDocs = session.Query<User>()
    // Call 'ShardContext' to select which shard/s to query
    // RavenDB will query only the shard/s assigned to prefixes 'users/us/' or 'users/asia/'
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    // The query predicate
    .Where(x => x.Name == "Joe")
    .ToList();

// Variable 'userDocs' will include all documents of type 'User'
// that match the query predicate and reside on the shard/s
// assigned to prefix 'users/us/' or prefix 'users/asia/'.

// Query for 'Company' documents from shard/s assigned to the specified prefixes:
// ==============================================================================
var companyDocs = session.Query<Company>()
     // This example shows that the prefixes don't need to match the document type queried 
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    .Where(x => x.Address.Country == "US")
    .ToList();

// Variable 'companyDocs' will include all documents of type 'Company'
// that match the query predicate and reside on the shard/s
// assigned to prefix 'users/us/' or prefix 'users/asia/'.

// Query for ALL documents from shard/s assigned to the specified prefixes:
// ========================================================================
var allDocs = session.Query<object>() // query with <object>
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    .ToList();

// Variable 'allDocs' will include all documents reside on the shard/s
// assigned to prefix 'users/us/' or prefix 'users/asia/'.
// Query for 'User' documents from shard/s assigned to the specified prefixes:
// ===========================================================================
var userDocs = await asyncSession.Query<User>()
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    .Where(x => x.Name == "Joe")
    .ToListAsync();

// Query for 'Company' documents from shard/s assigned to the specified prefixes:
// ==============================================================================
var companyDocs = await asyncSession.Query<Company>()
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    .Where(x => x.Address.Country == "US")
    .ToListAsync();

// Query for ALL documents from shard/s assigned to the specified prefixes:
// ========================================================================
var allDocs = await asyncSession.Query<object>()
    .Customize(x => x.ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"])))
    .ToListAsync();
// Query for 'User' documents from shard/s assigned to the specified prefixes:
// ===========================================================================
var userDocs = session.Advanced.DocumentQuery<User>()
    .ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"]))
    .WhereEquals(x => x.Name, "Joe")
    .ToList();

// Query for 'Company' documents from shard/s assigned to the specified prefixes:
// ==============================================================================
var companyDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"]))
    .WhereEquals(x => x.Address.Country, "US")
    .ToList();

// Query for ALL documents from shard/s assigned to the specified prefixes:
// ========================================================================
var allDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => s.ByPrefixes(["users/us/", "users/asia/"]))
    .ToList();
// Query for 'User' documents from shard/s assigned to the specified prefixes:
// ===========================================================================
from "Users"
where Name == "Joe"
{ "__shardContext": { "Prefixes": ["users/us/", "users/asia/"] }}

// Query for 'Company' documents from shard/s assigned to the specified prefixes:
// ==============================================================================
from "Companies"
where Address.Country == "US"
{ "__shardContext": { "Prefixes": ["users/us/", "users/asia/"] }}

// Query for ALL documents from shard/s assigned to the specified prefixes:
// ========================================================================
from @all_docs
where Address.Country == "US"
{ "__shardContext": { "Prefixes": ["users/us/", "users/asia/"] }}

Query selected shard by prefix and by document ID:

// Query for 'Company' documents from shard/s assigned to a prefix & by document ID:
// =================================================================================
var companyDocs = session.Query<Company>()
    .Customize(x => x.ShardContext(s => 
        s.ByPrefix("users/us/").ByDocumentId("companies/1")))
    .Where(x => x.Address.Country == "US")
    .ToList();

// Variable 'companyDocs' will include all documents of type 'Company'
// that match the query predicate and reside on:
// * the shard/s assigned to prefix 'users/us/'
// * or the shard containing document 'companies/1'.
// Query for 'Company' documents from shard/s assigned to a prefix & by document ID:
// =================================================================================
var companyDocs = await asyncSession.Query<Company>()
    .Customize(x => x.ShardContext(s => 
        s.ByPrefix("users/us/").ByDocumentId("companies/1")))
    .Where(x => x.Address.Country == "US")
    .ToListAsync();
// Query for 'Company' documents from shard/s assigned to a prefix & by document ID:
// =================================================================================
var companyDocs = session.Advanced.DocumentQuery<Company>()
    .ShardContext(s => 
        s.ByPrefix("users/us/").ByDocumentId("companies/1"))
    .WhereEquals(x => x.Address.Country, "US")
    .ToList();
// Query for 'Company' documents from shard/s assigned to a prefix & by document ID:
// =================================================================================
from "Companies"
where Address.Country == "US"
{ "__shardContext": { "DocumentIds": ["companies/1"], "Prefixes": ["users/us/"] }}

Prefixed sharding vs Anchoring documents

Anchoring documents:
With anchoring documents, you can ensure that designated documents will be stored in the same bucket
(and therefore the same shard), but you cannot specify which exact bucket it will be.
So documents can be grouped within the same shard, but the exact shard cannot be controlled.

Prefixed sharding:
With prefixed sharding, you control which shard a document is stored in.
However, you cannot specify the exact bucket within that shard.

Applying both methods:
When both methods are applied, prefixed sharding takes precedence and overrides anchoring documents.
For example:

Given:

  • Using prefixed sharding, you assign shard #0 to store all documents with prefix users/us/.
  • Your database already includes a document with ID companies/456 stored in shard #1.

Now:

  • Using the anchoring documents technique, you create document users/us/123$companies/456,
    with the intention that both users/us/123 and companies/456 will reside in the same bucket.
  • In which shard will this new document be stored ?

The result:

  • Since prefixed sharding is prioritized, this new document will be stored in shard #0,
    even though the suffix suggests it should be stored in shard #1, where the Companies document resides.