Modeling hierarchical structures in RavenDB

time to read 5 min | 829 words

The question pops up frequently enough and is interesting enough for a post. How do you store a data structure like this in Raven?

The problem here is that we don’t have enough information about the problem to actually give an answer. That is because when we think of how we should model the data, we also need to consider how it is going to be accessed. In more precise terms, we need to define what is the aggregate root of the data in question.

Let us take the following two examples:

image image

As you can imagine, a Person is an aggregate root. It can stand on its own. I would typically store a Person in Raven using one of two approaches:

Bare references Denormalized References
{
  "Name": "Ayende",
  "Email": "[email protected]",
  "Parent": "people/18",
  "Children": [
        "people/59",
        "people/29"
  ]
}
{
  "Name": "Ayende",
  "Email": "[email protected]",
  "Parent": { "Name": "Oren", "Id": "people/18"},
  "Children": [
        { "Name": "Raven", "Id": "people/59"},
        { "Name": "Rhino", "Id": "people/29"}
  ]
}

The first option is bare references, just holding the id of the associated document. This is useful if I only need to reference the data very rarely. If, however, (as is common), I need to also show some data from the associated documents, it is generally better to use denormalized references, which keep the data that we need to deal with from the associated document embedded inside the aggregate.

But the same approach wouldn’t work for Questions. In the Question model, we have utilized the same data structure to hold both the question and the answer. This sort of double utilization is pretty common, unfortunately. For example, you can see it being used in StackOverflow, where both Questions & Answers are stored as posts.

The problem from a design perspective is that in this case a Question is not a root aggregate in the same sense that a Person is. A Question is a root aggregate if it is an actual question, not if it is a Question instance that holds the answer to another question. I would model this using:

{
   "Content": "How to model relations in RavenDB?",
   "User": "users/1738",
   "Answers" : [
      {"Content": "You can use.. ", "User": "users/92" },
      {"Content": "Or you might...", "User": "users/94" },
   ]
}

In this case, we are embedding the children directly inside the root document.

So I am afraid that the answer to that question is: it depends.