NHibernateStreaming large result sets

time to read 6 min | 1122 words

Note: I am not feeling very well for the past week or so, which is why I am posting so rarely.

NHibernate is meant to be used in an OLTP system, as such, it is usually used in cases where we want to load a relatively small amount of data from the database, work with it and save it back. For reporting scenarios, there are better alternatives, usually (and before you ask, any reporting package will do. Right tool for the job, etc).

But there are cases where you want to do use NHibernate in reporting scenarios nonetheless. Maybe because the reporting requirements aren’t enough to justify going to a separate tool, or because you want to use what you already know. It is in those cases where you tend to run into problems, because you violate the assumptions that were made while building NHibernate.

Let us imagine the following use case, we want to print a list of book names to the user:

using (ISession s = OpenSession())
{
    var books = s.CreateQuery("from Book")
        .List<Book>();

    foreach (var book in books)
    {
        Console.WriteLine(book.Name);
    }
}

There are several problems here:

  • We query on a large result set without a limit clause.
  • We read a lot of data into memory.
  • We only start processing the data after it was completely read from the database.

What I would like to see is something like this:

while(dataReader.Read())
{
     Console.WriteLine(dataReader.GetString("Name"));
}

This still suffer from the problem of reading a large result set, but we will consider this a part of our requirements, so we’ll just have to live with it. The data reader code has two major advantages, it uses very little memory, and we can start processing the data as soon as the first row loads from the database.

How can we replicate that with NHibernate?

Well, as usual with NHibernate, it is only a matter of finding the right extension point. In this case, the List method on the query also has an overload that accepts an IList parameter:

image

That make it as simple as implementing our own IList implementation:

public class ActionableList<T> : IList
{
    private Action<T> action;

    public ActionableList(Action<T> action)
    {
        this.action = action;
    }

    public int Add(object value)
    {
        action((T)value);
        return -1;
    }

    public bool Contains(object value)
    {
        throw new NotImplementedException();
    }

    // ...
}

And now we can call it:

using (ISession s = OpenSession())
{
    var books = new ActionableList<Book>(book => Console.WriteLine(book.Name));
    s.CreateQuery("from Book")
        .List(books);

}

This will have the exact same effect as the pervious NHibernate code, but it will start printing the results as soon as the first result loads from the database. We still have the problem of memory consumption, though. The session will keep track of all the loaded objects, and if we load a lot of data, it will eventually blow out with an out of memory exception.

Luckily, NHibernate has a ready made solution for this, the stateless session. The code now looks like this:

using (IStatelessSession s = sessionFactory.OpenStatelessSession())
{
    var books = new ActionableList<Book>(book => Console.WriteLine(book.Name));
    s.CreateQuery("from Book")
        .List(books);

}

The stateless session, unlike the normal NHibernate session, doesn’t keep track of loaded objects, so the code here and the data reader code are essentially the same thing.

More posts in "NHibernate" series:

  1. (19 Nov 2010) Complex relationships
  2. (27 Jun 2010) Streaming large result sets
  3. (25 May 2009) Why do we need to specify the query type twice?
  4. (20 Mar 2009) Avoid identity generator when possible
  5. (26 Mar 2007) Nullable DateTime Issues
  6. (08 Jan 2007) Fetching multiply collections in one roundtrip