In my previous post, we dealt with how to model Auctions and Products. This time, we are going to look at how to model Bids.
Before we can do that, we need to figure out how we are going to use them. As I mentioned, I am going to use eBay as the source for “application mockups”. So I went to eBay and took a couple of screenshots.
Here is the actual auction page:
And here is the actual bids page.
This tells us several things:
- Bids aren’t really needed on the main auction page.
- There is a strong likelihood that the number of bids is going to be small for most items (fewer than a thousand).
- Even for items with a lot of bids, we only care about the most recent ones for the most part.
This is the Auction document as we have last seen it:
    {
      "Quantity": 15,
      "Product": {
        "Name": "Flying Monkey Doll",
        "Colors": [ "Blue & Green" ],
        "Price": 29,
        "Weight": 0.23
      },
      "StartsAt": "2011-09-01",
      "EndsAt": "2011-09-15"
    }
The question is: where do we put the Bids? One easy option would be to put all the Bids inside the Auction document, like so:
    {
      "Quantity": 15,
      "Product": {
        "Name": "Flying Monkey Doll",
        "Colors": [ "Blue & Green" ],
        "Price": 29,
        "Weight": 0.23
      },
      "StartsAt": "2011-09-01",
      "EndsAt": "2011-09-15",
      "Bids": [
        { "Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20" }
      ]
    }
The problem with this approach is that we are now forced to load the Bids whenever we load the Auction, while the main scenario needs only the Auction details, not every Bid. In fact, all we need there is the count of Bids and the Winning Bid. This approach also fails to properly handle a High Interest Auction, one that has a lot of Bids.
That leaves us with a few options. One of them is to decide that we don’t really care about Bids and Auctions as a time-sensitive matter. As long as we are accepting Bids, we don’t really need to give you immediate feedback. Indeed, this is how most Auction sites work. They give you a cached view of the data, refreshing it every 30 seconds or so. The idea is to reduce the cost of actually accepting a new Bid to the minimum necessary. Once the Auction is closed, we can figure out who actually won and notify them.
A good design for this scenario would be a separate Bid document for each Bid, and a map/reduce index to get the Winning Bid Amount and Bid Count. Something like this:
    { "Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20", "Auction": "auctions/1234" }
    { "Bidder": "bidders/234", "Amount": 0.15, "At": "2011-09-08T12:21", "Auction": "auctions/1234" }
    { "Bidder": "bidders/123", "Amount": 0.2, "At": "2011-09-08T12:22", "Auction": "auctions/1234" }
And the index:
    // Map: emit one entry per Bid document
    from bid in docs.Bids
    select new { Count = 1, bid.Amount, bid.Auction }

    // Reduce: aggregate the entries per Auction
    from result in results
    group result by result.Auction into g
    select new
    {
        Count = g.Sum(x => x.Count),
        Amount = g.Max(x => x.Amount),
        Auction = g.Key
    }
As you can imagine, because RavenDB updates its indexes in the background, we can cheaply insert new Bids without having to wait for the indexing to complete. And we can always display the last calculated value for the Auction, along with the time as of which that value is accurate.
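To make it concrete what that index computes, here is a minimal in-memory sketch that runs the same map and reduce steps with plain LINQ over the three Bid documents above. This is not how RavenDB executes an index (it does that incrementally and in the background); the `Bid` and `AuctionStat` records and the `AuctionStats.Compute` helper are names I made up for the illustration, and the code assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The three Bid documents from above, as plain in-memory objects.
var sampleBids = new[]
{
    new Bid("bidders/123", 0.10m, "auctions/1234"),
    new Bid("bidders/234", 0.15m, "auctions/1234"),
    new Bid("bidders/123", 0.20m, "auctions/1234"),
};

foreach (var stat in AuctionStats.Compute(sampleBids))
    Console.WriteLine($"{stat.Auction}: {stat.Count} bids, highest amount {stat.Amount}");

// Hypothetical types, for illustration only.
record Bid(string Bidder, decimal Amount, string Auction);
record AuctionStat(string Auction, int Count, decimal Amount);

static class AuctionStats
{
    public static List<AuctionStat> Compute(IEnumerable<Bid> bids) =>
        bids
            // Map: one entry per bid, each counting as 1.
            .Select(bid => new { Count = 1, bid.Amount, bid.Auction })
            // Reduce: group by auction, sum the counts, keep the highest amount.
            .GroupBy(result => result.Auction)
            .Select(g => new AuctionStat(g.Key, g.Sum(x => x.Count), g.Max(x => x.Amount)))
            .ToList();
}
```

The result has one entry per Auction, which is exactly the shape we want to show on the auction page: a Bid Count and the current Winning Bid Amount.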
That is one model for an Auction site, but another one would be a much stricter scenario, where you can’t just accept any Bid. It might be a system where you are charged per bid, so accepting a bid that is known to be invalid (because you were outbid in the meantime) is not allowed. How would we build such a system? We could still use the previous design and just defer the actual billing to a later stage, but let us assume that this is a hard constraint on the system.
In this case, we can’t rely on the indexes, because we need immediately consistent information, and we need it to be cheap. With RavenDB, we have the document store, which is fully ACID. So we can store all of the Bids for an Auction in a single document:
    {
      "Auction": "auctions/1234",
      "Bids": [
        { "Bidder": "bidders/123", "Amount": 0.1, "At": "2011-09-08T12:20", "Auction": "auctions/1234" },
        { "Bidder": "bidders/234", "Amount": 0.15, "At": "2011-09-08T12:21", "Auction": "auctions/1234" },
        { "Bidder": "bidders/123", "Amount": 0.2, "At": "2011-09-08T12:22", "Auction": "auctions/1234" }
      ]
    }
And we modify the Auction document to be:
    {
      "Quantity": 15,
      "Product": {
        "Name": "Flying Monkey Doll",
        "Colors": [ "Blue & Green" ],
        "Price": 29,
        "Weight": 0.23
      },
      "StartsAt": "2011-09-01",
      "EndsAt": "2011-09-15",
      "WinningBidAmount": 0.2,
      "BidsCount": 3
    }
Adding the BidsCount and WinningBidAmount to the Auction means that we can very cheaply show them to the users. Because RavenDB is transactional, we can actually do it like this:
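    using (var session = store.OpenSession())
    {
        // Fail SaveChanges if either document was changed by someone else in the meantime
        session.Advanced.UseOptimisticConcurrency = true;

        var auction = session.Load<Auction>("auctions/1234");
        var bids = session.Load<Bids>("auctions/1234/bids");

        bids.AddNewBid(bidder, amount);   // throws if the bid isn't the highest
        auction.UpdateStatsFrom(bids);    // refresh WinningBidAmount and BidsCount

        session.SaveChanges();            // both documents are saved in one transaction
    }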
We are now guaranteed that this will either succeed completely (and we have a new winning bid), or fail utterly, leaving no trace. Note that AddNewBid will reject a bid that isn’t the highest (by throwing an exception), and if we have two concurrent modifications, RavenDB will throw on that. Both the Auction and its Bids are treated as a single transactional unit, just the way they should be.
The final question is how to handle a High Interest Auction, one that gathers a lot of bids. We didn’t worry about it in the previous model, because that was left for RavenDB to handle. In this case, since we are using a single document for the Bids, we need to take care of that ourselves. There are a few things that we need to consider here:
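I didn’t show AddNewBid and UpdateStatsFrom, so here is a minimal sketch of what they might look like, with no RavenDB involved. Treat every detail as an assumption: the class shapes, the `Items` property (C# doesn’t allow a member named `Bids` inside a class named `Bids`), the choice of InvalidOperationException, and the rule that a new bid must be strictly higher than the current highest one. It assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var auction = new Auction();
var bids = new Bids { Auction = "auctions/1234" };

bids.AddNewBid("bidders/123", 0.10m);
bids.AddNewBid("bidders/234", 0.15m);
auction.UpdateStatsFrom(bids);
Console.WriteLine($"{auction.BidsCount} bids, winning amount {auction.WinningBidAmount}");

try
{
    bids.AddNewBid("bidders/123", 0.12m); // lower than the current highest bid
}
catch (InvalidOperationException e)
{
    Console.WriteLine($"Rejected: {e.Message}");
}

// Hypothetical classes, for illustration only.
class Bid
{
    public string Bidder { get; set; } = "";
    public decimal Amount { get; set; }
    public DateTime At { get; set; }
}

class Bids
{
    public string Auction { get; set; } = "";
    public List<Bid> Items { get; set; } = new List<Bid>();

    public void AddNewBid(string bidder, decimal amount)
    {
        var highest = Items.Count == 0 ? 0m : Items.Max(b => b.Amount);
        if (amount <= highest)
            throw new InvalidOperationException(
                $"A bid of {amount} does not beat the current highest bid of {highest}.");
        Items.Add(new Bid { Bidder = bidder, Amount = amount, At = DateTime.UtcNow });
    }
}

class Auction
{
    public decimal WinningBidAmount { get; set; }
    public int BidsCount { get; set; }

    // Copy the numbers the auction page needs from the Bids document.
    public void UpdateStatsFrom(Bids bids)
    {
        BidsCount = bids.Items.Count;
        WinningBidAmount = bids.Items.Count == 0 ? 0m : bids.Items.Max(b => b.Amount);
    }
}
```

The important part is that the validation runs against the Bids document we just loaded, so the check and the write happen inside the same unit of work.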
- Bids that lost are usually of little interest.
- We probably need to keep them around, just in case, nevertheless.
Therefore, we will implement splitting for the Bids document. What does this mean?
Whenever the number of Bids in the Bids document reaches 500, we split the document. We take the oldest 250 Bids and move them to a Historical Bids document, and then we save.
That way, we have a set of historical documents with 250 Bids each that no one is ever likely to read, but that we need to keep, and we have the main Bids document, which contains the most recent (and relevant) Bids. A High Interest Auction might end up looking like:
- auctions/1234 <- Auction document
- auctions/1234/bids <- Bids document
- auctions/1234/bids/1 <- historical bids #1
- auctions/1234/bids/2 <- historical bids #2
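The splitting rule can be sketched as follows. This is only an in-memory illustration under my own assumptions: bids are appended oldest first, the `HistoricalCount` counter on the Bids document is something I invented to generate the next historical id, and the dictionary stands in for the database. In the real thing, the main and historical documents would be stored in the same session so they are saved together. It assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var current = new Bids { Id = "auctions/1234/bids" };
var database = new Dictionary<string, Bids>(); // stand-in for the document store

for (var i = 1; i <= 900; i++)
{
    current.Items.Add(new Bid("bidders/" + i, i * 0.01m));
    var historical = BidSplitter.SplitIfNeeded(current);
    if (historical != null)
        database[historical.Id] = historical;
}
database[current.Id] = current;

foreach (var stored in database.Values)
    Console.WriteLine($"{stored.Id}: {stored.Items.Count} bids");

// Hypothetical types, for illustration only.
record Bid(string Bidder, decimal Amount);

class Bids
{
    public string Id { get; set; } = "";
    public int HistoricalCount { get; set; } // how many historical documents exist so far
    public List<Bid> Items { get; set; } = new List<Bid>();
}

static class BidSplitter
{
    public const int SplitThreshold = 500; // split once the main document holds this many bids
    public const int MoveCount = 250;      // how many of the oldest bids to move out

    // Returns the new historical document if a split happened, otherwise null.
    public static Bids? SplitIfNeeded(Bids current)
    {
        if (current.Items.Count < SplitThreshold)
            return null;

        current.HistoricalCount++;
        var historical = new Bids
        {
            Id = $"{current.Id}/{current.HistoricalCount}",
            Items = current.Items.Take(MoveCount).ToList() // oldest bids come first
        };
        current.Items.RemoveRange(0, MoveCount);
        return historical;
    }
}
```

Because a split only ever moves out the oldest Bids, the main Bids document always keeps the current winning bid, so the validation of new bids doesn’t need to touch the historical documents.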
And that is enough for now, I think. This post went on a little longer than I intended, but hopefully I was able to explain both the final design decisions and the process used to reach them.
Thoughts?