It’s very common to model your backend API as a set of endpoints that mirror your internal data model. For example, consider a blog engine, which may have:
- GET /users/{id}: retrieves information about a specific user, where {id} is the ID of the user
- GET /users/{id}/posts: retrieves a list of all posts made by a specific user, where {id} is the ID of the user
- POST /users/{id}/posts: creates a new post for a specific user, where {id} is the ID of the user
- GET /posts/{id}/comments: retrieves a list of all comments for a specific post, where {id} is the ID of the post
- POST /posts/{id}/comments: creates a new comment for a specific post, where {id} is the ID of the post
This mirrors the internal structure pretty closely, and it is very likely that you’ll end up with an API similar to this if you start writing a blog backend. It represents the usual set of operations clearly and simply.
The problem is that the blog example is so attractive precisely because it is inherently limited. There isn’t really that much going on in a blog from a data modeling perspective. Let’s consider a restaurant and what its API would look like:
- GET /menu: Retrieves the restaurant's menu
- POST /orders: Creates a new order
- POST /orders/{order_id}/items: Adds items to an existing order
- POST /payments: Allows the customer to pay their bill using a credit card
This looks okay, right?
We sit at a table, grab the menu and start ordering. From a REST perspective, we need to take into account that multiple users may add items to the same order concurrently.
That matters, because we may have bundles to take into account. John ordered the salad & juice and Jane the omelet, and Derek just got coffee. But coffee is already included in Jane’s order, so no separate charge for that. Here is what this will look like:
┌────┐┌────┐┌─────┐┌──────────────────────┐
│John││Jane││Derek││POST /orders/234/items│
└─┬──┘└─┬──┘└──┬──┘└─────┬────────────────┘
  │     │      │         │
  │   Salad & Juice      │
  │─────────────────────>│
  │     │      │         │
  │     │    Omelet      │
  │     │───────────────>│
  │     │      │         │
  │     │      │ Coffee  │
  │     │      │────────>│
The actual record we have in the end, on the other hand, looks like:
- Salad & Juice
- Omelet & Coffee
In this case, we want the concurrent nature of separate requests, since each user will be ordering at the same time, but the end result should be the final tally, not just an aggregation of the separate totals.
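To make the difference between a final tally and a plain aggregation concrete, here is a minimal sketch. The prices and the single omelet-plus-coffee bundle are invented for illustration, and `naive_total` and `final_tally` are hypothetical helper names, not part of any real system.

```python
from collections import Counter

# Hypothetical prices and bundle, invented for illustration only.
PRICES = {"salad": 8.0, "juice": 3.0, "omelet": 9.0, "coffee": 2.5}
# Each bundle is a set of items sold together at one price.
# Here the omelet already includes coffee, so the pair costs the omelet price.
BUNDLES = [({"omelet", "coffee"}, 9.0)]


def naive_total(items):
    """Sum each item independently, ignoring bundles."""
    return sum(PRICES[item] for item in items)


def final_tally(items):
    """Price the order as a whole, applying each bundle as often as it fits."""
    remaining = Counter(items)
    total = 0.0
    for bundle_items, bundle_price in BUNDLES:
        # How many complete bundles can be formed from the remaining items?
        count = min(remaining[item] for item in bundle_items)
        for item in bundle_items:
            remaining[item] -= count
        total += count * bundle_price
    # Everything not absorbed by a bundle is charged at its list price.
    total += sum(PRICES[item] * qty for item, qty in remaining.items())
    return total


# John, Jane and Derek each added their items through separate requests.
order_items = ["salad", "juice", "omelet", "coffee"]
print(naive_total(order_items), final_tally(order_items))
```

The point of the sketch is that the tally can only be computed over the whole order, which is why it matters who added what and when the computation happens.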
In the same sense, how would we handle payments? Can we do that in the same manner?
┌────┐┌────┐┌─────┐┌──────────────────┐
│John││Jane││Derek││POST /payments/234│
└─┬──┘└─┬──┘└──┬──┘└────────┬─────────┘
  │     │      │            │
  │     │     $10           │
  │────────────────────────>│
  │     │      │            │
  │     │      │ $10        │
  │     │──────────────────>│
  │     │      │            │
  │     │      │    $10     │
  │     │      │───────────>│
In this case, however, we are in a very different state. What happens in this scenario if one of those charges is declined? What happens if they pay too much? What happens if there is a concurrent request to add an item to the order while the payment is underway?
When you have separate operations, you have to somehow manage all of that. Maybe you use a distributed transaction coordinator, maybe you trust the operator, or maybe you get by on dumb luck, for a while. But this is actually an incredibly complex topic. And a lot of that complexity isn’t inherent to the problem itself, but comes from how we modeled the interaction with the server.
Here is the life cycle of an order:
- POST /orders: Creates a new order – returns the new order id
- ** POST /orders/{order_id}/items: Adds / removes items to an existing order
- ** POST /orders/{order_id}/submit: Submits all pending order items to the kitchen
- POST /orders/{order_id}/bill: Close the order, compute the total charge
- POST /payments/{order_id}: Handle the actual payment (or payments)
I have marked with ** the two endpoints that may be called multiple times. Everything else can only be called once.
Consider the transactional behavior around this sort of interaction. Adding / removing items from the order can be done concurrently. But submitting the pending items to the kitchen is a boundary: a concurrent item addition is either included in that submission (if it happened before the submission) or not (in which case it is simply added to the pending items for the next one).
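A minimal in-memory sketch of that boundary follows. The `Order` class and its method names are hypothetical, a single lock stands in for whatever concurrency control a real backend would use, and the rule that billing requires no unsubmitted items is an assumption of this sketch.

```python
import threading


class Order:
    """In-memory sketch of one order; all names here are hypothetical."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pending = []     # items added since the last submit
        self.submitted = []   # items already sent to the kitchen
        self.billed = False

    def add_item(self, item):
        # Safe to call concurrently from many diners.
        with self._lock:
            if self.billed:
                raise RuntimeError("order is closed")
            self.pending.append(item)

    def submit(self):
        # The boundary: atomically take whatever is pending right now.
        # An addition racing with this call lands either in this batch
        # or in the pending list for the next submit, never in between.
        with self._lock:
            if self.billed:
                raise RuntimeError("order is closed")
            batch, self.pending = self.pending, []
            self.submitted.extend(batch)
            return batch

    def bill(self):
        # Callable once; afterwards the order accepts no more changes.
        with self._lock:
            if self.billed:
                raise RuntimeError("order already billed")
            # Assumption of this sketch: everything must be submitted first.
            if self.pending:
                raise RuntimeError("unsubmitted items remain")
            self.billed = True
            return list(self.submitted)
```

Adding items and submitting can be repeated as often as needed, while `bill` flips the order into a closed state exactly once.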
We are also not going to make any determination about the offers / options that were selected by the diners until they actually move to the payment portion. Even the payment itself is handled via two interactions. First, we ask for the bill for the order. This is the point at which we’ll compute the order and figure out which bundles, discounts, etc. apply. The result of that call is the final tally. Second, we have the call to actually handle the payment. Note that this is one call, and the idea is that its content is going to be something like the following:
{
  "order_id": "789",
  "total": 30.0,
  "payments": [
    {
      "amount": 15.0,
      "payment_method": "credit_card",
      "card_number": "****-****-****-3456",
      "expiration_date": "12/22",
      "cvv": "123"
    },
    {
      "amount": 10.0,
      "payment_method": "cash"
    },
    {
      "amount": 5.0,
      "payment_method": "credit_card",
      "card_number": "****-****-****-5678",
      "expiration_date": "12/23",
      "cvv": "456"
    }
  ]
}
The idea is that by submitting it all at once, we are removing a lot of complexity from the backend. We don’t need to worry about complex interactions, race conditions, etc. We can deal with just the issue of handling the payment, which is complicated enough on its own. There is no need to borrow trouble.
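Because the whole payment arrives as one request, the backend can check it for consistency before charging anything. The sketch below assumes the request has already been parsed into a dictionary shaped like the JSON above; `validate_submission` is a hypothetical helper, and comparing in whole cents is a choice made for this example.

```python
def validate_submission(submission, billed_total):
    """Return a list of problems; an empty list means the request is consistent."""

    def to_cents(amount):
        # Compare in whole cents to avoid floating-point surprises.
        return round(amount * 100)

    problems = []
    if to_cents(submission["total"]) != to_cents(billed_total):
        problems.append("total does not match the bill")
    paid = sum(to_cents(p["amount"]) for p in submission["payments"])
    if paid != to_cents(submission["total"]):
        problems.append(
            f"payments add up to {paid / 100:.2f}, "
            f"expected {submission['total']:.2f}"
        )
    return problems


submission = {
    "order_id": "789",
    "total": 30.0,
    "payments": [
        {"amount": 15.0, "payment_method": "credit_card"},
        {"amount": 10.0, "payment_method": "cash"},
        {"amount": 5.0, "payment_method": "credit_card"},
    ],
}
print(validate_submission(submission, billed_total=30.0))
```

An overpayment or a stale total is rejected up front, with no charge to undo.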
Consider the case where the second credit card fails the charge. What do we do then? We already charged the previous one, and we don’t want to issue a refund, naturally. The result here is a partial error, meaning that there will be a secondary submission to handle the remaining payment.
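One way to sketch that partial-error result is shown below. `process_payments` and the `charge` callback are hypothetical names, and the choice to keep trying the remaining payments after a failure is an assumption of this example, not something the design above dictates.

```python
def process_payments(total, payments, charge):
    """Apply each payment and report what is still owed.

    `charge` is a hypothetical callable that returns True when a single
    payment succeeds and False when it is declined.
    """
    remaining_cents = round(total * 100)
    accepted, failed = [], []
    for payment in payments:
        if charge(payment):
            # Successful charges stand; nothing is refunded later.
            accepted.append(payment)
            remaining_cents -= round(payment["amount"] * 100)
        else:
            failed.append(payment)
    return {
        "status": "paid" if remaining_cents == 0 else "partial",
        "accepted": accepted,
        "failed": failed,
        # The caller resubmits a new payment request for this amount.
        "remaining": remaining_cents / 100,
    }


payments = [
    {"amount": 15.0, "payment_method": "credit_card", "card": "3456"},
    {"amount": 10.0, "payment_method": "cash"},
    {"amount": 5.0, "payment_method": "credit_card", "card": "5678"},
]
# Simulate the second credit card being declined.
result = process_payments(30.0, payments, lambda p: p.get("card") != "5678")
print(result["status"], result["remaining"])
```

The secondary submission then only needs to cover the `remaining` amount, and it goes through the same single call.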
From an architectural perspective, it makes the system a lot simpler to deal with, since you have well-defined scopes. I probably made it more complex than I should have, to be honest. We can simply make the entire process serial and forbid actual concurrency throughout the process. If we are dealing with humans, that is easy enough, since the latencies involved are short enough that they won’t be noticed. But I wanted to add the bit about making a part of the process fully concurrent, to deal with the more complex scenarios.
In truth, we haven’t made a big change to the system; we simply restructured the set of calls and the way you interact with the backend. But the end result is that the amount of code and complexity you have to juggle for your infrastructure needs is far smaller. In real-world systems, that also reduces your latencies, because you are aggregating multiple operations and submitting them in a single shot. The backend also becomes simpler, because you don’t need complex transaction coordination or distributed locking.
It is a simple change, on its face, but it has profound implications.