Comments on DBMS Musings: Correctness Anomalies Under Serializable Isolation

Hi Daniel, although I understand the motivation be...

2020-11-05T13:33:32.561-08:00

Hi Daniel, although I understand the motivation behind this interesting post and clearly see the utility in carefully cataloging defects in the behavior, as a DB purist I must say that, in my opinion, the definition of serializable in its essence is still valid, in the sense that any implementation true to that essence must be anomaly free. The essence is this: code away, dear programmers, and don't worry about concurrency; assume that the system creates an illusion of perfect isolation for you, as if you were working with a single system, one transaction at a time, and that the perceived behavior of the system will in no way go against your intuition about the flow of time, causality, etc.

Or, to put it shortly: if a specific implementation of serializable is not anomaly free, then it ain't serializable.

Vladimir and Umesh: I really appreciate your comme...

2019-08-19T15:20:27.976-07:00

Vladimir and Umesh: I really appreciate your comments and suggestions about the naming of the isolation levels. After thinking about this further, I have changed my naming scheme and updated the text and tables accordingly. I think this new naming scheme is much cleaner (and clearer).

Thanks again.

Thanks for writing this post! I too think that th...

2019-07-31T14:29:54.723-07:00

Thanks for writing this post!

I too think that the names are a bit confusing. Here is a thought. Suppose we first name aspects of distributed systems that result in a particular behavior. E.g.,
- One-partition (1P) vs. multi-partition (MP or XP)
- One-leader/master/writer (1W) vs. multi (MW)
- Synchronously copied (SC, because SR for synchronously replicated would be confusing with SR for serializable) vs. asynchronously copied (AC)

Then we can create various isolation levels based on combining these aspects. E.g.,
SC-1SR: avoids stale reads
1W-1SR: avoids immortal writes
1W-1P-1SR: avoids immortal writes and causal reverse. I am not sure if synchronous copying is needed to avoid causal reverse. I am assuming not. If it is, we would write this isolation level as 1W-1P-SC-1SR.

Admittedly, this nomenclature is based on aspects of database design, not what is experienced by the user. But so is "Snapshot Isolation", so maybe it is OK.

An alternative is to describe the isolation levels based purely on behavior in a self-evident manner. E.g.,
- Fresh reads: avoids stale reads
- Fresh writes: avoids immortal writes (and perhaps avoids non-linearizable writes in general)
- Externally consistent: avoids causal reversal, even through external channels.
- Causal: avoids causal reversal through internal channels.

Having written this note, it does seem that isolation levels might be best described based on behavior. Then we can map database design aspects to these isolation levels.

Hi Scott, here is another way to explain how a wri...

2019-07-17T16:37:57.287-07:00

Hi Scott, here is another way to explain how a write might become immortal: suppose a distributed system uses global timestamps to serialize transactions, where a global timestamp is made up of the local time at the stamper concatenated with the stamper's uid to break ties (as the less significant part of the timestamp). If the local clock on the node where write W1 is accepted happens to be far ahead of the clocks at other nodes, W1 would be serialized after writes accepted on those nodes, including writes that arrive later in real time.
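
The timestamp scheme described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the names `GlobalTimestamp`, `stamp`, and `final_value`, the integer clock readings, and the "largest timestamp wins" rule are all made up for illustration and are not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class GlobalTimestamp:
    # Fields are compared in declaration order, so local_time is the more
    # significant part and node_uid only breaks ties.
    local_time: int
    node_uid: int


def stamp(local_clock: int, node_uid: int) -> GlobalTimestamp:
    """Stamp a write with the accepting node's local clock and uid (hypothetical helper)."""
    return GlobalTimestamp(local_clock, node_uid)


def final_value(writes):
    """Return the value of the write that is serialized last (largest timestamp)."""
    return max(writes, key=lambda w: w[0])[1]


# Assumed scenario: node 1's clock is far ahead of the clocks on nodes 2 and 3.
w1 = (stamp(1_000_000, 1), "W1")  # accepted first in real time, on the fast-clock node
w2 = (stamp(100, 2), "W2")        # accepted later in real time
w3 = (stamp(101, 3), "W3")        # accepted later still

# W1 is serialized after W2 and W3 even though it arrived first.
result = final_value([w1, w2, w3])
```

In this model, any write stamped by nodes 2 or 3 keeps losing to W1 until their clocks pass 1,000,000, which is the sense in which W1 looks "immortal".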

Hi Scott --- as far as why banks do transfers in t...

2019-07-11T18:24:03.987-07:00

Hi Scott --- as far as why banks do transfers in two transactions: I'm not in that industry so I can't really explain it. However, the point I'm making in the post is that it is not uncommon to have logical dependencies across transactions. If you have a database system that only guarantees correctness when all logically connected updates are present in the same transaction, application developers would find that to be extremely constraining. Sometimes, a later transaction is only submitted because an earlier transaction committed. It is considered a desirable feature for a system to uphold implicit time-based causality.

As far as the blind write: it's not that Danger is being scheduled before the first name change. The first name change happened and was completed. When the second name change comes along, the system is allowed to ignore the second transaction --- and can do this without violating its serializability guarantee. It just pretends that it was scheduled before the other transaction. In reality the first one happened first, and the second one was ignored. But since that's the same final outcome as the second one being scheduled first, and the first one being scheduled second, the serialization guarantee is not violated.
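
The equivalence argument above can be checked with a small Python sketch. It is a toy model, not how any real system is implemented: `apply_serially`, the `"name"` key, and the initial value `"original"` are assumptions made up for the example; only the two name changes (Daniel first, Danger second) come from the discussion.

```python
def apply_serially(initial, blind_writes):
    """Apply blind writes (key, value) one at a time, in the given order."""
    state = dict(initial)
    for key, value in blind_writes:
        state[key] = value  # a blind write overwrites without reading
    return state


initial = {"name": "original"}  # hypothetical starting value

t1 = ("name", "Daniel")  # first name change, completed first in real time
t2 = ("name", "Danger")  # second name change, submitted later

# What the system actually did: ran t1, then ignored t2.
what_happened = apply_serially(initial, [t1])

# A legal serial order that yields the same final state: t2 first, then t1.
equivalent_serial_order = apply_serially(initial, [t2, t1])

same_outcome = what_happened == equivalent_serial_order  # True
```

Because the final state matches a serial order of both transactions, plain serializability is satisfied; the mismatch is only with the real-time order in which the two transactions were submitted.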

I am having a hard time understanding the causal r...

2019-07-03T11:59:33.451-07:00

I am having a hard time understanding the causal reverse. First, why would you do 2 transfers in 2 transactions? Wouldn't it just make more sense to perform both changes in a single transaction? Technically you could write it in 2 transactions, but it seems obviously wrong to do that. I fear that I am missing something in this discussion. I also have a problem with the immortal write, as I don't see why Danger is always being scheduled before the name change to Daniel. Maybe I have used read committed for too long with Oracle and that is what is hanging me up, but I can't see how this is possible. Does this only occur in async multi-master setups with something like Galera Cluster?

OK, thanks for the clarification. It wasn't cl...

2019-06-30T12:41:21.108-07:00

OK, thanks for the clarification. It wasn't clear to me from the prose or the diagrams that you were assuming an external service. As written, it sounds like you're saying that any two causally-related transactions could exhibit this anomaly. But it's important that the database has no knowledge of any linkage between the two transactions (overlapping keys, foreign key relationships, "transfers" table that tracks progress of the external transfer, etc.).

I agree that "asynchronous serializable"...

2019-06-29T19:50:01.070-07:00

I agree that "asynchronous serializable" and "partitioned serializable" are not perfectly named. I will think about this further and perhaps update the post with better names.

The example in the post is correct. If you look ca...

2019-06-29T19:37:29.970-07:00

The example in the post is correct. If you look carefully at the figure, you will see that there are no overlapping keys. Causal reverse is absolutely possible in this example.

As far as why there is no bank account C in the example, that is because the clearing account is often not maintained by the same database system as the customer account balances. In this example, the DB (e.g. CockroachDB) only sees the changes to accounts A and B. Some other system maintains account C. Yes, that's an external system. But that's the point!! To assume that all relevant data for an application is stored in one DB is extremely naive. In real life, relevant data lives everywhere, and you can't control external coordination.

The causal reverse is actually more likely to happen in real life than in the research lab.
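
For readers trying to picture the causal reverse discussed in this thread, here is a deliberately simplified Python sketch. It assumes two partitions that assign commit timestamps from their own skewed clocks and two writes with no overlapping keys. The function `visible_at` and all timestamp values are hypothetical, and this is not a description of CockroachDB's actual protocol.

```python
def visible_at(snapshot_ts, committed):
    """Return the keys of writes whose commit timestamp is <= snapshot_ts."""
    return {key for key, ts in committed if ts <= snapshot_ts}


# Real-time order: the write to account A commits first. Only after that,
# and because of it, is the write to account B submitted and committed.
write_a = ("A", 105)  # committed on a partition whose clock runs fast
write_b = ("B", 100)  # committed later in real time, on a partition whose clock runs slow

# A reader with a snapshot timestamp between the two commit timestamps
# sees the later write (B) without the earlier write (A) that caused it.
seen = visible_at(102, [write_a, write_b])  # {"B"}
```

Since the two writes touch different keys, nothing inside the database forces their timestamps to agree with the real-time order, which matches the point made above about coordination that happens outside the database.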

Hi Daniel, your CockroachDB example is incorrect. ...

2019-06-29T16:14:55.014-07:00

Hi Daniel, your CockroachDB example is incorrect. You're forgetting that there's a 3rd bank-owned account involved (I'll refer to that as account C). No causal reverse would occur in this case, because both Txn A and Txn B synchronize on account C. Causal reverse can only happen in CRDB when there are no overlapping read/write or write/write keys between transactions. If there is even one overlapping key, as with account C, then the commit order of those transactions will *always* correspond to wall clock time.

It's actually quite difficult to find a case where the CRDB consistency model could result in real-world problems. Here's a CRDB blog article explaining when/how it can happen: https://www.cockroachlabs.com/blog/consistency-model/. Basically, it requires some form of coordination external to the database (which is presumably how "external consistency" got its name).

Perhaps you could get Kyle Kingsbury (of Jepsen fame) to review your next blog post for correctness?

Hi Daniel, This combined list of anomalies looks ...

2019-06-29T00:57:02.537-07:00

Hi Daniel,

This combined list of anomalies looks a bit complicated because it mixes *logical* isolation levels (ANSI + snapshot) with *physical* properties of specific systems, such as partitioning and async reads/writes.
For example, not all partitioned systems rely on wall-clock time, so they don't necessarily exhibit "causal reverse". Other systems with asynchronous replication provide means for adjusting their guarantees, e.g. disallowing reads from backups. At the same time, "STRICT SERIALIZABLE" is logical again.
So what is the practical use of putting logical and physical properties into a single table?