DRBD != fsck != DIX

Every once in a while, we hear of users with corruption in a file system that sits on top of DRBD. That may be easy or tricky to resolve. If you’re lucky, a simple fsck will resolve the corruption. If you’re not quite that lucky, you may have to get out your backups.

But that’s typically not DRBD’s fault. Typically not at all, not in the least bit. DRBD is a block device, and as such it has no idea what rests on top of it. It has no concept of a filesystem, let alone its integrity. That of course is true for any other block device as well. If you have, say, RAID-1, and something corrupts the file system on top of it, then of course that corruption will be happily replicated across both component devices. DRBD is no different, except that its component devices are stored across distinct physical nodes.
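To make that concrete, here is a toy sketch of a block-level mirror. It is not DRBD or RAID code; the `Mirror` class and its methods are made up for illustration. All it shows is that a layer which copies blocks without interpreting them will copy bad blocks just as faithfully as good ones.

```python
class Mirror:
    """Toy block-level mirror (hypothetical, for illustration only).

    Every write goes to both replicas unchanged; the mirror never
    looks at what the bytes mean.
    """

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size
        self.replicas = [
            [bytes(block_size)] * num_blocks,
            [bytes(block_size)] * num_blocks,
        ]

    def write(self, block_no, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        for replica in self.replicas:
            replica[block_no] = data

    def read(self, block_no, replica=0):
        return self.replicas[replica][block_no]


mirror = Mirror(num_blocks=8)
mirror.write(0, b"GOOD")              # a valid filesystem structure
mirror.write(0, b"\xde\xad\xbe\xef")  # a buggy layer above overwrites it

# Both replicas agree with each other, and both hold the garbage.
replicas_agree = mirror.read(0, replica=0) == mirror.read(0, replica=1)
print(replicas_agree)
```

The replicas being identical says nothing about whether their content is any good, which is exactly the situation after a filesystem corruption on a mirrored device.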

And even if everything about your filesystem is logically correct, there’s still the chance that a user fat-fingers rm and nukes all your precious data, and DRBD will happily replicate that too. Just like RAID. In a nutshell: just like RAID, DRBD does not replace backups.

DRBD does bend over backwards in making sure that it is replicating data correctly, catching all sorts of network issues in the process and optionally doing an end-to-end checksum over everything it replicates. It can also immediately detach from a backing device if the latter is acting up in any way and throwing I/O errors. But it can only make sure that it correctly replicates whatever it is handed from above; there is no way for it to second-guess whether that is actually good data.
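The limits of a replication checksum can be sketched in a few lines. This is not how DRBD implements its integrity check; `send` and `receive` are hypothetical helpers, and CRC32 from Python's `zlib` module merely stands in for whatever digest is actually in use. The point is that a checksum computed by the sender catches damage in transit, but happily vouches for garbage that was already garbage when it was handed down.

```python
import zlib


def send(block):
    """Sender side: ship the block together with a checksum of it."""
    return block, zlib.crc32(block)


def receive(block, checksum):
    """Receiver side: accept the block only if the checksum matches."""
    if zlib.crc32(block) != checksum:
        raise ValueError("checksum mismatch: block damaged in transit")
    return block


# Case 1: a bit flips on the wire. The receiver notices.
block, checksum = send(b"superblock")
damaged = bytes([block[0] ^ 0x01]) + block[1:]
try:
    receive(damaged, checksum)
    detected = False
except ValueError:
    detected = True

# Case 2: the layer above hands down garbage. The checksum is computed
# over the garbage, so the receiver accepts it without complaint.
garbage = b"\x00\xff\x00\xff"
garbage_accepted = receive(*send(garbage)) == garbage

print(detected, garbage_accepted)
```

In other words, the checksum answers "did the peer get what I sent?", not "was what I sent worth keeping?".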

Likewise, when DRBD reads data, it does so from its underlying block device. And if it happens to be fed garbage from there, there’s nothing it can do about that either (unless the read actually produces an I/O error, in which case DRBD can detach, read transparently from the peer over the network, and all is dandy). So if you have silent data corruption introduced by your controller, or by a disk that’s gone haywire, then DRBD will pass that garbage straight on to the application. However, and this is a big plus compared to going without DRBD, DRBD gives you the option of switching your application over to another node, with presumably better hardware, where that read corruption does not occur. And you can keep your users happy while you’re fixing the other box with the shot I/O stack.
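The difference between a failed read and a silently wrong one can be sketched like this. Again, this is only an illustration of the logic described above, not DRBD's code; `read_block` and the three stand-in devices are invented for the example.

```python
def read_block(local_read, peer_read, block_no):
    """Read from the local disk; fall back to the peer only on an I/O error."""
    try:
        return local_read(block_no)
    except OSError:
        # The local disk reported a failure, so serve the read from the peer.
        return peer_read(block_no)


def healthy_peer(block_no):
    return b"GOOD"


def failing_disk(block_no):
    # A disk that admits it cannot read the block.
    raise OSError("medium error")


def lying_disk(block_no):
    # A disk (or controller) that returns garbage and reports success.
    return b"\x00\xff\x00\xff"


after_io_error = read_block(failing_disk, healthy_peer, 0)
after_silent_corruption = read_block(lying_disk, healthy_peer, 0)

print(after_io_error)           # served from the peer
print(after_silent_corruption)  # garbage, passed through as-is
```

Only the disk that reports its failure triggers the fallback. The one that lies gets believed, because nothing at the block layer knows what the right answer would have been.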

So no, DRBD does not replace the occasional fsck or whatever other data integrity features your filesystem may come with. DRBD also does not absolve you of adding a BBU (battery backup unit) or capacitor-backed flash to your controller write cache, or of having to turn off your disk write cache (which is always volatile). DRBD also does not protect against dd-ing a bunch of random data somewhere into the middle of the block device, causing your filesystem to jump and scream.

Now, if you want complete, end-to-end I/O integrity checking, check out Linux DIX (Data Integrity Extensions), brought to you by a team around Martin Petersen at Oracle. I had the pleasure of sitting in his talk at LinuxCon this year. It’s in Linux as of 2.6.27; check out the project page for details. What’s nice about this is that it’s a Linux first: no other operating system, at this time, is known to have anything comparable.
