What is whisper?
Whisper is a fixed-size database, similar in design to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time.
Why don't you just use RRD?
RRD is great, and initially Graphite did use RRD for storage. Over time though, we ran into several issues inherent to RRD's design.
- RRD can't take updates for a timestamp prior to its most recent update. So for example, if you miss an update for some reason you have no simple way of back-filling your RRD file by telling rrdtool to apply an update to the past. Whisper does not have this limitation, and this makes importing historical data into Graphite way way easier.
- At the time whisper was written, RRD did not support compacting multiple updates into a single operation. This feature is critical to Graphite's scalability.
- RRD doesn't like irregular updates. If you update an RRD but don't follow up another update soon, your original update will be lost. This is the straw that broke the camel's back, since Graphite is used for various operational metrics, some of which do not occur regularly (randomly occuring errors for instance) we started to notice that Graphite sometimes wouldn't display data points which we knew existed because we'd received alarms on them from other tools. The problem turned out to be that RRD was dropping the data points because they were irregular. Whisper had to be written to ensure that all data was reliably stored and accessible.
Why did you totally rewrite RRD? Couldn't you just submit a patch?
I didn't totally rewrite it, I rewrote only a small subset of what RRD does, its basic storage mechanism. Patching RRD would mean hundreds of lines of C code, whereas Whisper is under 500 lines of simple python.
Seriously though, the real reason I didn't simply submit a patch for rrdtool is that whisper's design is incompatible with RRD's feature set. RRD provides the ability to specify an arbitrary update interval, that is you could say that you intend to update your RRD file once every minute, every 10 minutes, whatever. And rrdtool also allows you to configure your RRA's (round-robin-archives) independant of this update interval, so you could have a 1-minute precision archive but an update interval of say, 10 seconds. In this case, RRD will store your updates in a temporary workspace area and after the minute has passed, aggregate them and store them in the archive. Whisper on the other hand mandates that your update interval must be the same as the finest precision archive you configure. So for instance, if your archive configuration is 1-minute precision for 2 hours, then 5-minute precision for a day, your update interval *must* be 1-minute. The reason for this is that whisper inserts your updates *immediately* into your finest precision archive, so another update within the same interval would overwrite the previous value. Basically this just means that the onus of aggregating values to fit in the finest precision archive is on the user, not the database.
How fast is whisper?
Whisper is fast enough. It is slower than rrdtool because whisper is written in python, rrdtool is written in C, go figure. However the differences in speed are quite small. I spent a lot of time optimizing whisper to get as close to rrdtool's performance as I could. Currently update operations take anywhere from 2 to 3 times as long as rrdtool, and fetch operations take anywhere from 2 to 5 times as long. This sounds a lot worse than it is (especially considering it was originally 20x slower for each operation) because in practice the actual difference is measured in hundreds of microseconds (10^-4), so less than a millisecond difference for simple cases.
How does whisper work?
Pretty simplistically. See for yourself, visit http://bazaar.launchpad.net/~graphite-dev/graphite/main/files and click lib, graphite, then whisper.py.