Thrift vs. Protocol Buffers

Google recently released its Protocol Buffers as open source. About a year ago, Facebook released a similar product called Thrift. I’ve been comparing them; here’s what I’ve found:

| | Thrift | Protocol Buffers |
|---|---|---|
| Backers | Facebook, Apache (accepted for incubation) | Google |
| Bindings | C++, Java, Python, PHP, XSD, Ruby, C#, Perl, Objective C, Erlang, Smalltalk, OCaml, and Haskell | C++, Java, Python (Perl, Ruby, and C# under discussion) |
| Output Formats | Binary, JSON | Binary |
| Primitive Types | bool; byte; 16/32/64-bit integers; double; string; byte sequence; map<t1,t2>; list<t>; set<t> | bool; 32/64-bit integers; float; double; string; byte sequence; “repeated” properties act like lists |
| Enumerations | Yes | Yes |
| Constants | Yes | No |
| Composite Type | struct | message |
| Exception Type | Yes | No |
| Documentation | So-so | Good |
| License | Apache | BSD-style |
| Compiler Language | C++ | C++ |
| RPC Interfaces | Yes | Yes |
| RPC Implementation | Yes | No |
| Composite Type Extensions | No | Yes |

Overall, I think Thrift wins on features and Protocol Buffers win on
documentation. Implementation-wise, they’re quite similar. Both use
integer tags to identify fields, so you can add and remove fields
without breaking existing code. Protocol Buffers support
variable-width encoding of integers, which saves a few bytes. (Thrift
has an experimental output format with variable-width ints.)
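
To make the tagging and variable-width ideas concrete, here is a minimal Python sketch. It is a toy format of my own for illustration, not the actual wire format of either library: every field is written as an integer tag followed by an integer value, both encoded seven bits per byte with the high bit marking "more bytes follow." The function names (`encode_varint`, `decode_fields`, and so on) are hypothetical. Real formats also record a type alongside each tag so that a reader can skip values of any kind; this sketch assumes all values are non-negative integers.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_fields(fields: dict[int, int]) -> bytes:
    """Write each field as a (tag, value) pair of varints."""
    out = bytearray()
    for tag, value in fields.items():
        out += encode_varint(tag)
        out += encode_varint(value)
    return bytes(out)


def decode_fields(data: bytes, known_tags: set[int]) -> dict[int, int]:
    """Read (tag, value) pairs, skipping tags this reader does not know."""
    fields = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        value, pos = decode_varint(data, pos)
        if tag in known_tags:
            fields[tag] = value
    return fields


# Small values take fewer bytes than a fixed 4-byte integer would.
print(encode_varint(1).hex())    # 01
print(encode_varint(300).hex())  # ac02

# A newer writer adds field 3; an older reader that knows only 1 and 2 still works.
message = encode_fields({1: 300, 2: 7, 3: 1})
print(decode_fields(message, known_tags={1, 2}))  # {1: 300, 2: 7}
```

Because the reader looks fields up by tag rather than by position, an unknown tag is simply passed over, which is what lets old and new code exchange messages.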

The major difference is that Thrift provides a full client/server RPC
implementation, whereas Protocol Buffers only generate stubs to use in
your own RPC system.
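
A rough Python sketch of what that difference means in practice follows. None of these names come from either library: `Calculator` stands in for a generated service interface, JSON stands in for the binary encoding, and `handle_request` and `CalculatorClient` are the dispatch and client plumbing. With stubs only, you write the equivalent of those last two pieces (plus a real network transport) yourself; a full RPC implementation ships them for you.

```python
import json
from abc import ABC, abstractmethod


class Calculator(ABC):
    """Hypothetical service interface, standing in for a generated stub."""

    @abstractmethod
    def add(self, a: int, b: int) -> int: ...


class CalculatorImpl(Calculator):
    """Server-side logic you would write in either case."""

    def add(self, a: int, b: int) -> int:
        return a + b


def handle_request(service: Calculator, request: bytes) -> bytes:
    """Hand-written dispatch: decode the call, invoke it, encode the reply."""
    call = json.loads(request)
    if call["method"] != "add":
        raise ValueError("unknown method: " + call["method"])
    result = service.add(*call["args"])
    return json.dumps({"result": result}).encode()


class CalculatorClient(Calculator):
    """Client-side proxy; `send` is whatever transport you supply."""

    def __init__(self, send):
        self._send = send

    def add(self, a: int, b: int) -> int:
        request = json.dumps({"method": "add", "args": [a, b]}).encode()
        return json.loads(self._send(request))["result"]


# In-process "transport" for demonstration; a real one would use sockets or HTTP.
server = CalculatorImpl()
client = CalculatorClient(lambda request: handle_request(server, request))
print(client.add(2, 3))  # 5
```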

Update July 12, 2008: I haven’t tested for speed, but from a cursory examination it seems that, at the binary level, Thrift and Protocol Buffers are very similar. I think Thrift will develop a more coherent community now that it’s under Apache incubation. It just moved to a new web site and mailing list, and the issue tracker is active.

31 Replies to “Thrift vs. Protocol Buffers”
  1. Thrift was written by engineers at Facebook who’d worked at Google and missed using Protocol Buffers, hence the expanded feature set.

  2. Helpful comparison!

    I was trying to get more information on Protocol Buffers, and Wikipedia linked me to Thrift.

    I found Protocol Buffers pretty well documented, whereas Thrift looks like an alpha project from a documentation point of view. It is a pity, because Thrift seems to have quite a few interesting features. Considering that Thrift was published about a year ago, I wonder whether there is really a community backing it and any enthusiasm around it.

    It is quite an important point, because if you start using either of these libraries to communicate between different servers/services, you will probably have to stick with it for a few years.

    In addition, it would be interesting to see whether these libraries could be used to store hierarchical information in a database, and what the performance and limitations would be compared with storing XML or JSON. They should work better for a basic load/save strategy, but what if you just want a subset of the data? Do you still need to read the whole message? The same applies if you want to redirect a message depending on one particular field without reading the whole message.

  3. I have not compared them in detail myself, but if you believe Google’s description of Protocol Buffers, its primary feature is one you have not included in your consideration. They claim that their binary format is (1) especially small and (2) especially quick to serialize and deserialize. I am not particularly familiar with Thrift, but I never had the impression that these particular optimizations were of great importance to that project. The Google version is optimized for the case where you are dealing with truly massive amounts of data (the RARE case where efficiency trumps readability).

    — Michael Chermside

  4. mattrepl: The Facebook employee in question was an intern at Google while in college ;)

    That’s not to say Thrift is a copy of Protocol Buffers, but given the similarities it was likely inspired by them.

  5. A few quick comments:

    1/ Data size and serialization performance are definitely of great importance to Thrift. Huge data sets are definitely one case where this matters, but don’t forget about high-throughput low-latency services (at Facebook, like Google, every millisecond counts). Thrift is much quicker than typical XML/RESTful service implementations, even with relatively small data sizes. This is one of the primary use cases. When you’re dealing with millions of users and thousands of servers, efficiency really starts to matter.

    2/ We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could (and probably should) be, but we do have an active and enthusiastic community. Thrift is currently being used and contributed to by Powerset, Rapleaf, iMeem, AmieStreet, the reCaptcha project, as well as a number of independent developers.

    3/ Both Thrift and Protocol Buffers are great candidates for serializing data into databases — both are more compact and quicker to read/write than XML/JSON. Another common persistent use case is the storage of replayable logfiles.

    4/ Last point, though Thrift currently only has implementations for binary/JSON, it’s designed so that the encoding format is extensible. Thrift could easily support XML or human-readable ASCII — so the trade-off of efficiency vs. readability is left up to the developer.

  6. > We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could

    Please don’t release code without decent documentation. If you are making the effort, then do it right. Code of this complexity needs good docs, or it’s just a complete waste of time and totally frustrating.

  7. […] Until then, Stuart has provided a nice overview of Thrift’s features by comparing it to Google’s recently released Protocol Buffers.  If you’re interested in either, read Stuart’s article. […]

  8. […] Some interesting links which might be worth checking in more detail: open source projects on facebook wiki, the portal for developers on Facebook code (interesting!), Project Cassandra: Facebook’s Open Source Alternative to Google BigTable, the fact Google recently released its Protocol Buffers as open source, Facebook did it much earlier with Thri…. […]

  9. I would offer that the Thrift community is lacking an official site where a newcomer (to Thrift) can download a running Thrift compiler for their system.

    I very much want to try Thrift. I’ve been looking for a ready-to-run binary of the compiler and C# generators (on Windows/XP) but cannot find them anywhere. The ‘make’ scripts (as documented) are pretty useless in the Windows world.

    I dare say that the barrier to entry is having to build the compiler on one’s own machine before anything else can be done. I bet most people give up and go elsewhere.

    -David

  10. David: Unfortunately, since Thrift has not made an Apache release yet and is in incubation at Apache, the project members are unable to provide official binaries that get distributed. If you had said something on the mailing list, I could have sent you binaries if you’re having difficulty getting up and running. Otherwise, http://wiki.apache.org/thrift/ThriftInstallationWin32 does a decent job of describing how to get going, but could probably be improved.

  11. The license information is swapped in the table. Thrift is Apache while protobuffers are BSD.

  12. […] If Thrift is still very new and lacks documentation – why bother learning it? Especially when there are more mature alternatives like Google’s Protocol Buffer and SOAP. The answer depends on your requirements. I write web-based financial applications. I use PHP as my server-side scripting language, and C++ for the heavy-lifting. As of this writing, Protocol Buffer does not support PHP, and the SOAP protocol has too much overhead for the high-speed finance world. Furthermore, I think Thrift has more potential than both alternatives and will gain in popularity as more people learn it. Also apparently Thrift was developed by Google engineers working at Facebook who missed using Protocol Buffer, so they re-built it and expanded its feature set. You can find more information on the differences between Thrift and Protocol Buffer at: http://stuartsierra.com/2008/07/10/thrift-vs-protocol-buffers. […]

  13. It looks like a great library, but not much good if your dev environment is Visual Studio. Probably you’d need something more Posix and cross-platform (perhaps MinGW). The library relies heavily on Posix threads, which are not available in win32 unless you’re prepared to incorporate a GPL or LGPL library, and I’m not. The protocol buffer library is cross-platform, but then again it doesn’t include a transport mechanism and is therefore less powerful.
