Elephant Bird for Apache Crunch
Apache Crunch is a Java library for writing, testing, and running MapReduce pipelines. One of Crunch's
goals is to make it easy to write and test pipelines that process complex records containing nested and repeated
data structures, such as protocol buffers and Thrift records. This module provides Crunch PType serialization
for Elephant Bird's ProtobufWritable and ThriftWritable classes, along with Source, Target, and SourceTarget
implementations for Elephant Bird's LzoProtobufBlockInputFormat, LzoThriftBlockInputFormat,
LzoProtobufBlockOutputFormat, and LzoThriftBlockOutputFormat.