Kangaroo is an open-source project from Conductor for writing MapReduce jobs that consume data from Kafka. The introductory post explains Conductor’s use case: loading data from Kafka into HBase by way of a MapReduce job that uses HFileOutputFormat. Unlike other solutions, which are limited to a single InputSplit per Kafka partition, Kangaroo can launch multiple consumers at different offsets within a single partition's stream, increasing throughput and parallelism.
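
To make the idea of several splits per partition concrete, here is a minimal sketch in Python. It is not Kangaroo's actual code or API: the `OffsetSplit` class, the `split_partition` function, and the `max_split_size` parameter are hypothetical names chosen for illustration. It only shows how one partition's offset range could be cut into contiguous chunks, each of which a separate map task could consume.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OffsetSplit:
    """Hypothetical stand-in for an InputSplit: one half-open offset range."""
    topic: str
    partition: int
    start: int  # first offset to read (inclusive)
    end: int    # offset to stop before (exclusive)


def split_partition(topic: str, partition: int, start: int, end: int,
                    max_split_size: int) -> List[OffsetSplit]:
    """Cut the offset range [start, end) into chunks of at most max_split_size."""
    if max_split_size <= 0:
        raise ValueError("max_split_size must be positive")
    splits = []
    lo = start
    while lo < end:
        # The last chunk may be shorter than max_split_size.
        hi = min(lo + max_split_size, end)
        splits.append(OffsetSplit(topic, partition, lo, hi))
        lo = hi
    return splits


# Illustrative values only: 250 unread messages, at most 100 per split.
for split in split_partition("events", 0, 100, 350, 100):
    print(split)
```

With one split per partition, a partition holding a large backlog is read by a single task. Under the assumptions above, the same backlog becomes several independent ranges, so the number of map tasks scales with the amount of data rather than with the partition count.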

About davidn
