Wednesday, February 4, 2015

Yahoo open sources Kafka Manager

Kafka is used by many teams across Yahoo. The Media Analytics team uses Kafka in our real-time analytics pipeline. Our Kafka cluster handles a peak bandwidth of more than 20 Gbps of compressed data.

To make it simple for our developers and service engineers to maintain our Kafka clusters, we built a web-based tool that we call Kafka Manager. Its interface makes it easier to identify topics whose partitions, or whose partition leaders, are unevenly distributed across the cluster. It supports management of multiple clusters, preferred replica election, replica reassignment, and topic creation. It is also great for getting a quick bird’s-eye view of the cluster.
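To make "unevenly distributed" concrete, here is a minimal Python sketch of one way such a check could work. It is not taken from Kafka Manager: the function names, the fair-share threshold, and the sample assignment are all assumptions for illustration. It relies only on the Kafka convention that the first broker in a partition's replica list is the preferred leader.

```python
import math
from collections import Counter


def count_per_broker(assignment, brokers):
    """Count replicas and preferred leaders hosted by each broker.

    assignment maps partition id -> list of broker ids; the first broker
    in each list is treated as the preferred leader (Kafka convention).
    """
    replicas = Counter({b: 0 for b in brokers})
    leaders = Counter({b: 0 for b in brokers})
    for replica_list in assignment.values():
        replicas.update(replica_list)
        leaders[replica_list[0]] += 1
    return dict(replicas), dict(leaders)


def skewed_brokers(counts):
    """Return brokers holding more than an even share (hypothetical rule)."""
    fair_share = math.ceil(sum(counts.values()) / len(counts))
    return sorted(b for b, n in counts.items() if n > fair_share)


# Hypothetical topic: 4 partitions, replication factor 2, brokers 1-3.
example = {0: [1, 2], 1: [1, 3], 2: [1, 2], 3: [2, 3]}
replica_counts, leader_counts = count_per_broker(example, [1, 2, 3])
print("replica skew:", skewed_brokers(replica_counts))
print("leader skew:", skewed_brokers(leader_counts))
```

In this sample the replicas are spread fairly evenly, but broker 1 is the preferred leader for three of the four partitions, which is the kind of imbalance a replica reassignment would be used to correct.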

In the spirit of Kafka, we built Kafka Manager with Scala. The web console is based on the Play Framework, which interacts with an actor-based in-memory model built with Akka and Apache Curator. We’ve also ported some of the utilities from Apache Kafka to work with the Apache Curator framework. We use Curator to inspect the state of the cluster from ZooKeeper. We also store cluster information and generated assignments in ZooKeeper; since we don’t expect this information to be large, this let us avoid introducing another data store.
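As a rough illustration of what inspecting cluster state from ZooKeeper involves, the Python sketch below parses the JSON that Kafka brokers of this era keep under `/brokers/topics/<topic>` and `/brokers/topics/<topic>/partitions/<n>/state`. That layout is an assumption about the Kafka version in use, the function names and sample payloads are hypothetical, and the actual ZooKeeper read (done through Curator in the tool) is left out.

```python
import json


def parse_topic_assignment(payload):
    """Parse a topic znode payload into {partition id: [broker ids]}."""
    data = json.loads(payload)
    return {int(p): replicas for p, replicas in data["partitions"].items()}


def parse_partition_state(payload):
    """Parse a partition state znode payload into (leader, in-sync replicas)."""
    data = json.loads(payload)
    return data["leader"], data["isr"]


def needs_preferred_election(assignment, leaders):
    """List partitions whose current leader is not the first assigned replica."""
    return sorted(
        p for p, replicas in assignment.items() if leaders.get(p) != replicas[0]
    )


# Hypothetical payloads, shaped like the znode data described above.
topic_json = '{"version":1,"partitions":{"0":[1,2],"1":[2,3]}}'
state_json = '{"controller_epoch":1,"leader":2,"version":1,"leader_epoch":0,"isr":[2,1]}'

assignment = parse_topic_assignment(topic_json)
leader, isr = parse_partition_state(state_json)
print(assignment, leader, isr)
print(needs_preferred_election(assignment, {0: leader, 1: 2}))
```

Comparing the current leader with the first assigned replica is one simple way to spot partitions that would benefit from a preferred replica election.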

Find the Git repository here
