Monday, June 22, 2015

Gravity Tech: Centralized logging with Apache Kafka

With numerous large enterprise customers (and several smaller ones), the recommendation engines in the Gravity SaaS infrastructure generate around a terabyte of application logs a day, scattered across more than fifty hosts. This counts as moderate in today’s “big data” world, but it already exceeds the size at which naive approaches to log aggregation and processing (such as copying files to a central host with scripts, or grepping on individual hosts) work effectively: what you want is to see and process all logs at once, in near real time. Generally speaking, this is a distributed system, and hence it needs a messaging solution.

While messaging systems abound, Apache Kafka is a messaging system designed specifically for log transmission. Here the word “log” covers more than the human-readable text lines emitted by applications for debugging purposes (commonly referred to as application logs): it refers to any stream of events or updates, for instance user activity streams or snapshots of application metrics. Thus, when Gravity developers were looking for a log aggregation solution, Kafka was a natural choice.
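To make the broader sense of “log” concrete, here is a minimal sketch in Python of an append-only stream that holds application log lines, user activity events and metric snapshots side by side. It is an in-memory stand-in only, not Kafka's API and not Gravity's implementation; the `EventLog` class, its methods, and the event kinds and field names are all hypothetical, chosen purely for illustration.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class EventLog:
    """In-memory, append-only sequence of records (hypothetical stand-in for a log stream)."""
    records: List[str] = field(default_factory=list)

    def append(self, kind: str, payload: Dict[str, Any]) -> int:
        """Serialize an event as JSON, append it, and return its position (offset)."""
        record = json.dumps({"kind": kind, "ts": time.time(), "payload": payload})
        self.records.append(record)
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[Dict[str, Any]]:
        """Return all events at or after the given offset, in append order."""
        return [json.loads(r) for r in self.records[offset:]]


log = EventLog()
# Three different kinds of "log" entries share one ordered stream.
log.append("app_log", {"level": "WARN", "line": "cache miss ratio high"})
log.append("user_activity", {"user": "u42", "action": "view", "item": "i7"})
metrics_offset = log.append("metrics", {"host": "rec-01", "heap_mb": 512})
```

A reader that remembers the last offset it processed can resume with `log.read_from(offset)`, which is the property that makes an ordered event stream convenient for near-real-time processing.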

