Friday, September 13, 2013

Facebook’s Cassandra paper, annotated and compared to Apache Cassandra 2.0


The release of Apache Cassandra 2.0 is a good point to look back at the past five years of progress after Cassandra’s release as open source. Here, we annotate the Cassandra paper from LADIS 2009 with the new features and improvements that have been added since.

Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

Read more here

Leave a Reply

All Tech News IN © 2011 & Main Blogger .