Thursday, December 19, 2013

Migrating from MongoDB to Cassandra

The problem: given one piece of input data (an email address, phone number, Twitter handle or Facebook profile), find other data related to that query and produce a single merged document. This is essentially a curated, recursive federated search: multiple databases are consulted and re-consulted, and their results are aggregated and stored in MongoDB. These databases consist of various internal services (including previously searched data) and APIs from providers across the web.
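To make the "recursive federated search" idea concrete, here is a minimal sketch in Python. It is not FullContact's implementation: the two in-memory sources, the field names and the `federated_search` function are all hypothetical stand-ins for the internal services and provider APIs mentioned above. Each identifier found by one source is fed back in as a new query until nothing new turns up.

```python
from collections import deque

# Hypothetical stand-ins for internal services and external provider APIs.
# Each source maps a (kind, value) key to the fields it knows about that key.
SOURCE_A = {
    ("email", "ann@example.com"): {"twitter": "ann_t", "name": "Ann Example"},
}
SOURCE_B = {
    ("twitter", "ann_t"): {"phone": "+1-555-0100", "facebook": "ann.example"},
    ("phone", "+1-555-0100"): {"email": "ann@example.com"},
}
SOURCES = [SOURCE_A, SOURCE_B]

# Field names that can themselves be used as new queries (an assumption).
QUERYABLE = {"email", "phone", "twitter", "facebook"}


def federated_search(kind, value, sources=SOURCES, max_queries=50):
    """Query every source, then re-query on each newly found identifier."""
    merged = {kind: {value}}      # the merged document: field -> set of values
    seen = {(kind, value)}        # identifiers already queued, to avoid loops
    queue = deque([(kind, value)])
    queries = 0
    while queue and queries < max_queries:
        key = queue.popleft()
        queries += 1
        for source in sources:
            for field, found in source.get(key, {}).items():
                merged.setdefault(field, set()).add(found)
                new_key = (field, found)
                if field in QUERYABLE and new_key not in seen:
                    seen.add(new_key)
                    queue.append(new_key)
    return merged


if __name__ == "__main__":
    print(federated_search("email", "ann@example.com"))
```

The `seen` set and the `max_queries` cap are what keep the recursion from looping forever when two sources point back at each other, as the phone and email entries do here.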

When we first started, we needed to move fast to keep up with what our market wanted. As our customers queried us and we added data, our database kept growing in both size and operations per second.

We were a young startup and made a few crucial mistakes. Choosing MongoDB was not one of them: it let us iterate rapidly and scaled reasonably well. Our mistake was putting off fixing MongoDB’s deployment. Instead, we scaled vertically to keep the Person API product running while we worked on other technologies, such as the FullContact Address Book and the ‘secret sauce’ behind the Person API and our deduplication technology (look for posts on this later this month).

Eventually MongoDB started to have issues with lock time percentage, even on the generous hardware it already had. MongoDB has what’s known as a shared-exclusive, or readers-writer, lock over each database. While a write operation is in progress, a read cannot proceed until the write yields the lock. Even then, a queued write takes precedence over a read and can hold up the reads behind it, leading to a latency spike. As you can imagine, the Person API writes a lot of data (sometimes over 200K for a single document), averaging a 50/50 or 60/40 split of reads to writes and several million writes per day. We worked hard to eliminate multiple updates to the same document by dropping partial updates and by staging documents in Redis, but even this wasn’t enough: our lock percentage continued to climb into the 40-50% range, leaving us with unhappy customers.
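The post does not describe how the Redis staging worked, so the following is only a rough sketch of the general idea: collect several partial updates to a document in a staging area, then issue one full-document write. Plain Python dicts stand in for both Redis and the MongoDB collection, and the `StagedWriter` class and its methods are hypothetical names invented for illustration.

```python
import copy


class StagedWriter:
    """Coalesce many partial updates into one full-document write per key."""

    def __init__(self, database):
        self.database = database  # dict standing in for the MongoDB collection
        self.staging = {}         # dict standing in for Redis
        self.db_writes = 0        # number of writes that would take the DB lock

    def update(self, doc_id, fields):
        # Stage the change only; the database is not written to here.
        if doc_id not in self.staging:
            self.staging[doc_id] = copy.deepcopy(self.database.get(doc_id, {}))
        self.staging[doc_id].update(fields)

    def flush(self):
        # One full-document write per staged document, however many
        # partial updates were applied to it in the meantime.
        for doc_id, doc in self.staging.items():
            self.database[doc_id] = doc
            self.db_writes += 1
        self.staging.clear()


if __name__ == "__main__":
    db = {}
    writer = StagedWriter(db)
    writer.update("person-1", {"email": "ann@example.com"})
    writer.update("person-1", {"twitter": "ann_t"})
    writer.update("person-1", {"phone": "+1-555-0100"})
    writer.flush()
    print(writer.db_writes, db["person-1"])
```

Three staged updates result in a single database write, which is the kind of reduction in write-lock acquisitions the paragraph above is after. As the post notes, this alone was not enough at their write volume.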


