Monday, September 1, 2014

Lambda Architecture at Indix

Indix is a product intelligence platform. We are building the world’s largest product database and APIs to enable brands, retailers and developers to deliver the right product to the right customer at the right place, every time.

Over the last two years, we have built a catalog of several million products and billions of price points collected from thousands of e-commerce websites. We collect product data as semi-structured HTML by crawling product pages on these websites, and our parsers extract product attributes from those pages. The resulting structured data is then run through a series of machine learning algorithms that classify products and extract deeper attributes, and products are matched across stores. Our analytics engine uses this data to compute aggregates across multiple dimensions and derive actionable insights. The same data is also indexed by our search engine. All of it is then consumed by our apps, API and mobile platforms.
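
To make the "aggregates across multiple dimensions" step a little more concrete, here is a minimal Python sketch. It is not Indix's actual code: the `PricePoint` record, its field names, and the `aggregate` helper are all hypothetical, and the sample data is made up purely for illustration. It assumes products have already been matched across stores and share a common product ID.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PricePoint:
    # Hypothetical record shape; the real schema is not described in the post.
    product_id: str  # assumed ID shared by a product once matched across stores
    store: str
    category: str
    price: float


def aggregate(points, dimension):
    """Group price points by one attribute and summarize the prices in each group."""
    groups = defaultdict(list)
    for point in points:
        groups[getattr(point, dimension)].append(point.price)
    return {
        key: {
            "count": len(prices),
            "min": min(prices),
            "max": max(prices),
            "avg": mean(prices),
        }
        for key, prices in groups.items()
    }


# Made-up sample data for illustration only.
points = [
    PricePoint("p1", "store-a", "phones", 200.0),
    PricePoint("p1", "store-b", "phones", 180.0),
    PricePoint("p2", "store-a", "laptops", 900.0),
]

by_product = aggregate(points, "product_id")
by_store = aggregate(points, "store")
```

The same function can be called with `"category"` or any other field, which is the sense in which one dataset is summarized along several dimensions. At the scale described above, an in-memory dictionary would not be enough, which is part of the challenge the next paragraph refers to.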

These use cases pose unique and interesting challenges for our data platform in terms of scale, performance, availability, manageability and cost.

