Rajesh Muppalla describes how the team at Indix implemented a stateless lambda architecture for their big data pipeline. They have built a catalog of several million products and billions of price points collected from thousands of e-commerce websites.
They use HBase, an open source implementation of Google's Bigtable, for their storage layer. Here are some challenges they encountered before switching to the lambda architecture:
- Operational issues
- Data corruption
- Data loss
- Wrong choice of MapReduce abstractions
The new architecture can be decomposed into three layers: batch, serving and speed. The lambda architecture is technology- and domain-agnostic. Some principles it imposes:
- Immutability and human fault tolerance
- Complexity isolation
- Enforceable schemas
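
As a rough illustration of the three layers and the immutability principle, here is a minimal Python sketch. The `PriceObservation` record and the `build_view` and `query` functions are hypothetical names invented for this example; they are not taken from the Indix article, and a real pipeline would use distributed storage and processing rather than in-memory dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)  # frozen: records cannot be modified after creation
class PriceObservation:
    """Hypothetical immutable fact: a price seen for a product at a point in time."""
    product_id: str
    price: float
    timestamp: int


def build_view(observations: Iterable[PriceObservation]) -> Dict[str, PriceObservation]:
    """Compute a 'latest price per product' view from a set of raw facts.

    The same pure function serves as the batch layer (run over the whole
    master dataset) and the speed layer (run over recent data only).
    """
    view: Dict[str, PriceObservation] = {}
    for obs in observations:
        current = view.get(obs.product_id)
        if current is None or obs.timestamp > current.timestamp:
            view[obs.product_id] = obs
    return view


def query(product_id: str,
          batch_view: Dict[str, PriceObservation],
          speed_view: Dict[str, PriceObservation]) -> Optional[PriceObservation]:
    """Serving layer: merge the batch and speed views, preferring the newest fact."""
    candidates = [v[product_id] for v in (batch_view, speed_view) if product_id in v]
    if not candidates:
        return None
    return max(candidates, key=lambda o: o.timestamp)


# Append-only master dataset (assumed sample data, for illustration only).
master_dataset = (
    PriceObservation("sku-1", 10.0, 1),
    PriceObservation("sku-1", 12.0, 2),
    PriceObservation("sku-2", 5.0, 1),
)
# Facts that arrived after the last batch run.
recent = (PriceObservation("sku-2", 4.5, 3),)

batch_view = build_view(master_dataset)
speed_view = build_view(recent)
```

Because the raw facts are never updated in place, a buggy view can be discarded and recomputed from the master dataset, which is the idea behind human fault tolerance.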
A good read, with supporting charts in the article.