Successful data-driven companies like Uber, Facebook and Amazon rely on real-time analytics. Personalizing customer experiences for e-commerce, managing fleets and supply chains, and automating internal operations all require instant insights on the freshest data. By Dhruba Borthakur.
One of the technologies I founded was open source RocksDB, the high-performance key-value engine used by MySQL, Apache Kafka and CockroachDB. RocksDB’s data format is a mutable data format, which means that you can update, overwrite or delete individual fields in a record. It’s also the embedded storage engine at Rockset, a real-time analytics database I founded with fully mutable indexes.
The article contains good information on:
- Differences between mutable and immutable data
- The historic usefulness of immutability
- The problems with immutable data
- Mutability aids machine learning
- How mutability enables real-time analytics
To sum up, mutability is key for today’s real-time analytics because event streams can be incomplete or out of order. When that happens, a database will need to correct and backfill missing and erroneous data. To ensure high performance, low cost, error-free queries and developer efficiency, your database must support mutability.
[Read More]