An article by Scott Yao and Ping Jin from the Uber engineering team on how they proactively manage Uber’s traffic load based on the criticality of requests. They built QoS Aware Load Management (QALM), a dynamic load shedding framework that decides which incoming requests to shed according to their criticality.
Uber’s platform team owns four services with thousands of hosts, serving peak traffic of up to 300,000 requests per second from more than 450 internal services. Any system of this complexity is likely to experience outages, especially one that has grown so quickly.
Analyzing outages that occurred over a six-month period, the team found that 28 percent could have been mitigated or avoided through graceful degradation.
The three most frequent types of failures the QoS team observed were due to:
- Inbound request pattern changes, including overload and bad actors
- Resource exhaustion such as CPU, memory, io_loop, or networking resources
- Dependency failures, including infrastructure, data store, and downstream services
You will find the QALM architecture explained with accompanying charts, load test experiments, and more. Well written!
[Read More]