An interesting article by Maks Osowski addresses a common problem for enterprises: the need to autoscale their environments based on more than just CPU usage. The Horizontal Pod Autoscaler (HPA) on Kubernetes Engine 1.10+ lets you configure your deployments to scale horizontally in a variety of ways.
The article then introduces a real-life scenario based on a microservices architecture for a global video-streaming company.
The services are then scaled:
- To meet the service level agreement for latency
- Horizontally, based on the queue length
- Using the new ‘External’ metric type when configuring the Horizontal Pod Autoscaler
- Using the ‘Pods’ custom metric type
- Using an existing feature of the Horizontal Pod Autoscaler to scale based on multiple metrics at the same time
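To make the multiple-metrics behaviour concrete, here is a minimal Python sketch of the replica calculation the Kubernetes documentation describes for the HPA: each metric proposes `ceil(currentReplicas * currentValue / targetValue)`, and the largest proposal wins. The function name, its parameters, and the sample numbers are hypothetical and are not taken from the article. The sketch also leaves out the HPA's tolerance band and its handling of unready pods.

```python
import math


def desired_replicas(current_replicas, metrics, min_replicas=1, max_replicas=10):
    """Return the replica count a simplified HPA would ask for.

    metrics is a list of (current_average_per_pod, target_average_per_pod)
    pairs, one pair per configured metric (hypothetical input shape).
    """
    # Each metric independently proposes a replica count.
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics
    ]
    # With several metrics, the largest proposal is used, then clamped
    # to the configured minimum and maximum.
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # Queue length is at twice its target, latency is slightly under target.
    print(desired_replicas(3, [(200, 100), (90, 100)]))
```

Because the largest proposal wins, a deployment that is comfortable on one metric still scales up when another metric is over its target.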
To handle scale-downs correctly, the article also sets graceful termination periods on the pods that are long enough to allow any work in progress on them to complete.
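A graceful termination period only helps if the application reacts to the shutdown signal. Kubernetes sends `SIGTERM` to the container when a pod starts terminating and follows up with `SIGKILL` once the grace period expires. The Python sketch below shows one way a worker might finish its current job and then stop taking new ones. The `GracefulWorker` class and its method names are hypothetical and are not taken from the article.

```python
import signal


class GracefulWorker:
    """Runs jobs one at a time and stops cleanly once a stop is requested."""

    def __init__(self):
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        # Signature matches what signal.signal() expects from a handler.
        self.stopping = True

    def install(self):
        # Call once from the main thread so SIGTERM triggers request_stop.
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self, jobs):
        done = []
        for job in jobs:
            if self.stopping:
                # Do not start new work after the stop request.
                break
            # A job that has already started always runs to completion.
            done.append(job())
        return done
```

The grace period configured on the pod then needs to be at least as long as the slowest single job, otherwise the process is killed before `run` returns.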
An example configuration is presented together with an explanation. Great read!