How Kubernetes requests and limits really work


Kubernetes is inarguably an elegant, refined, well-designed edifice of open source enterprise software. It is known. Even so, the inner workings of this mighty platform are shrouded in mystery. Friendly abstractions, like “resource requests” for CPU and memory, hide from view a host of interrelated processes — precise and polished scheduling algorithms, clever transformations of friendly abstractions into arcane kernel features, a perhaps unsurprising amount of math — all conjoining to produce the working manifestations of a user’s expressed intent. By Reid Vandewiele.

By the time you reach the end of this article, you will learn:

  • Big picture view: Layers in the looking glass
    • Pod spec (kube-api)
    • Node status (kubelet)
    • Container configuration for CPU (container runtime)
    • Container configuration for memory (container runtime)
    • Node pressure and eviction (kubelet)
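As a preview of the CPU translation outlined above, here is a sketch of how millicore requests and limits map onto cgroup v1 values. The constants come from the CFS defaults (a 100ms period, 1024 shares per CPU); the function names are illustrative, not the kubelet's actual API:

```python
# Illustrative sketch of how CPU requests and limits become cgroup v1
# settings; the kubelet performs equivalent conversions internally.

CFS_PERIOD_US = 100_000  # default CFS scheduling period: 100ms
SHARES_PER_CPU = 1024    # cpu.shares granted per full CPU of request

def milli_cpu_to_shares(milli_cpu: int) -> int:
    """A CPU *request* becomes a relative weight: cpu.shares."""
    return (milli_cpu * SHARES_PER_CPU) // 1000

def milli_cpu_to_quota(milli_cpu: int, period: int = CFS_PERIOD_US) -> int:
    """A CPU *limit* becomes a hard cap: cpu.cfs_quota_us per period."""
    return (milli_cpu * period) // 1000

print(milli_cpu_to_shares(250))  # 256: a request of 250m
print(milli_cpu_to_quota(500))   # 50000: a limit of 500m means 50ms of CPU per 100ms
```

Note the asymmetry: a request only influences relative weight under contention, while a limit is enforced as a hard quota regardless of how idle the node is.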

A node becomes “full,” unable to accept additional workloads, based on resource requests alone. The CPU or memory actually in use on the node plays no part in deciding whether it can take more pods. If you want a “full” node to mean one whose actual CPU and memory are being used efficiently, you need to make sure requests match up with actual usage. Interesting read!
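The fit check described above can be sketched as follows. The node capacity and pod figures are hypothetical, and the real scheduler also weighs taints, affinity, and other predicates, but the core rule is the same: only requests count, never actual usage:

```python
# Minimal sketch of the scheduler's resource fit check: a pod fits only if
# its *requests* fit within the node's remaining allocatable resources.
# Actual usage on the node never enters into this decision.

def fits(node_allocatable, scheduled_pods, candidate_pod):
    """Return True if candidate_pod's requests fit on the node."""
    for resource, capacity in node_allocatable.items():
        already_requested = sum(p.get(resource, 0) for p in scheduled_pods)
        if already_requested + candidate_pod.get(resource, 0) > capacity:
            return False
    return True

# A node with 2000m CPU and 4 GiB allocatable (hypothetical figures).
node = {"cpu_m": 2000, "memory_mib": 4096}

# Two pods that each *request* 800m CPU, even if they actually use far less.
running = [{"cpu_m": 800, "memory_mib": 512}, {"cpu_m": 800, "memory_mib": 512}]

new_pod = {"cpu_m": 500, "memory_mib": 256}
print(fits(node, running, new_pod))  # False: 1600m + 500m of requests > 2000m
```

If the running pods requested only what they really use, the same node could accept the new pod — which is exactly why the article urges keeping requests aligned with actual consumption.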

[Read More]

Tags devops agile cicd app-development kubernetes containers