This is the fourth tutorial in the Kubernetes Tutorial Series. In the first three we looked at Kubernetes architecture and installation, Kubernetes objects such as Deployments and Services, and storage in Kubernetes. In this section, we will learn how to manage compute resources like CPU and memory for containers. Compute resources are measurable quantities that can be requested, allocated, and consumed.
While your Kubernetes cluster might work fine without resource requests and limits, you will start running into stability issues as your teams and projects grow. Adding requests and limits to your Pods and Namespaces takes only a little extra effort, and can save you many headaches down the line.
Why should we specify how much CPU and memory (RAM) each Container needs?
When Containers have resource requests specified, the scheduler can make better decisions about which nodes to place Pods on.
When Containers have limits specified, contention for resources on a node can be handled in a predictable way.
Also, you don’t want a single container running in a Pod to consume all the resources of a node. If this happens, it can lead to performance degradation for the other Pods.
Now let’s dive deeper as to how we can use them in Kubernetes.
Resource Types
CPU and memory are each a resource type. CPU is specified in units of cores, and memory is specified in units of bytes.
CPU: CPU resources are specified in cores or millicores, where 1000m equals one full core. If your container needs two full cores to run, you would put the value “2000m”. If your container only needs ¼ of a core, you would put a value of “250m”. One thing to keep in mind about CPU requests is that if you put in a value larger than the core count of your biggest node, your Pod will never be scheduled. Say you have a Pod that needs four cores, but your Kubernetes cluster is comprised of dual-core VMs: that Pod will never be scheduled.
Memory: Memory resources are specified in bytes. Normally you give a mebibyte value, using the Mi suffix (a mebibyte is essentially the same as a megabyte), but you can specify anything from bytes to pebibytes. Just like CPU, if you put in a memory request that is larger than the amount of memory on any of your nodes, the Pod will never be scheduled.
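To make these units concrete, here is a small Python sketch that converts resource quantity strings into base units. The helper names are hypothetical (this is not a Kubernetes API), but the arithmetic matches the suffixes described above:

```python
# Hypothetical helpers (not part of Kubernetes) that convert resource
# quantity strings into base units: CPU into millicores, memory into bytes.

def parse_cpu(quantity: str) -> int:
    """Return CPU as millicores: '250m' -> 250, '2' -> 2000, '0.5' -> 500."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)

# Binary (power-of-two) suffixes used by Kubernetes memory quantities.
_MEM_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50}

def parse_memory(quantity: str) -> int:
    """Return memory in bytes: '64Mi' -> 67108864, '1Gi' -> 1073741824."""
    for suffix, factor in _MEM_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes
```

For example, parse_cpu("250m") gives 250 millicores and parse_memory("64Mi") gives 67108864 bytes, the same quantities used in the Pod example later in this tutorial.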
Resource requests and limits
Requests and limits are the mechanisms Kubernetes uses to control resources such as CPU and memory. Requests are what the container is guaranteed to get. If a container requests a resource, Kubernetes will only schedule it on a node that can give it that resource. Limits, on the other hand, make sure a container never goes above a certain value. The container is only allowed to go up to the limit, and then it is restricted: a container that exceeds its CPU limit is throttled, and one that exceeds its memory limit becomes a candidate for termination (the OOM killer).
It is important to remember that the limit can never be lower than the request. If you try this, Kubernetes will throw an error and won’t let you run the container.
Requests and limits are on a per-container basis. While Pods usually contain a single container, it’s common to see Pods with multiple containers as well. Each container in the Pod gets its own individual limit and request, but because Pods are always scheduled as a group, you need to add the limits and requests for each container together to get an aggregate value for the Pod.
Let’s look at an example. The following Pod has two Containers. Each Container has a request of 0.25 cpu (250 millicores) and 64 MiB (2^26 bytes) of memory, and a limit of 0.5 cpu (500 millicores) and 128 MiB of memory. In total, the Pod has a request of 0.5 cpu and 128 MiB of memory, and a limit of 1 cpu and 256 MiB of memory.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: db
    image: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: wp
    image: wordpress
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
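To see how the per-container values add up to the Pod-level totals, here is a small illustrative Python sketch (not Kubernetes code) that sums the requests and limits from the manifest above, using millicores for CPU and MiB for memory:

```python
# Illustrative sketch: summing per-container requests and limits
# (millicores for cpu, MiB for memory) into Pod-level aggregates.

containers = [
    {"name": "db", "requests": {"cpu": 250, "memory": 64},
     "limits": {"cpu": 500, "memory": 128}},
    {"name": "wp", "requests": {"cpu": 250, "memory": 64},
     "limits": {"cpu": 500, "memory": 128}},
]

def pod_total(containers, kind, resource):
    """Sum one resource ('cpu' or 'memory') over a Pod's containers."""
    return sum(c[kind][resource] for c in containers)
```

Running pod_total(containers, "requests", "cpu") gives 500 millicores (0.5 cpu) and pod_total(containers, "limits", "memory") gives 256 MiB, matching the totals stated above.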
People can easily forget to set resource requests and limits for their containers as shown above, or a team can set the requests and limits very high and take more than its fair share of the cluster.
To prevent these scenarios, you can set up ResourceQuotas and LimitRanges at the Namespace level.
Configure Memory and CPU Quotas for a Namespace
ResourceQuota: Set quotas for the total amount of memory and CPU that can be used by all Containers running in a namespace.
Let’s implement this. First, create a namespace by running the command below.
kubectl create namespace resource-quota-example
After creating the namespace, create the configuration file for a ResourceQuota object. Let’s name this file quota.yaml.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota-demo
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
Create the ResourceQuota in the resource-quota-example namespace by running the command below.
kubectl create -f quota.yaml --namespace=resource-quota-example
Let’s break down what we did. The following conditions now apply to the resource-quota-example namespace.
Every Container must have a memory request, memory limit, cpu request, and cpu limit.
The memory request total for all Containers must not exceed 1 GiB.
The memory limit total for all Containers must not exceed 2 GiB.
The CPU request total for all Containers must not exceed 1 cpu.
The CPU limit total for all Containers must not exceed 2 cpu.
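The check a ResourceQuota performs at admission time can be sketched roughly as follows. This is an illustrative Python sketch, not the actual quota controller; values are in millicores for CPU and MiB for memory, matching the quota above (1 cpu / 1 GiB requests, 2 cpu / 2 GiB limits):

```python
# Illustrative sketch of a ResourceQuota admission check: a new Pod's
# requests/limits, added to what the namespace already uses, must stay
# within the quota's hard limits. Units: millicores (cpu), MiB (memory).

quota = {"requests.cpu": 1000, "requests.memory": 1024,
         "limits.cpu": 2000, "limits.memory": 2048}

def admits(used, pod):
    """Return True if `pod` fits under `quota` given current `used` totals."""
    return all(used.get(k, 0) + pod.get(k, 0) <= hard
               for k, hard in quota.items())
```

For example, if the namespace already uses 500m of CPU requests, a Pod requesting another 500m is admitted (the total hits the hard limit exactly), while one requesting 600m is rejected.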
Configure Default, Min, Max Resource Requests and Limits for a Namespace
You can also create a LimitRange in your Namespace. Unlike a Quota, which looks at the Namespace as a whole, a LimitRange applies to an individual container. This can help prevent people from creating super tiny or super large containers inside the Namespace.
If a Container is created in a namespace that has a default memory limit, and the Container does not specify its own memory limit, then the Container is assigned the default memory limit.
Let’s implement this. First, create a namespace by running the command below.
kubectl create namespace limit-range-example
After creating the namespace, create the configuration file for a LimitRange object. Let’s name this file limitrange.yaml.
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range-demo
spec:
  limits:
  - default:
      cpu: 600m
      memory: 100Mi
    defaultRequest:
      cpu: 600m
      memory: 50Mi
    max:
      cpu: 1000m
      memory: 100Mi
    min:
      cpu: 10m
      memory: 50Mi
    type: Container
Create the LimitRange in the limit-range-example namespace by running the command below.
kubectl create -f limitrange.yaml --namespace=limit-range-example
Looking at this example, you can see there are four sections. Setting each of these sections is optional.
The default section sets up the default limits for a container in a pod. If you set these values in the limitRange, any containers that don’t explicitly set these themselves will get assigned the default values.
The defaultRequest section sets up the default requests for a container in a pod. If you set these values in the limitRange, any containers that don’t explicitly set these themselves will get assigned the default values.
The max section will set up the maximum limits that a container in a Pod can set. The default section cannot be higher than this value. Likewise, limits set on a container cannot be higher than this value. It is important to note that if this value is set and the default section is not, any containers that don’t explicitly set these values themselves will get assigned the max values as the limit.
The min section sets up the minimum Requests that a container in a Pod can set. The defaultRequest section cannot be lower than this value. Likewise, requests set on a container cannot be lower than this value either. It is important to note that if this value is set and the defaultRequest section is not, the min value becomes the defaultRequest value too.
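Putting the four sections together, the defaulting and validation a LimitRange applies to a single container can be sketched as follows. This is an illustrative Python sketch, not the actual admission plugin, using the CPU values from the example above (in millicores):

```python
# Illustrative sketch of LimitRange behavior for one container's CPU,
# in millicores: defaults fill in missing values, then min/max validate.

limit_range = {"default": 600, "defaultRequest": 600, "max": 1000, "min": 10}

def apply_limit_range(request=None, limit=None):
    """Fill in defaults for missing values, then enforce min and max."""
    if limit is None:
        limit = limit_range["default"]        # default section
    if request is None:
        request = limit_range["defaultRequest"]  # defaultRequest section
    if request < limit_range["min"] or limit > limit_range["max"]:
        raise ValueError("container violates LimitRange")
    return request, limit
```

A container that specifies nothing gets (600, 600); one that asks for a 1200m limit is rejected because it exceeds the max of 1000m.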
How Pods with resource requests are scheduled
When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each resource type: the amount of CPU and memory it can provide for Pods. The scheduler ensures that, for each resource type, the sum of the resource requests of the scheduled Containers is less than the capacity of the node. Note that even if actual memory or CPU usage on a node is very low, the scheduler still refuses to place a Pod there if the capacity check fails. This protects against a resource shortage on a node when usage later increases, for example during a daily peak in request rate.
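The capacity check described above can be sketched as follows. This is an illustrative Python sketch, not the actual scheduler, with CPU in millicores and memory in MiB:

```python
# Illustrative sketch of the scheduler's capacity check: a node is feasible
# only if the requests of pods already scheduled on it, plus the new pod's
# requests, stay within the node's capacity for every resource type.

def fits(node_capacity, scheduled_requests, pod_requests):
    """Return True if pod_requests fits on the node for every resource."""
    for resource, capacity in node_capacity.items():
        used = sum(p.get(resource, 0) for p in scheduled_requests)
        if used + pod_requests.get(resource, 0) > capacity:
            return False
    return True
```

For a dual-core node (2000m) already running pods that request 1200m in total, a pod requesting 800m fits exactly, while one requesting 900m is refused even if actual CPU usage on the node is near zero.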
With this you have learnt all about resource allocation for your containers. Next up is how Kubernetes autoscales when traffic to your application increases: Kubernetes Tutorial Series: Pod Autoscaling and Horizontal Pod Autoscaling.
Please let me know if you have any queries in the comments section below.