Let's look at some of the best practices to follow when using Kubernetes.
Kubernetes is an open-source container orchestration platform that automates container deployment, scaling up and down, load balancing, and much more.
Production environments often run thousands of containers, so managing them well becomes very important, and that is what Kubernetes does.
If you are using Kubernetes, adopting these best practices will give you better container orchestration.
Here is a list of some of the Kubernetes best practices you must follow.
#1. Set Resource Requests and Limits
When you deploy a large application on a production cluster with limited resources, nodes can run out of memory or CPU, and the application will stop working. This downtime can have a huge impact on the business. You can guard against it by setting resource requests and limits.
Resource requests and limits are the Kubernetes mechanisms for controlling the usage of resources such as memory and CPU. If one pod consumes all the CPU and memory on a node, the other pods are starved of resources and cannot run their applications. Hence, you should set resource requests and limits on your pods to increase reliability.
Note that a limit must be greater than or equal to the corresponding request. Kubernetes rejects a pod whose request is higher than its limit, so the container will not run. You can set requests and limits for each container in a pod. CPU is specified in millicores (for example, 500m is half a core), and memory is specified in bytes, usually with a suffix such as M (megabytes) or Mi (mebibytes).
Below is an example that sets limits of 500 millicores of CPU and 128 mebibytes of memory, and requests of 300 millicores of CPU and 64 mebibytes of memory.
containers:
- name: prodcontainer1
  image: ubuntu
  resources:
    requests:
      memory: "64Mi"
      cpu: "300m"
    limits:
      memory: "128Mi"
      cpu: "500m"
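If you would rather not repeat these values in every pod, a LimitRange can apply default requests and limits to containers in a namespace that do not declare their own. The following is a minimal sketch, not a recommended configuration; the object name `default-limits` and the namespace `prod` are assumptions chosen for illustration, and the values simply mirror the example above.

```yaml
# Sketch: default requests/limits for containers in one namespace.
# The name and namespace are hypothetical.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: prod
spec:
  limits:
  - type: Container
    defaultRequest:        # used when a container sets no request
      cpu: 300m
      memory: 64Mi
    default:               # used when a container sets no limit
      cpu: 500m
      memory: 128Mi
```

Containers that set their own requests and limits keep them; the defaults only fill in missing values.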
#2. Use livenessProbe and readinessProbe
Health checks are very important in Kubernetes.
Kubernetes provides two kinds of health checks: readiness probes and liveness probes.
Readiness probes check whether the app is ready to start serving traffic. This probe must pass before Kubernetes starts sending traffic to the pod running the application. If the readiness check later fails, Kubernetes stops sending traffic to the pod until the check passes again.
Liveness probes check whether the app is still running (alive) or has stopped (dead). If the app is running properly, Kubernetes does nothing. If the liveness check fails, Kubernetes restarts the container, subject to the pod's restart policy, so that the application runs again.
If these checks are not configured properly, pods might be restarted unnecessarily or might start receiving user requests before they are ready.
There are three types of probes that can be used for liveness and readiness checks – HTTP, Command, and TCP.
Let me show an example of the most common one, the HTTP probe.
Here, your application has an HTTP server inside it. Kubernetes sends an HTTP GET request to a path on that server. If the response status code is at least 200 and below 400, Kubernetes marks the application as healthy, otherwise as unhealthy.
apiVersion: v1
kind: Pod
metadata:
  name: container10
spec:
  containers:
  - image: ubuntu
    name: container10
    livenessProbe:
      httpGet:
        path: /prodhealth
        port: 8080
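The example above covers only the liveness check. As a hedged sketch of how a readiness probe and a TCP probe might sit side by side, the manifest below uses an HTTP readiness probe and a TCP liveness probe. The pod name `probe-demo`, the `/ready` path, and the timing values are assumptions for illustration and should be tuned to the real application.

```yaml
# Sketch: readiness (HTTP) and liveness (TCP) probes on one container.
# Names, path, and timings are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: app
    image: ubuntu                # placeholder image, as in the example above
    readinessProbe:
      httpGet:
        path: /ready             # assumed readiness endpoint
        port: 8080
      initialDelaySeconds: 5     # wait before the first check
      periodSeconds: 10          # check every 10 seconds
    livenessProbe:
      tcpSocket:
        port: 8080               # healthy if the port accepts a connection
      periodSeconds: 20
      failureThreshold: 3        # restart after 3 consecutive failures
```

A command probe follows the same shape, with an `exec` block containing a `command` list in place of `httpGet` or `tcpSocket`.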
#3. Build Small Container Images
Smaller container images are preferred because they take less storage and can be pulled and built faster. A smaller image also contains fewer packages, which reduces the attack surface.
There are two ways to reduce the container image size: using a smaller base image and using the builder pattern. Currently, the latest NodeJS base image is 345 MB, whereas the NodeJS alpine image is just 28 MB, more than ten times smaller. So, start from the smaller image and add only the dependencies required to run your application.
To keep container images even smaller, you can use the builder pattern (a multi-stage build). The code is built in a first container, and the compiled output is then packaged into the final container without the compilers and tools needed to build it.
#4. Grant Safe Levels of Access (RBAC)
Having a secure Kubernetes cluster is very important.
Access to the cluster should be well configured. Define the number of requests allowed per user per second, minute, or hour, the concurrent sessions allowed per IP address, the maximum request size, and restrictions on paths and hostnames. This helps protect the cluster from DDoS attacks.
Developers and DevOps engineers working on a Kubernetes cluster should each have a defined level of access. The role-based access control (RBAC) feature of Kubernetes is useful here. You can use Roles and ClusterRoles to define access profiles. To make RBAC easier to configure, you can use an open-source RBAC manager that simplifies the syntax, or use Rancher, which provides RBAC by default.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
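A role grants nothing until it is bound to a subject. As a minimal sketch, the RoleBinding below grants the `cluster-role` ClusterRole defined above to a single user, limited to one namespace. The user name `jane`, the binding name `read-pods`, and the namespace `prod` are assumptions for illustration only.

```yaml
# Sketch: bind the ClusterRole to one user inside one namespace.
# User, binding name, and namespace are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: prod
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-role
  apiGroup: rbac.authorization.k8s.io
```

Using a RoleBinding rather than a ClusterRoleBinding keeps the permission scoped to the named namespace, which is usually the safer default.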
Kubernetes Secrets store confidential information such as auth tokens, passwords, and SSH keys. You should never commit Kubernetes Secrets to your IaC repository; otherwise, they will be exposed to everyone who has access to that git repository.
DevSecOps, which brings security into DevOps, is a popular term now. Organizations are adopting it as they understand its importance.
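One way to keep secret values out of manifests is to reference a Secret by name, so that the committed pod spec contains only the reference. The sketch below assumes a Secret called `db-credentials` with a key `password` already exists in the cluster; both names, along with the pod name and environment variable, are hypothetical.

```yaml
# Sketch: the pod spec references a Secret by name and never
# contains the secret value. All names here are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo
spec:
  containers:
  - name: app
    image: ubuntu
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials   # assumed to exist in the cluster
          key: password
```

The Secret itself is then created outside the repository, for example by a deployment pipeline or a secrets manager.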
#5. Keep up-to-date
It is recommended to keep the cluster on the latest stable version of Kubernetes.
Each new version of Kubernetes includes new features, improvements to existing features, security updates, and bug fixes. If you are using Kubernetes with a cloud provider, updating it becomes very easy.
#6. Use Namespaces
Kubernetes ships with a few initial namespaces, including default, kube-system, and kube-public.
These namespaces play a very important role in a Kubernetes cluster for organization and security between the teams.
It makes sense to use the default namespace if you are a small team working on just 5-10 microservices. But a rapidly growing team or a large organization will have several teams working on test or production environments, so each team needs a separate namespace for easier management.
If they don’t do so, they may end up accidentally overwriting or disrupting another team’s application/feature without even realizing it. It is suggested to create multiple namespaces and use them to segment your services into manageable chunks.
Here is an example of creating resources inside a namespace:
apiVersion: v1
kind: Pod
metadata:
  name: pod01
  namespace: prod
  labels:
    image: pod01
spec:
  containers:
  - name: prod01
    image: ubuntu
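The pod above can only be created if the `prod` namespace already exists. A minimal sketch of the namespace manifest follows; the `team` label and its value are assumptions added for illustration.

```yaml
# Sketch: create the namespace that the pod manifest refers to.
apiVersion: v1
kind: Namespace
metadata:
  name: prod
  labels:
    team: platform   # hypothetical label
```

Apply the namespace before applying any resources that live inside it.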
#7. Use Labels
As your Kubernetes deployments grow, they will invariably include multiple services, pods, and other resources. Keeping track of these can become cumbersome. Even more challenging can be describing to Kubernetes how these various resources interact and how you want them to be replicated, scaled, and serviced. Labels in Kubernetes are very helpful in solving these issues.
Labels are key-value pairs attached to Kubernetes objects and used to organize them.
For example, app: kube-app, phase: test, role: front-end. They are used to describe to Kubernetes how various objects and resources within the cluster work together.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  labels:
    environment: testing
    team: test01
spec:
  containers:
  - name: test01
    image: "ubuntu"
    resources:
      limits:
        cpu: 1
So, you can reduce the pain of running Kubernetes in production by consistently labeling your resources and objects.
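Labels become useful once something selects on them. As a hedged sketch, the Service below routes traffic to every pod carrying the `environment: testing` and `team: test01` labels used in this section. The Service name `test-service` and the port numbers are assumptions, and the sketch presumes the selected pods actually listen on the target port.

```yaml
# Sketch: a Service that finds its pods purely through labels.
# Service name and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: test-service
spec:
  selector:
    environment: testing
    team: test01
  ports:
  - port: 80           # port exposed by the Service
    targetPort: 8080   # assumed port the pods listen on
```

Any new pod created with the same two labels is picked up by the Service automatically, with no change to the Service itself.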
#8. Audit Logging
To identify threats in the Kubernetes cluster, auditing of logs is very important. Auditing helps to answer questions like what happened, why it happened, who made it happen etc.
All the data related to the requests made to kube-apiserver are stored in a log file called audit.log. This log file is structured in JSON format.
In Kubernetes, by default, the audit log is stored in
/var/log/audit.log and the audit policy is present at
To enable the audit logging, start the kube-apiserver with these parameters:
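As a hedged sketch, the two kube-apiserver flags commonly used for this are `--audit-policy-file` and `--audit-log-path`. On clusters where the API server runs as a static pod, they might be added to its manifest roughly as in the fragment below. The policy file path is an assumption, the log path matches the one mentioned above, and the fragment omits the image, volumes, and every other flag a real manifest needs.

```yaml
# Fragment only: audit-related flags in a kube-apiserver static pod manifest.
# The policy path is hypothetical; all other fields and flags are omitted.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml  # assumed path
    - --audit-log-path=/var/log/audit.log
    - --audit-log-maxage=30    # days to retain old audit log files
```

Both files also have to be reachable inside the container, which usually means mounting the corresponding host paths.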
Here is a sample audit policy file configured for logging changes in the pods:
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
- "RequestReceived"
rules:
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods"]
- level: Metadata
  resources:
  - group: ""
    resources: ["pods/log", "pods/status"]
You can always go back and check the audit logs when an issue occurs in the Kubernetes cluster. They help you trace what changed so that you can restore the correct state of the cluster.
#9. Apply Affinity Rules (Node/Pod)
Kubernetes has two mechanisms for controlling which nodes pods are placed on: pod affinity and node affinity. It is recommended to use these mechanisms for better performance.
Using node affinity, you can schedule pods on nodes based on defined criteria. Depending on the pod's requirements, a matching node in the Kubernetes cluster is selected and assigned.
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 2
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: ubuntu
    image: ubuntu
    imagePullPolicy: IfNotPresent
Using pod affinity, you can schedule multiple pods on the same node (to reduce latency), and using pod anti-affinity, you can keep pods on separate nodes (for high availability).
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-pod
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: ubuntu-pod
    image: ubuntu
After analyzing the workload of your cluster, you need to decide which affinity strategy to use.
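For the high-availability case, where replicas should land on separate nodes, the relevant field is `podAntiAffinity`. The manifest below is a minimal sketch: the pod name `web-pod` and the `app: web` label are assumptions, and the rule says that no two pods carrying that label may share a node.

```yaml
# Sketch: keep pods labeled app=web on different nodes.
# Pod name and label are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: kubernetes.io/hostname   # one matching pod per node
  containers:
  - name: web
    image: ubuntu
```

With a `required` rule, a pod stays Pending if no node satisfies it, so the `preferred` variant may be the better fit on small clusters.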
#10. Kubernetes Termination
Kubernetes terminates pods when they are no longer required. You can initiate termination through a command or an API call; the selected pods go into the Terminating state and no new traffic is sent to them. A SIGTERM signal is then sent to the containers in those pods, after which they should shut down.
Pods are terminated gracefully; by default, the grace period is 30 seconds. If the containers are still running when the grace period ends, Kubernetes sends a SIGKILL signal, which forcefully stops them. Finally, Kubernetes removes the pods from the API server on the control plane.
If your pods regularly take more than 30 seconds to shut down, you can increase the grace period to 45 or 60 seconds.
apiVersion: v1
kind: Pod
metadata:
  name: container10
spec:
  containers:
  - image: ubuntu
    name: container10
  terminationGracePeriodSeconds: 60
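If the application needs a moment to finish in-flight work before it receives SIGTERM, a `preStop` hook can be combined with the grace period. The following is a sketch under assumptions: the pod name `graceful-demo` is hypothetical, and the 10-second sleep stands in for whatever draining step the real application requires.

```yaml
# Sketch: a preStop hook runs before SIGTERM is sent to the container.
# The pod name and the sleep duration are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: graceful-demo
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    image: ubuntu
    lifecycle:
      preStop:
        exec:
          command: ["sh", "-c", "sleep 10"]   # placeholder for a drain step
```

The time spent in the hook counts against the grace period, so the period should cover both the hook and the application's own shutdown.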
I hope these best practices will help you in better container orchestration using Kubernetes. Go ahead and try implementing these in your Kubernetes cluster for better results.