Kubernetes consists of many loosely coupled components and APIs. Depending on your environment, you may run into problems that need extra attention before everything is up and running. Fortunately, Kubernetes provides many ways to surface problems.
In this section, we will learn how to get cluster information in order to troubleshoot potential issues.
How to do it…
Follow these steps to gather cluster information in order to troubleshoot potential issues:
- Create a file dump of the cluster state called cluster-state:
$ kubectl cluster-info dump --all-namespaces \
    --output-directory=./cluster-state
- Display the master and service addresses:
$ kubectl cluster-info
Kubernetes master is running at https://172.23.1.110:6443
Heapster is running at https://172.23.1.110:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://172.23.1.110:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
- Show the resource usage of the us-west-2.compute.internal node:
$ kubectl top node us-west-2.compute.internal
NAME                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
us-west-2.compute.internal   42m          2%     1690Mi          43%
- Mark the us-west-2.compute.internal node as unschedulable:
$ kubectl cordon us-west-2.compute.internal
- Safely evict all the pods from the us-west-2.compute.internal node for maintenance:
$ kubectl drain us-west-2.compute.internal
- Mark the us-west-2.compute.internal node as schedulable after maintenance:
$ kubectl uncordon us-west-2.compute.internal
How it works…
This recipe showed you how to quickly troubleshoot common Kubernetes cluster issues.
In step 1, when the kubectl cluster-info dump command was executed with the --output-directory parameter, kubectl wrote the cluster state to files under the specified directory. You can see the full list of files using the following command:
$ tree ./cluster-state
│ ├── daemonsets.json
│ ├── deployments.json
│ ├── events.json
│ ├── pods.json
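Once the dump exists, the events.json files are a natural first place to look. The following is a minimal sketch, assuming the dump was written to ./cluster-state and that each event is stored as JSON with a "type" field. The count_warnings helper name is hypothetical and not part of kubectl, and the grep-based match gives only a rough count:

```shell
#!/usr/bin/env bash
# Sketch: count Warning events in each events.json file of a cluster dump.
# count_warnings is a hypothetical helper name, not a kubectl feature.
count_warnings() {
  local dump_dir="${1:-./cluster-state}"
  local file count
  # Find every events.json in the dump, whichever namespace folder it is in.
  find "$dump_dir" -type f -name 'events.json' | sort | while IFS= read -r file; do
    # grep -o prints one line per match, so wc -l yields the number of matches.
    count=$(grep -Eo '"type": *"Warning"' "$file" | wc -l | tr -d ' ')
    printf '%s %s\n' "$count" "$file"
  done
}
```

Calling count_warnings ./cluster-state prints one line per file with the number of matches first, which helps you decide which namespace to inspect more closely.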
In step 4, we marked the node as unschedulable using the kubectl cordon command. The Kubernetes scheduler assigns pods to nodes that are available. If you know in advance that an instance in your cluster will be terminated or updated, you don't want new pods to be scheduled on that node. Cordoning patches the node with node.Spec.Unschedulable=true. While a node is unschedulable, no new pods are scheduled on it.
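To confirm that a cordon took effect, you can read the unschedulable field back from the node object. This is a small sketch using kubectl's jsonpath output. The is_cordoned helper name is an assumption made for illustration:

```shell
#!/usr/bin/env bash
# Sketch: report whether a node is currently cordoned.
# is_cordoned is a hypothetical helper; it reads the spec.unschedulable
# field that kubectl cordon sets on the node object.
is_cordoned() {
  local node="$1"
  local flag
  flag=$(kubectl get node "$node" -o jsonpath='{.spec.unschedulable}')
  # The output is empty for a node that is schedulable.
  [ "$flag" = "true" ]
}
```

For example, is_cordoned us-west-2.compute.internal && echo "cordoned" prints a message only for a cordoned node. The STATUS column of kubectl get nodes also shows SchedulingDisabled for such a node.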
In step 5, we used the kubectl drain command to evict the existing pods, because cordoning alone has no effect on pods that are already running on the node. The Eviction API takes disruption budgets into account. If set by the owner, a disruption budget limits the number of pods of a replicated application that can be down simultaneously due to voluntary disruptions. If disruption budgets aren't supported or set, the pods on the node are simply deleted after the grace period.
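The cordon and drain steps can be combined into one maintenance helper. The sketch below is one possible arrangement, not the recipe's own script. The drain_for_maintenance name and the 300-second timeout are assumptions, and --ignore-daemonsets is added because drain otherwise refuses to proceed when DaemonSet-managed pods are present:

```shell
#!/usr/bin/env bash
# Sketch: prepare a node for maintenance.
# drain_for_maintenance is a hypothetical wrapper around the recipe's commands.
drain_for_maintenance() {
  local node="$1"
  # List disruption budgets first so you can see which evictions may be held back.
  kubectl get poddisruptionbudgets --all-namespaces || return 1
  # Stop new pods from being scheduled on the node.
  kubectl cordon "$node" || return 1
  # Evict the running pods. --ignore-daemonsets skips DaemonSet-managed pods,
  # and --timeout keeps drain from waiting indefinitely on a blocked eviction.
  kubectl drain "$node" --ignore-daemonsets --timeout=300s || return 1
}
```

The explicit cordon mirrors step 4, although kubectl drain also marks the node as unschedulable before evicting. After the maintenance work is done, kubectl uncordon makes the node schedulable again, as in step 6.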
It is also useful to have knowledge of the following information:
- Setting log levels
Setting log levels
When using the kubectl command, you can set the output verbosity with the --v flag, followed by an integer for the log level, which is a number between 0 and 9. The general Kubernetes logging conventions and the associated log levels are described in the Kubernetes documentation at https://kubernetes.io/docs/reference/kubectl/cheatsheet/#kubectl-output-verbosity-and-debugging.
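As a quick illustration of the verbosity flag, the sketch below wraps kubectl and checks that the level is within the 0 to 9 range before passing it on. The kubectl_verbose helper name is hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: run any kubectl command at a chosen log level.
# kubectl_verbose is a hypothetical helper that validates the level first.
kubectl_verbose() {
  local level="$1"
  shift
  case "$level" in
    [0-9]) ;;  # a single digit, 0 through 9
    *) echo "log level must be between 0 and 9" >&2; return 2 ;;
  esac
  kubectl "$@" "--v=$level"
}
```

For example, kubectl_verbose 6 get pods runs kubectl get pods --v=6. Higher levels print progressively more detail about the requests kubectl sends to the API server.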
You can also get the output in a specific format by adding one of the following parameters to your command:
- -o=wide is used to get additional information on a resource. An example is as follows:
$ kubectl get nodes -owide
NAME                                           STATUS   ROLES    AGE   VERSION              INTERNAL-IP      EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-168-41-120.us-west-2.compute.internal   Ready    <none>   84m   v1.13.8-eks-cd3eb0   192.168.41.120   22.214.171.124   Amazon Linux 2   4.14.133-113.112.amzn2.x86_64   docker://18.6.1
ip-192-168-6-128.us-west-2.compute.internal    Ready    <none>   84m   v1.13.8-eks-cd3eb0   192.168.6.128    126.96.36.199    Amazon Linux 2   4.14.133-113.112.amzn2.x86_64   docker://18.6.1
- -o=yaml is used to return the output in YAML format. An example is as follows:
$ kubectl get pod nginx-deployment-5c689d88bb-qtvsx -oyaml
kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu request for container
As you can see, the output of the -o=yaml parameter can be used to create a manifest file out of an existing resource as well.
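Saving that YAML output to a file can be sketched as follows. The export_manifest helper name and its argument order are assumptions made for this example:

```shell
#!/usr/bin/env bash
# Sketch: save an existing resource as a YAML manifest file.
# export_manifest is a hypothetical helper: kind, name, and output file.
export_manifest() {
  local kind="$1" name="$2" out_file="$3"
  local tmp
  # Write to a temporary file first so a failed kubectl call
  # does not leave an empty manifest behind.
  tmp=$(mktemp) || return 1
  if kubectl get "$kind" "$name" -o yaml > "$tmp"; then
    mv "$tmp" "$out_file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, export_manifest pod nginx-deployment-5c689d88bb-qtvsx pod.yaml writes the pod definition to pod.yaml. The exported file also contains the status section and server-generated metadata such as resourceVersion and uid, which you will usually want to remove before reusing the manifest.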
See also
- An overview and detailed uses of the kubectl command: https://kubernetes.io/docs/reference/kubectl/overview/
- kubectl cheat sheet: https://kubernetes.io/docs/reference/kubectl/cheatsheet/
- A visual guide on troubleshooting Kubernetes deployments: https://learnk8s.io/a/troubleshooting-kubernetes.pdf
- K9s – the Kubernetes CLI to manage your clusters in style: https://github.com/derailed/k9s