Book Image

The DevOps 2.5 Toolkit

By : Viktor Farcic
Book Image

The DevOps 2.5 Toolkit

By: Viktor Farcic

Overview of this book

Building on The DevOps 2.3 Toolkit: Kubernetes, and The DevOps 2.4 Toolkit: Continuous Deployment to Kubernetes, Viktor Farcic brings his latest exploration of the Docker technology as he records his journey to monitoring, logging, and autoscaling Kubernetes. The DevOps 2.5 Toolkit: Monitoring, Logging, and Auto-Scaling Kubernetes: Making Resilient, Self-Adaptive, And Autonomous Kubernetes Clusters is the latest book in Viktor Farcic’s series that helps you build a full DevOps Toolkit. This book helps readers develop the necessary skillsets needed to be able to operate Kubernetes clusters, with a focus on metrics gathering and alerting with the goal of making clusters and applications inside them autonomous through self-healing and self-adaptation. Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Kubernetes.
Table of Contents (9 chapters)
8
What Did We Do?

Alerting on saturation-related issues

Saturation measures fullness of our services and the system. We should be aware if replicas of our services are processing too many requests and being forced to queue some of them. We should also monitor whether usage of our CPUs, memory, disks, and other resources reaches critical limits.

For now, we'll focus on CPU usage. We'll start by opening the Prometheus' graph screen.

 1  open "http://$PROM_ADDR/graph"

Let's see if we can get the rate of used CPU by node (instance). We can use node_cpu_seconds_total metric for that. However, it is split into different modes, and we'll have to exclude a few of them to get the "real" CPU usage. Those will be idle, iowait, and any type of guest cycles.

Please type the expression that follows, and press the Execute button.

 1  sum(rate(
 2    node_cpu_seconds_total...