The DevOps 2.2 Toolkit

By: Viktor Farcic

Overview of this book

Building on The DevOps 2.0 Toolkit and The DevOps 2.1 Toolkit: Docker Swarm, Viktor Farcic continues his exploration of Docker, this time recording his journey into two new topics: self-adaptive and self-healing systems. The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters is the latest book in his series that helps you build a full DevOps toolkit. It looks at Docker, the tool designed to make it easier to create and run applications using containers. Viktor combines theory with a hands-on approach to guide you through the process of creating self-adaptive and self-healing systems. He covers a wide range of emerging topics, including what exactly self-adaptive and self-healing systems are, how to choose a solution for metrics storage and querying, how to create cluster-wide alerts, and what a successful self-sufficient system blueprint looks like. Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Docker.
Table of Contents (18 chapters)

Preventing the scaling disaster

At first glance, the script we created works correctly. Doesn't it? I've seen similar scripts in other places, and there is only one thing I have to say. Do not run this pipeline in production! It is too dangerous. It can easily crash your entire cluster or make your service disappear. Can you guess why?

Let us imagine the following situation. Prometheus detects that a certain threshold is reached (for example, memory utilization, response time, and so on) and sends a notification to Alertmanager. Alertmanager sends a build request to Jenkins which, in turn, scales the service by increasing the number of replicas by one. So far, so good.
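The chain described above can be modeled in a few lines of Python. This is only a sketch of the logic, not the pipeline itself: the `Service` class, the function names, the service name, and the numbers are all hypothetical stand-ins for Prometheus, Alertmanager, and the Jenkins job.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Hypothetical model of a service and its replica count."""
    name: str
    replicas: int


def threshold_reached(metric_value: float, threshold: float) -> bool:
    """Stand-in for an alert rule: fire once the metric reaches the threshold."""
    return metric_value >= threshold


def scale_up(service: Service) -> Service:
    """Stand-in for the Jenkins job: add exactly one replica."""
    return Service(service.name, service.replicas + 1)


def handle_metric(service: Service, metric_value: float, threshold: float) -> Service:
    """One pass through the chain: alert fires -> build request -> scale by one."""
    if threshold_reached(metric_value, threshold):
        return scale_up(service)
    return service


# Illustrative values only: 92% memory utilization against an 80% threshold.
service = Service("my-service", 3)
service = handle_metric(service, metric_value=0.92, threshold=0.8)
print(service.replicas)  # 4
```

A single alert adds a single replica, and a metric below the threshold leaves the service untouched. Note that `handle_metric` looks only at the current metric value and knows nothing about how many times it has already scaled.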

What happens if scaling does not resolve the problem? What if the condition that tripped the threshold in Prometheus persists? After a while, the process will be repeated, and the service will be scaled up one more time. That might...
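To see how the repetition plays out, here is a minimal simulation, assuming the alert keeps firing on every cycle because scaling never brings the metric back under the threshold. The function name and the numbers are hypothetical.

```python
from typing import List


def simulate(initial_replicas: int, alert_cycles: int) -> List[int]:
    """Return the replica count after each cycle in which the alert still fires."""
    history = []
    replicas = initial_replicas
    for _ in range(alert_cycles):
        # The threshold is still reached, so the notification is sent again
        # and the job adds one more replica. Nothing in this loop says "stop".
        replicas += 1
        history.append(replicas)
    return history


print(simulate(initial_replicas=3, alert_cycles=5))  # [4, 5, 6, 7, 8]
```

The replica count in this model is bounded only by the number of times the alert repeats.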