Mastering Kubernetes - Second Edition

By: Gigi Sayfan

Overview of this book

Kubernetes is an open source system that is used to automate the deployment, scaling, and management of containerized applications. If you are running more containers or want automated management of your containers, you need Kubernetes at your disposal. To put things into perspective, Mastering Kubernetes walks you through the advanced management of Kubernetes clusters. To start with, you will learn the fundamentals of both Kubernetes architecture and Kubernetes design in detail. You will discover how to run complex stateful microservices on Kubernetes, including advanced features such as horizontal pod autoscaling, rolling updates, resource quotas, and persistent storage backends. Using real-world use cases, you will explore the options for network configuration, and understand how to set up, operate, and troubleshoot various Kubernetes networking plugins. In addition to this, you will get to grips with custom resource development and utilization in automation and maintenance workflows. To scale up your knowledge of Kubernetes, you will encounter some additional concepts based on the Kubernetes 1.10 release, such as Prometheus, role-based access control, API aggregation, and more. By the end of this book, you’ll know everything you need to graduate from an intermediate to an advanced level of understanding of Kubernetes.
Table of Contents (16 chapters)

Kubernetes components

A Kubernetes cluster has several master components that are used to control the cluster, as well as node components that run on each cluster node. Let's get to know all these components and how they work together.

Master components

The master components typically run on one node, but in a highly available or very large cluster, they may be spread across multiple nodes.

API server

The Kube API server exposes the Kubernetes REST API. It can easily scale horizontally as it is stateless and stores all the data in the etcd cluster. The API server is the embodiment of the Kubernetes control plane.


Etcd

Etcd is a highly reliable, distributed data store. Kubernetes uses it to store the entire cluster state. In a small, transient cluster, a single instance of etcd can run on the same node as all the other master components, but for more substantial clusters, it is typical to have a three-node or even five-node etcd cluster for redundancy and high availability.
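The preference for three or five members follows from majority quorum: a cluster of n members needs n/2 + 1 of them (rounded down) to agree before it can make progress. Here is a minimal sketch of that arithmetic; the function names are ours and are not part of etcd:

```go
package main

import "fmt"

// quorum returns how many members must agree in a majority-based
// cluster of n members.
func quorum(n int) int { return n/2 + 1 }

// tolerated returns how many members can fail while a majority remains.
func tolerated(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{1, 3, 4, 5} {
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

Note that four members tolerate no more failures than three, which is one reason odd cluster sizes are the usual choice.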

Kube controller manager

The Kube controller manager is a collection of various managers rolled up into one binary. It contains the replication controller, the pod controller, the services controller, the endpoints controller, and others. All these managers watch over the state of the cluster through the API and their job is to steer the cluster into the desired state.
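To make the idea of steering toward a desired state concrete, here is a deliberately tiny sketch of a control loop. The state type and reconcile function are hypothetical and only illustrate the pattern; a real controller watches the API server and acts on real objects instead of updating a counter:

```go
package main

import "fmt"

// state is a hypothetical, stripped-down view of one workload.
type state struct {
	desired int // replicas requested by the user
	actual  int // replicas currently running
}

// reconcile performs one comparison of observed and desired state and
// returns the change needed to converge: positive means "start this
// many pods", negative means "stop this many pods".
func reconcile(s state) int {
	return s.desired - s.actual
}

func main() {
	s := state{desired: 3, actual: 1}
	// Keep correcting the state, one pod at a time, until nothing is left to do.
	for step := 1; reconcile(s) != 0; step++ {
		if reconcile(s) > 0 {
			s.actual++
		} else {
			s.actual--
		}
		fmt.Printf("step %d: actual=%d desired=%d\n", step, s.actual, s.desired)
	}
}
```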

Cloud controller manager

When running in the cloud, Kubernetes allows cloud providers to integrate their platform for the purpose of managing nodes, routes, services, and volumes. The cloud provider code interacts with the Kubernetes code. It replaces some of the functionality of the Kube controller manager. When running Kubernetes with a cloud controller manager, you must set the Kube controller manager flag --cloud-provider to external. This will disable the control loops that the cloud controller manager is taking over. The cloud controller manager was introduced in Kubernetes 1.6 and it is being used by multiple cloud providers already.

A quick note about Go to help you parse the code: The method name comes first, followed by the method's parameters in parentheses. Each parameter is a pair, consisting of a name followed by its type. Finally, the return values are specified. Go allows multiple return values. It is very common to return an error object in addition to the actual result. If everything is OK, the error object will be nil.
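As a small illustration of that convention, consider the following function. It is made up for this note and is not part of Kubernetes:

```go
package main

import (
	"errors"
	"fmt"
)

// divide follows the shape described above: the name, then the
// parameters as name/type pairs, then the return values (a result
// and an error).
func divide(numerator int, denominator int) (int, error) {
	if denominator == 0 {
		return 0, errors.New("division by zero")
	}
	return numerator / denominator, nil
}

func main() {
	result, err := divide(10, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```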

Here is the main interface of the cloudprovider package:

package cloudprovider

import (
    // Other imports omitted for brevity.
    "k8s.io/kubernetes/pkg/controller"
)

// Interface is an abstract, pluggable interface for cloud providers.
// Where a method returns a bool, it reports whether the provider
// supports that capability.
type Interface interface {
    Initialize(clientBuilder controller.ControllerClientBuilder)
    LoadBalancer() (LoadBalancer, bool)
    Instances() (Instances, bool)
    Zones() (Zones, bool)
    Clusters() (Clusters, bool)
    Routes() (Routes, bool)
    ProviderName() string
    HasClusterID() bool
}

Most of the methods return other interfaces with their own methods. For example, here is the LoadBalancer interface:

type LoadBalancer interface {
    GetLoadBalancer(clusterName string,
                    service *v1.Service) (status *v1.LoadBalancerStatus,
                                          exists bool,
                                          err error)
    EnsureLoadBalancer(clusterName string,
                       service *v1.Service,
                       nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
    UpdateLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) error
    EnsureLoadBalancerDeleted(clusterName string, service *v1.Service) error
}
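Code that consumes a cloud provider is expected to check the boolean before using the returned interface, since not every provider supports every capability. The following sketch shows that calling pattern. To keep it self-contained, Zones, Provider, and fakeCloud are simplified stand-ins invented for illustration; they are not the real cloudprovider types:

```go
package main

import "fmt"

// Zones and Provider are simplified stand-ins, not the real interfaces.
type Zones interface {
	GetZone() (string, error)
}

type Provider interface {
	Zones() (Zones, bool)
	ProviderName() string
}

// fakeCloud is a hypothetical provider used only for this example.
type fakeCloud struct{ zone string }

func (f fakeCloud) ProviderName() string { return "fake" }

// Zones reports support only when a zone has been configured.
func (f fakeCloud) Zones() (Zones, bool) {
	if f.zone == "" {
		return nil, false
	}
	return f, true
}

func (f fakeCloud) GetZone() (string, error) { return f.zone, nil }

// describeZone checks the capability flag first, then the error.
func describeZone(p Provider) string {
	zones, ok := p.Zones()
	if !ok {
		return p.ProviderName() + ": zones not supported"
	}
	zone, err := zones.GetZone()
	if err != nil {
		return p.ProviderName() + ": " + err.Error()
	}
	return p.ProviderName() + ": zone " + zone
}

func main() {
	fmt.Println(describeZone(fakeCloud{}))
	fmt.Println(describeZone(fakeCloud{zone: "zone-a"}))
}
```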


Kube-scheduler

kube-scheduler is responsible for scheduling pods onto nodes. This is a very complicated task, as it requires considering multiple interacting factors, such as the following:

  • Resource requirements
  • Service requirements
  • Hardware/software policy constraints
  • Node affinity and antiaffinity specifications
  • Pod affinity and antiaffinity specifications
  • Taints and tolerations
  • Data locality
  • Deadlines

If you need some special scheduling logic not covered by the default Kube scheduler, you can replace it with your own custom scheduler. You can also run your custom scheduler side by side with the default scheduler and have your custom scheduler schedule only a subset of the pods.
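To give a feel for how such factors combine, here is a toy sketch that first filters out nodes that cannot run a pod (not enough free CPU, or a taint the pod does not tolerate) and then prefers the node with the most free CPU. All types and rules here are simplifications invented for illustration; this is not the algorithm the Kube scheduler actually uses:

```go
package main

import "fmt"

// node and pod are hypothetical, heavily simplified models.
type node struct {
	name    string
	freeCPU int    // free CPU in millicores
	taint   string // empty means the node is untainted
}

type pod struct {
	cpuRequest int    // requested CPU in millicores
	tolerates  string // the one taint this pod tolerates, if any
}

// feasible reports whether the pod may run on the node at all.
func feasible(p pod, n node) bool {
	if n.freeCPU < p.cpuRequest {
		return false
	}
	return n.taint == "" || n.taint == p.tolerates
}

// pick drops infeasible nodes and then chooses the feasible node with
// the most free CPU. The bool is false when no node fits.
func pick(p pod, nodes []node) (string, bool) {
	best, found := node{}, false
	for _, n := range nodes {
		if !feasible(p, n) {
			continue
		}
		if !found || n.freeCPU > best.freeCPU {
			best, found = n, true
		}
	}
	return best.name, found
}

func main() {
	nodes := []node{
		{name: "node-a", freeCPU: 500},
		{name: "node-b", freeCPU: 2000, taint: "gpu"},
		{name: "node-c", freeCPU: 1000},
	}
	name, ok := pick(pod{cpuRequest: 750}, nodes)
	fmt.Println(name, ok)
}
```

With these inputs, node-a is too small and node-b is tainted, so the pod lands on node-c. A pod that tolerates the gpu taint would be placed on node-b instead.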


DNS

Since Kubernetes 1.3, a DNS service has been part of the standard Kubernetes cluster. It is scheduled as a regular pod. Every service (except headless services) receives a DNS name. Pods can receive a DNS name too. This is very useful for automatic discovery.
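Service DNS names follow the pattern service.namespace.svc.cluster-domain. The short sketch below builds such a name; the helper function is ours, and the example assumes the commonly used default cluster domain, cluster.local, which your cluster may override:

```go
package main

import "fmt"

// serviceFQDN builds the fully qualified DNS name of a service.
// It is a helper written for this example, not a Kubernetes API.
func serviceFQDN(service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, clusterDomain)
}

func main() {
	// "cluster.local" is assumed here; check your cluster's configuration.
	fmt.Println(serviceFQDN("redis-master", "default", "cluster.local"))
}
```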

Node components

Each node in the cluster needs a couple of components to interact with the master components, receive workloads to execute, and report its status back to the cluster.


Proxy

The Kube proxy does low-level network housekeeping on each node. It reflects the Kubernetes services locally and can do TCP and UDP forwarding. It finds cluster IPs through environment variables or DNS.
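The environment-variable route relies on a naming convention: for each service, variables named after the service (upper-cased, with dashes turned into underscores) and ending in _SERVICE_HOST and _SERVICE_PORT hold its cluster IP and port. Here is a minimal sketch of a lookup based on that convention. The helper functions are ours, and the values set in main merely simulate what a container would see inside a cluster:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// serviceEnvPrefix converts a service name into the prefix used for
// its environment variables: upper case, dashes become underscores.
func serviceEnvPrefix(service string) string {
	return strings.ToUpper(strings.ReplaceAll(service, "-", "_"))
}

// lookupService returns "host:port" for a service, or false when the
// variables are not present in the environment.
func lookupService(service string) (string, bool) {
	prefix := serviceEnvPrefix(service)
	host, okHost := os.LookupEnv(prefix + "_SERVICE_HOST")
	port, okPort := os.LookupEnv(prefix + "_SERVICE_PORT")
	if !okHost || !okPort {
		return "", false
	}
	return host + ":" + port, true
}

func main() {
	// Simulated values; inside a pod these would already be set.
	os.Setenv("REDIS_MASTER_SERVICE_HOST", "10.0.0.11")
	os.Setenv("REDIS_MASTER_SERVICE_PORT", "6379")

	addr, ok := lookupService("redis-master")
	fmt.Println(addr, ok)
}
```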


Kubelet

The kubelet is the Kubernetes representative on the node. It is in charge of communicating with the master components and managing the running pods. This includes the following actions:

  • Downloading pod secrets from the API server
  • Mounting volumes
  • Running the pod's containers (through the CRI or rkt)
  • Reporting the status of the node and each pod
  • Running container liveness probes
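To illustrate the last item, here is a rough sketch of what an HTTP liveness check boils down to: send a GET request and treat any status code from 200 up to, but not including, 400 as success. The functions and the /healthz path are illustrative assumptions, and a local test server stands in for the container; this is not the kubelet's own code, and the kubelet also offers other kinds of probes:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// healthy applies the success rule for HTTP probes: 200 <= status < 400.
func healthy(status int) bool { return status >= 200 && status < 400 }

// probe issues a single GET and reports whether the endpoint looks alive.
// Connection errors and timeouts count as failures.
func probe(client *http.Client, url string) bool {
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return healthy(resp.StatusCode)
}

func main() {
	// Stand-in for a container endpoint; /healthz is a hypothetical path.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/healthz" {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusInternalServerError)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: time.Second}
	fmt.Println(probe(client, srv.URL+"/healthz"))
	fmt.Println(probe(client, srv.URL+"/broken"))
}
```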

In this section, we dug into the guts of Kubernetes and explored, from a very high-level perspective, its architecture and the design patterns it supports, by way of its APIs and the components used to control and manage the cluster. In the next section, we will take a quick look at the various runtimes that Kubernetes supports.