Kubernetes components

A Kubernetes cluster has several master components that are used to control the cluster, as well as node components that run on each cluster node. Let's get to know all these components and how they work together.

Master components

The master components typically run on one node, but in a highly available or very large cluster, they may be spread across multiple nodes.

API server

The Kube API server exposes the Kubernetes REST API. It can easily scale horizontally as it is stateless and stores all the data in the etcd cluster. The API server is the embodiment of the Kubernetes control plane.

Etcd

Etcd is a highly reliable, distributed data store. Kubernetes uses it to store the entire cluster state. In a small, transient cluster, a single instance of etcd can run on the same node as all the other master components, but for more substantial clusters, it is typical to have a three-node or even five-node etcd cluster for redundancy and high availability.

Kube controller manager

The Kube controller manager is a collection of various managers rolled up into one binary. It contains the replication controller, the pod controller, the services controller, the endpoints controller, and others. All these managers watch over the state of the cluster through the API and their job is to steer the cluster into the desired state.

Cloud controller manager

When running in the cloud, Kubernetes allows cloud providers to integrate their platform for the purpose of managing nodes, routes, services, and volumes. The cloud provider code interacts with the Kubernetes code. It replaces some of the functionality of the Kube controller manager. When running Kubernetes with a cloud controller manager, you must set the Kube controller manager flag --cloud-provider to external. This will disable the control loops that the cloud controller manager is taking over. The cloud controller manager was introduced in Kubernetes 1.6 and it is being used by multiple cloud providers already.

A quick note about Go to help you parse the code: The method name comes first, followed by the method's parameters in parentheses. Each parameter is a pair, consisting of a name followed by its type. Finally, the return values are specified. Go allows multiple return types. It is very common to return an error object in addition to the actual result. If everything is OK, the error object will be nil.

Here is the main interface of the cloudprovider package:

package cloudprovider 
  
import ( 
    "errors" 
    "fmt" 
    "strings" 
  
    "k8s.io/api/core/v1" 
    "k8s.io/apimachinery/pkg/types" 
    "k8s.io/client-go/informers" 
    "k8s.io/kubernetes/pkg/controller" 
) 
  
// Interface is an abstract, pluggable interface for cloud providers. 
type Interface interface { 
    Initialize(clientBuilder controller.ControllerClientBuilder) 
    LoadBalancer() (LoadBalancer, bool) 
    Instances() (Instances, bool) 
    Zones() (Zones, bool) 
    Clusters() (Clusters, bool) 
    Routes() (Routes, bool) 
    ProviderName() string 
    HasClusterID() bool 
}

Most of the methods return other interfaces with their own method. For example, here is the LoadBalancer interface:

type LoadBalancer interface {
    GetLoadBalancer(clusterName string, 
                                 service *v1.Service) (status *v1.LoadBalancerStatus, 
                                                                   exists bool, 
                                                                   err error)
    EnsureLoadBalancer(clusterName string, 
                                       service *v1.Service, 
                                       nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
    UpdateLoadBalancer(clusterName string, service *v1.Service, nodes []*v1.Node) error
    EnsureLoadBalancerDeleted(clusterName string, service *v1.Service) error
}

Kube-scheduler

kube-scheduler is responsible for scheduling pods into nodes. This is a very complicated task as it requires considering multiple interacting factors, such as the following:

Resource requirements
Service requirements
Hardware/software policy constraints
Node affinity and antiaffinity specifications
Pod affinity and antiaffinity specifications
Taints and tolerations
Data locality
Deadlines

If you need some special scheduling logic not covered by the default Kube scheduler, you can replace it with your own custom scheduler. You can also run your custom scheduler side by side with the default scheduler and have your custom scheduler schedule only a subset of the pods.

DNS

Since Kubernetes 1.3, a DNS service has been part of the standard Kubernetes cluster. It is scheduled as a regular pod. Every service (except headless services) receives a DNS name. Pods can receive a DNS name too. This is very useful for automatic discovery.

Node components

Nodes in the cluster need a couple of components to interact with the cluster master components and to receive workloads to execute and update the cluster on their status.

Proxy

The Kube proxy does low-level, network housekeeping on each node. It reflects the Kubernetes services locally and can do TCP and UDP forwarding. It finds cluster IPs through environment variables or DNS.

Kubelet

The kubelet is the Kubernetes representative on the node. It oversees communicating with the master components and manages the running pods. This includes the following actions:

Downloading pod secrets from the API server
Mounting volumes
Running the pod's container (through the CRI or rkt)
Reporting the status of the node and each pod
Running container liveness probes

In this section, we dug into the guts of Kubernetes, explored its architecture (from a very high-level perspective), and supported design patterns, through its APIs and the components used to control and manage the cluster. In the next section, we will take a quick look at the various runtimes that Kubernetes supports.

Mastering Kubernetes - Second Edition

By : Gigi Sayfan

Mastering Kubernetes

By: Gigi Sayfan

Overview of this book