Book Image

Hands-On Microservices with Kubernetes

By : Gigi Sayfan
Book Image

Hands-On Microservices with Kubernetes

By: Gigi Sayfan

Overview of this book

Kubernetes is among the most popular open source platforms for automating the deployment, scaling, and operations of application containers across clusters of hosts, providing a container-centric infrastructure. Hands-On Microservices with Kubernetes starts by providing you with in-depth insights into the synergy between Kubernetes and microservices. You will learn how to use Delinkcious, which will serve as a live lab throughout the book to help you understand microservices and Kubernetes concepts in the context of a real-world application. Next, you will get up to speed with setting up a CI/CD pipeline and configuring microservices using Kubernetes ConfigMaps. As you cover later chapters, you will gain hands-on experience in securing microservices and implementing REST, gRPC APIs, and a Delinkcious data store. In addition to this, you’ll explore the Nuclio project, run a serverless task on Kubernetes, and manage and implement data-intensive tests. Toward the concluding chapters, you’ll deploy microservices on Kubernetes and learn to maintain a well-monitored system. Finally, you’ll discover the importance of service meshes and how to incorporate Istio into the Delinkcious cluster. By the end of this book, you’ll have gained the skills you need to implement microservices on Kubernetes with the help of effective tools and best practices.
Table of Contents (16 chapters)

Kubernetes and microservices – a perfect match

Kubernetes is a fantastic platform with amazing capabilities and a wonderful ecosystem. How does it help you with your system? As you'll see, there is a very good alignment between Kubernetes and microservices. The building blocks of Kubernetes, such as namespaces, pods, deployments, and services, map directly to important microservices concepts and an agile software development life cycle (SDLC). Let's dive in.

Packaging and deploying microservices

When you employ a microservice-based architecture, you'll have lots of microservices. Those microservices, in general, may be developed independently, and deployed independently. The packaging mechanism is simply containers. Every microservice you develop will have a Dockerfile. The resulting image represents the deployment unit for that microservice. In Kubernetes, your microservice image will run inside a pod (possibly alongside other containers). But an isolated pod, running on a node, is not very resilient. The kubelet on the node will restart the pod's container if it crashes, but if something happens to the node itself, the pod is gone. Kubernetes has abstractions and resources that build on the pod.

ReplicaSets are sets of pods with a certain number of replicas. When you create a ReplicaSet, Kubernetes will make sure that the correct number of pods you specify always run in the cluster. The deployment resource takes it a step further and provides an abstraction that exactly aligns with the way you consider and think about microservices. When you have a new version of a microservice ready, you will want to deploy it. Here is a Kubernetes deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.15.4
ports:
- containerPort: 80

The file can be found at https://github.com/the-gigi/hands-on-microservices-with-kubernetes-code/blob/master/ch1/nginx-deployment.yaml.

This is a YAML file (https://yaml.org/) that has some fields that are common to all Kubernetes resources, and some fields that are specific to deployments. Let's break this down piece by piece. Almost everything you learn here will apply to other resources:

  • The apiVersion field marks the Kubernetes resources version. A specific version of the Kubernetes API server (for example, V1.13.0) can work with different versions of different resources. Resource versions have two parts: an API group (in this case, apps) and a version number (v1). The version number may include alpha or beta designations:
apiVersion: apps/v1
  • The kind field specifies what resource or API object we are dealing with. You will meet many kinds of resources in this chapter and later:
kind: Deployment
  • The metadata section contains the name of the resource (nginx) and a set of labels, which are just key-value string pairs. The name is used to refer to this particular resource. The labels allow for operating on a set of resources that share the same label. Labels are very useful and flexible. In this case, there is just one label (app: nginx):
metadata:
name: nginx
labels:
app: nginx

  • Next, we have a spec field. This is a ReplicaSet spec. You could create a ReplicaSet directly, but it would be static. The whole purpose of deployments is to manage its set of replicas. What's in a ReplicaSet spec? Obviously, it contains the number of replicas (3). It has a selector with a set of matchLabels (also app: nginx), and it has a pod template. The ReplicaSet will manage pods that have labels that match matchLabels:
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
...
  • Let's have a look at the pod template. The template has two parts: metadata and a spec. The metadata is where you specify the labels. The spec describes the containers in the pod. There may be one or more containers in a pod. In this case, there is just one container. The key field for a container is the image (often a Docker image), where you packaged your microservice. That's the code we want to run. There is also a name (nginx) and a set of ports:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.15.4
ports:
- containerPort: 80

There are more fields that are optional. If you want to dive in deeper, check out the API reference for the deployment resource at https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.13/#deployment-v1-apps.

Exposing and discovering microservices

We deployed our microservice with a deployment. Now, we need to expose it, so that it can be used by other services in the cluster and possibly also make it visible outside the cluster. Kubernetes provides the Service resource for that purpose. Kubernetes services are backed up by pods, identified by labels:

apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
protocol: TCP
selector:
app: nginx

Services discover each other inside the cluster, using DNS or environment variables. This is the default behavior. But, if you want to make a service accessible to the world, you will normally set an ingress object or a load balancer. We will explore this topic in detail later.

Securing microservices

Kubernetes was designed for running large-scale critical systems, where security is of paramount concern. Microservices are often more challenging to secure than monolithic systems because there is so much internal communication across many boundaries. Also, microservices encourage agile development, which leads to a constantly changing system. There is no steady state you can secure once and be done with it. You must constantly adapt the security of the system to the changes. Kubernetes comes pre-packed with several concepts and mechanisms for secure development, deployment, and operation of your microservices. You still need to employ best practices, such as principle of least privilege, security in depth, and minimizing blast radius. Here are some of the security features of Kubernetes.

Namespaces

Namespaces let you isolate different parts of your cluster from each other. You can create as many namespaces as you want and scope many resources and operations to their namespace, including limits, and quotas. Pods running in a namespace can only access directly their own namespace. To access other namespaces, they must go through public APIs.

Service accounts

Service accounts provide identity to your microservices. Each service account will have certain privileges and access rights associated with its account. Service accounts are pretty simple:

apiVersion: v1
kind: ServiceAccount
metadata:
name: custom-service-account

You can associate service accounts with a pod (for example, in the pod spec of a deployment) and the microservices that run inside the pod will have that identity and all the privileges and restrictions associated with that account. If you don't assign a service account, then the pod will get the default service account of its namespace. Each service account is associated with a secret used to authenticate it.

Secrets

Kubernetes provides secret management capabilities to all microservices. The secrets can be encrypted at rest on etcd (since Kubernetes 1.7), and are always encrypted on the wire (over HTTPS). Secrets are managed per namespace. Secrets are mounted in pods as either files (secret volumes) or environment variables. There are multiple ways to create secrets. Secrets can contain two maps: data and stringData. The type of values in the data map can be arbitrary, but must be base64-encoded. Refer to the following, for example:

apiVersion: v1
kind: Secret
metadata:
name: custom-secret
type: Opaque
data:
username: YWRtaW4=
password: MWYyZDFlMmU2N2Rm

Here is how a pod can load secrets as a volume:

apiVersion: v1
kind: Pod
metadata:
name: db
spec:
containers:
- name: mypod
image: postgres
volumeMounts:
- name: db_creds
mountPath: "/etc/db_creds"
readOnly: true
volumes:
- name: foo
secret:
secretName: custom-secret

The end result is that the DB credentials secrets that are managed outside the pod by Kubernetes show up as a regular file inside the pod accessible through the path /etc/db_creds.

Secure communication

Kubernetes utilizes client-side certificates to fully authenticate both sides of any external communication (for example, kubectl). All communication to the Kubernetes API from outside should be over HTTP. Internal cluster communication between the API server and the kubelet on the node is over HTTPS too (the kubelet endpoint). But, it doesn't use a client certificate by default (you can enable it).

Communication between the API server and nodes, pods, and services is, by default, over HTTP and is not authenticated. You can upgrade them to HTTPS, but note that the client certificate is checked, so don't run your worker nodes on public networks.

Network policies

In a distributed system, beyond securing each container, pod, and node, it is critical to also control communication over the network. Kubernetes supports network policies, which give you full flexibility to define and shape the traffic and access across the cluster.

Authenticating and authorizing microservices

Authentication and authorization are also related to security, by limiting access to trusted users and to limited aspects of Kubernetes. Organizations have a variety of ways to authenticate their users. Kubernetes supports many of the common authentication schemes, such as X.509 certificates, and HTTP basic authentication (not very secure), as well as an external authentication server via webhook that gives you ultimate control over the authentication process. The authentication process just matches the credentials of a request with an identity (either the original or an impersonated user). What that user is allowed to do is controlled by the authorization process. Enter RBAC.

Role-based access control

Role-based access control (RBAC) is not required! You can perform authorization using other mechanisms in Kubernetes. However, it is a best practice. RBAC is based on two concepts: role and binding. A role is a set of permissions on resources defined as rules. There are two types of roles: Role, which applies to a single namespace, and ClusterRole, which applies to all namespaces in a cluster.

Here is a role in the default namespace that allows the getting, watching, and listing of all pods. Each role has three components: API groups, resources, and verbs:

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]

Cluster roles are very similar, except there is no namespace field because they apply to all namespaces.

A binding is associating a list of subjects (users, user groups, or service accounts) with a role. There are two types of binding, RoleBinding and ClusterRoleBinding, which correspond to Role and ClusterRole.

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: pod-reader
namespace: default
subjects:
- kind: User
name: gigi # Name is case sensitive
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: Role # must be Role or ClusterRole
name: pod-reader # must match the name of the Role or ClusterRole you bind to
apiGroup: rbac.authorization.k8s.io

It's interesting that you can bind a ClusterRole to a subject in a single namespace. This is convenient for defining roles that should be used in multiple namespaces, once as a cluster role, and then binding them to specific subjects in specific namespaces.

The cluster role binding is similar, but must bind a cluster role and always applies to the whole cluster.

Note that RBAC is used to grant access to Kubernetes resources. It can regulate access to your service endpoints, but you may still need fine-grained authorization in your microservices.

Upgrading microservices

Deploying and securing microservices is just the beginning. As you develop and evolve your system, you'll need to upgrade your microservices. There are many important considerations regarding how to go about it that we will discuss later (versioning, rolling updates, blue-green, and canary). Kubernetes provides direct support for many of these concepts out of the box and the ecosystem built on top of it to provide many flavors and opinionated solutions.

The goal is often zero downtime and safe rollback if a problem occurs. Kubernetes deployments provide the primitives, such as updating a deployment, pausing a roll-out, and rolling back a deployment. Specific workflows are built on these solid foundations.
The mechanics of upgrading a service typically involve upgrading its image to a new version and sometimes changes to its support resources and access: volumes, roles, quotas, limits, and so on.

Scaling microservices

There are two aspects to scaling a microservice with Kubernetes. The first aspect is scaling the number of pods backing up a particular microservice. The second aspect is the total capacity of the cluster. You can easily scale a microservice explicitly by updating the number of replicas of a deployment, but that requires constant vigilance on your part. For services that have large variations in the volume of requests they handle over long periods (for example, business hours versus off hours or week days versus weekends), it might take a lot of effort. Kubernetes provides horizontal pod autoscaling, which is based on CPU, memory, or custom metrics, and can scale your service up and down automatically.

Here is how to scale our nginx deployment that is currently fixed at three replicas to go between 2 and 5, depending on the average CPU usage across all instances:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: nginx
namespace: default
spec:
maxReplicas: 5
minReplicas: 2
targetCPUUtilizationPercentage: 90
scaleTargetRef:
apiVersion: v1
kind: Deployment
name: nginx

The outcome is that Kubernetes will watch CPU utilization of the pods that belong to the nginx deployment. When the average CPU over a certain period of time (5 minutes, by default) exceeds 90%, it will add more replicas until the maximum of 5, or until utilization drops below 90%. The HPA can scale down too, but will always maintain a minimum of two replicas, even if the CPU utilization is zero.

Monitoring microservices

Your microservices are deployed and running on Kubernetes. You can update the version of your microservices whenever it is needed. Kubernetes takes care of healing and scaling automatically. However, you still need to monitor your system and keep track of errors and performance. This is important for addressing problems, but also for informing you on potential improvements, optimizations, and cost cutting.

There are several categories of information that are relevant and that you should monitor:

  • Third-party logs
  • Application logs
  • Application errors
  • Kubernetes events
  • Metrics

When considering a system composed of multiple microservices and multiple supporting components, the number of logs will be substantial. The solution is central logging, where all the logs go to a single place where you can slice and dice at your will. Errors can be logged, of course, but often it is useful to report errors with additional metadata, such as stack trace, and review them in their own dedicated environment (for example, sentry or rollbar). Metrics are useful for detecting performance and system health problems or trends over time.

Kubernetes provides several mechanisms and abstractions for monitoring your microservices. The ecosystem provides a number of useful projects too.

Logging

There are several ways to implement central logging with Kubernetes:

  • Have a logging agent that runs on every node
  • Inject a logging sidecar container to every application pod
  • Have your application send its logs directly to a central logging service

There are pros and cons to each approach. But, the main thing is that Kubernetes supports all approaches and makes container and pod logs available for consumption.

Metrics

Kubernetes comes with cAdvisor (https://github.com/google/cadvisor), which is a tool for collecting container metrics integrated into the kubelet binary. Kubernetes used to provide a metrics server called heapster that required additional backends and a UI. But, these days, the best in class metrics server is the open source Prometheus project. If you run Kubernetes on Google's GKE, then Google Cloud Monitoring is a great option that doesn't require additional components to be installed in your cluster. Other cloud providers also have integration with their monitoring solutions (for example, CloudWatch on EKS).