Book Image

Docker Cookbook - Second Edition

By : Ken Cochrane, Jeeva S. Chelladhurai, Neependra K Khare
2 (1)
Book Image

Docker Cookbook - Second Edition

2 (1)
By: Ken Cochrane, Jeeva S. Chelladhurai, Neependra K Khare

Overview of this book

Docker is an open source tool used for creating, deploying, and running applications using containers. With more than 100 self-contained tutorials, this book examines common pain points and best practices for developers building distributed applications with Docker. Each recipe in this book addresses a specific problem and offers a proven, best practice solution with insights into how it works, so that you can modify the code and configuration files to suit your needs. The Docker Cookbook begins by guiding you in setting up Docker in different environments and explains how to work with its containers and images. You’ll understand Docker orchestration, networking, security, and hosting platforms for effective collaboration and efficient deployment. The book also covers tips and tricks and new Docker features that support a range of other cloud offerings. By the end of this book, you’ll be able to package and deploy end-to-end distributed applications with Docker and be well-versed with best practice solutions for common development problems.
Table of Contents (13 chapters)

Introduction

At the very start of the IT revolution, most applications were deployed directly on physical hardware, over the host OS. Because of that single user space, runtime was shared between applications. The deployment was stable, hardware-centric, and had a long maintenance cycle. It was mostly managed by an IT department, and gave much less flexibility to developers. In such cases, the hardware resources were underutilized most of the time. The following diagram depicts such a setup:

Traditional application deployment

For flexible deployments, and in order to better utilize the resources of the host system, virtualization was invented. With hypervisors, such as KVM, XEN, ESX, Hyper-V, and so on, we emulated the hardware for virtual machines (VMs) and deployed a guest OS on each virtual machine. VMs can have a different OS than their host; this means that we are responsible for managing patches, security, and the performance of that VM. With virtualization, applications are isolated at VM level and are defined by the life cycle of VMs. This gives us a better return on our investment and higher flexibility at the cost of increased complexity and redundancy. The following diagram depicts a typical virtualized environment:

Application deployment in a virtualized environment

Since virtualization was developed, we have been moving towards more application-centric IT. We have removed the hypervisor layer to reduce hardware emulation and complexity. The applications are packaged with their runtime environment, and are deployed using containers. OpenVZ, Solaris Zones, and LXC are a few examples of container technology. Containers are less flexible compared to VMs; for example, we cannot run Microsoft Windows on Linux OS as of writing. Containers are also considered less secure than VMs, because with containers, everything runs on the host OS. If a container gets compromised, then it might be possible to get full access to the host OS. It can be a bit too complex to set up, manage, and automate. These are a few of the reasons why we have not seen the mass adoption of containers in the last few years, even though we had the technology. The following diagram shows how an application is deployed using containers:

Application deployment with containers

With Docker, containers suddenly became first-class citizens. All big corporations, such as Google, Microsoft, Red Hat, IBM, and others, are now working to make containers mainstream.

Docker was started as an internal project by dotCloud founder Solomon Hykes. It was released as open source in March 2013 under the Apache 2.0 license. With dotCloud's platform as a service experience, the founders and engineers of Docker were aware of the challenges of running containers. So with Docker, they developed a standard way to manage containers.

Docker uses the operating system's underlying kernel features, which enable containerization. The following diagram depicts the Docker platform and the kernel features used by Docker. Let's look at some of the major kernel features that Docker uses:

Docker platform and the kernel features used by Docker

Namespaces

Namespaces are the building blocks of a container. There are different types of namespace, and each one of them isolates applications from the others. They are created using the clone system call. You can also attach to existing namespaces. Some of the namespaces used by Docker will be explained in the following sections.

The PID namespace

The PID namespace allows each container to have its own process numbering. Each PID forms its own process hierarchy. A parent namespace can see the children namespaces and affect them, but a child can neither see the parent namespace nor affect it.

If there are two levels of hierarchy, then at the top level, we would see the process running inside the child namespace with a different PID. So a process running in a child namespace would have two PIDs: one in the child namespace and the other in the parent namespace. For example, if we run a program on the container.sh container, then we can see the corresponding program on the host as well.

On the container, the sh container.sh process has a PID of 8:

On the host, the same process has a PID of 29778:

The net namespace

With the PID namespace, we can run the same program multiple times in different isolated environments; for example, we can run different instances of Apache on different containers. But without the net namespace, we would not be able to listen on port 80 on each one of them. The net namespace allows us to have different network interfaces on each container, which solves the problem I mentioned earlier. Loopback interfaces would be different in each container as well.

To enable networking in containers, we create pairs of special interfaces in two different net namespaces and allow them to talk to each other. One end of the special interface resides inside the container and the other resides on the host system. Generally, the interface inside the container is called eth0, and on the host system, it is given a random name, such as veth516cc56. These special interfaces are then linked through a bridge (docker0) on the host to enable communication between the containers and the route packets.

Inside the container, you will see something like the following:

$ docker container run -it alpine ash
# ip a

On the host, it would look like the following:

$ ip a

Also, each net namespace has its own routing table and firewall rules.

The IPC namespace

The inter-process communication (IPC) namespace provides semaphores, message queues, and shared memory segments. It is not widely used these days, but some programs still depend on it.

If the IPC resource created by one container is consumed by another container, then the application running on the first container could fail. With the IPC namespace, processes running in one namespace cannot access resources from another namespace.

The mnt namespace

Using only a chroot, you can inspect the relative paths of the system from a chrooted directory/namespace. The mnt namespace takes the idea of chroots to the next level. With the mnt namespace, a container can have its own set of mounted filesystems and root directories. Processes in one mnt namespace cannot see the mounted filesystems of another mnt namespace.

The UTS namespace

With the UTS namespace, we can have different hostnames for each container.

The user namespace

With user namespace support, we can have users who have a nonzero ID on the host, but who can have a zero ID inside the container. This is because the user namespace allows mappings of users and groups IDs per namespace.

There are ways to share namespaces between the host and container, and other containers as well. We'll see how to do this in subsequent chapters.

Cgroups

Control groups (cgroups) provide resource limitations and accounting for containers. The following quote is from the Linux Kernel documentation:

"Control Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour."

In simple terms, they can be compared to the ulimit shell command or the setrlimit system call. Instead of setting the resource limit to a single process, cgroups allow you to limit resources to a group of processes.

Control groups are split into different subsystems, such as CPU, CPU sets, memory block I/O, and so on. Each subsystem can be used independently, or can be grouped with others. The features that cgroups provide are as follows:

  • Resource limiting: For example, one cgroup can be bound to specific CPUs, so that all processes in that group would run on given CPUs only
  • Prioritization: Some groups may get a larger share of CPUs
  • Accounting: You can measure the resource usage of different subsystems for billing
  • Control: You can freeze and restart groups

Some of the subsystems that can be managed by cgroups are as follows:

  • blkio: Sets I/O access to and from block devices, such as disks, SSDs, and so on
  • Cpu: Limits access to CPU
  • Cpuacct: Generates CPU resource utilization
  • Cpuset: Assigns CPUs on a multicore system to tasks in a cgroup
  • Devices: Grants devices access to a set of tasks in a group
  • Freezer: Suspends or resumes tasks in a cgroup
  • Memory: Sets limits on memory use by tasks in a cgroup

There are multiple ways to control work with cgroups. Two of the most popular ones are accessing the cgroup virtual filesystem manually and accessing it with the libcgroup library. To use libcgroup on Linux, run the following command to install the required packages on Ubuntu or Debian:

$ sudo apt-get install cgroup-tools

To install the required packages on CentOS, Fedora, or Red Hat, use the following code:

$ sudo yum install libcgroup libcgroup-tools
These steps are not possible on Docker for Mac and Windows, because you can't install the required packages on those versions of Docker.

Once installed, you can get the list of subsystems and their mount point in the pseudo filesystem with the following command:

$ lssubsys -M

Although we haven't looked at the actual commands yet, let's assume that we are running a few containers and want to get the cgroup entries for a container. To get those, we first need to get the container ID and then use the lscgroup command to get the cgroup entries of a container, which we can get using the following command:

The union filesystem

The union filesystem allows the files and directories of separate filesystems, known as layers, to be transparently overlaid to create a new virtual filesystem. While starting a container, Docker overlays all the layers attached to an image and creates a read-only filesystem. On top of that, Docker creates a read/write layer that is used by the container's runtime environment. You can read the Pulling an image and running a container recipe of this chapter for more details. Docker can use several union filesystem variants, including AUFS, Btrfs, zfs, overlay, overlay2, and DeviceMapper.

Docker also has a virtual file system (VFS) storage driver. A VFS doesn't support copy-on-write (COW) and is not a union filesystem. This means that each layer is a directory on the disk, and each time a new layer is created, it requires a deep copy of its parent layer. For these reasons, it has lower performance and requires more disk space, but it is a robust and stable option that works in every environment.

The container format

Docker Engine combines the namespaces, control groups, and UnionFS into a wrapper called a container format. In 2015, Docker donated its container format and runtime to an organization called the the Open Container Initiative (OCI). The OCI is a lightweight, open-governance structure formed under the Linux Foundation by Docker and other industry leaders. The purpose of the OCI is to create open industry standards around container formats and runtimes. There are currently two specifications: the Runtime Specification and the Image Specification.

The Runtime Specification outlines how to run an OCI runtime filesystem bundle. Docker donated runC, (https://github.com/opencontainers/runc) its OCI-compliant runtime to the OCI, to serve as the reference implementation.

The OCI image format contains the information needed to launch the application on the target platform. The specification defines how to create the OCI image, and what the desired output would look like. The output would consist of an image manifest, a filesystem (layer) serialization, and the image configuration. Docker donated its Docker V2 image format to the OCI to form the basis of the OCI image specification.

There are currently two container engines that support the OCI Runtime and Image Specifications: Docker and rkt.