Podman for DevOps

By: Alessandro Arrichiello, Gianni Salinetti

Overview of this book

As containers have become the new de facto standard for packaging applications and their dependencies, understanding how to implement, build, and manage them is now an essential skill for developers, system administrators, and SRE/operations teams. Podman and its companion tools Buildah and Skopeo make a great toolset to boost the development, execution, and management of containerized applications. Starting with the basic concepts of containerization and its underlying technology, this book will help you get your first container up and running with Podman. You'll explore the complete toolkit and go over the development of new containers, their lifecycle management, troubleshooting, and security aspects. Together with Podman, the book illustrates Buildah and Skopeo to complete the tools ecosystem and cover the complete workflow for building, releasing, and managing optimized container images. Podman for DevOps provides a comprehensive view of the full-stack container technology and its relationship with the operating system foundations, along with crucial topics such as networking, monitoring, and integration with systemd, docker-compose, and Kubernetes. By the end of this DevOps book, you'll have developed the skills needed to build and package your applications inside containers as well as to deploy, manage, and integrate them with system services.
Table of Contents (19 chapters)
Section 1: From Theory to Practice: Running Containers with Podman
Section 2: Building Containers from Scratch with Buildah
Section 3: Managing and Integrating Containers Securely

Why do I need a container?

This section describes the benefits and the value of containers in modern IT systems, and how containers can provide benefits for both technology and business.

The preceding question could be rephrased as, what is the value of adopting containers in production?

IT has become a fast, market-driven environment where changes are dictated by business and technological enhancements. When adopting emerging technologies, companies always look at their Return on Investment (ROI) while striving to keep the Total Cost of Ownership (TCO) under reasonable thresholds. This is not always easy to attain.

This section will try to uncover the most important benefits that containers bring.

Open source

The technologies that power containers are open source and have become open standards widely adopted by many vendors and communities. Open source software, today adopted by large companies, vendors, and cloud providers, has many advantages and provides great value for the enterprise. Open source is often associated with high-value and innovative solutions – that's simply the truth!

First, community-driven projects usually have a great evolutionary boost that helps mature the code and bring new features continuously. Open source software is available to the public and can be inspected and analyzed. This is a great transparency feature that also has an impact on software reliability, both in terms of robustness and security.

One of the key aspects is that it promotes an evolutionary paradigm where only the best software is adopted, contributed, and supported; container technology is a perfect example of this behavior.

Portability

We have already stated that containers are a technology that enables users to package and isolate applications with their entire runtime environment, which means all the files necessary to run. This feature unlocks one key benefit – portability.

This means that a container image can be pulled and executed on any host that has a container engine running, regardless of the OS distribution underneath. A CentOS or nginx image can be pulled on either a Fedora or a Debian host running a container engine and executed with the same configuration.

Again, if we have a fleet of many identical hosts, we can choose to schedule the application instance on one of them (for example, using load metrics to choose the best fit) with the awareness of having the same result when running the container.

Container portability also reduces vendor lock-ins and provides better interoperability between platforms.

DevOps facilitators

As stated before, containers help solve the old "it works on my machine" pattern between development and operations teams when it comes to deploying applications for production.

As a smart and easy packaging solution for applications, they meet the developers' need to create self-consistent bundles with all the necessary binaries and configurations to run their workloads seamlessly. As a self-consistent way to isolate processes and guarantee separation of namespaces and resource usage, they are appreciated by operations teams, who are no longer forced to maintain complex dependency constraints or segregate every single application inside VMs.

From this point of view, containers can be seen as facilitators of DevOps best practices, where developers and operators work closer to deploy and manage applications without rigid separations.

Developers who want to build their own container images are expected to be more aware of the OS layer built into the image and work closely with operations teams to define build templates and automations.

Cloud readiness

Containers are built for the cloud, designed with an immutable approach in mind. The immutability pattern clearly states that changes in the infrastructure (be it a single container or a complex cluster) must be applied by redeploying a modified version and not by patching the current one. This helps to increase a system's predictability and reliability.

When a new application version must be rolled out, it is built into a new image and a new container is deployed in place of the previous version. Build pipelines can be implemented to manage complex workflows, from application build and image creation, through tagging and pushing to an image registry, to deployment on the target host. This approach drastically shortens provisioning time while reducing inconsistencies.

We will see later in this book that dedicated container orchestration solutions such as Kubernetes also provide ways to automate the scheduling patterns of large fleets of hosts and make containerized workloads easy to deploy, monitor, and scale.

Infrastructure optimization

Compared to virtual machines, containers have a lightweight footprint that drives much greater efficiency in the consumption of compute and memory resources. By providing a way to simplify workload execution, container adoption brings great cost savings.

IT resources optimization is achieved by reducing the computational cost of applications; if an application server that was running on top of a virtual machine can be containerized and executed on a host along with other containers (with dedicated resource limits and requests), computing resources can be saved and reused.
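As a rough illustration of running a workload with dedicated resource limits, the following Python sketch only assembles a podman run command line and does not execute it. The helper name limited_run_command, the image name, and the limit values are assumptions chosen for this example.

```python
import shlex


def limited_run_command(image, memory="256m", cpus="0.5"):
    """Build (but do not execute) a podman command with resource limits.

    The default values are illustrative, not recommendations.
    """
    return [
        "podman", "run", "--rm",
        "--memory", memory,  # upper bound on the container's memory
        "--cpus", cpus,      # number of CPUs (may be fractional)
        image,
    ]


def as_shell(command):
    """Render the argument list as a copy-pastable shell line."""
    return shlex.join(command)
```

Calling as_shell(limited_run_command("docker.io/library/nginx")) yields a single command line that can be reviewed before it is run on a host where Podman is installed.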

Whole infrastructures can be re-modulated with this new paradigm in mind; a bare-metal machine previously configured as a hypervisor can be reallocated as a worker node of a container orchestration system that simply runs more granular applications as containers.

Microservices

Microservice architectures split applications into multiple services that perform fine-grained functions and are part of the application as a whole.

Traditional applications have a monolithic approach where all the functions are part of the same instance. The purpose of microservices is to break the monolith into smaller parts that interact independently.

Monolithic applications fit well into containers, but microservice applications are an ideal match for them.

Having one container for every single microservice helps to achieve important benefits, such as the following:

  • Independent scalability of microservices
  • More defined responsibilities for development teams
  • Potential adoption of different technology stacks over the different microservices
  • More control over security aspects (such as public-facing exposed services, mTLS connections, and so on)

Orchestrating microservices can be a daunting task when dealing with large and complex architectures. The adoption of orchestration platforms such as Kubernetes, service mesh solutions such as Istio or Linkerd, and observability tools such as Jaeger and Kiali becomes crucial to achieving control over complexity.

Where do containers come from?

Container technology is not a new topic in the computer industry, as we will see in the next paragraphs. It has deep roots in OS history, and we'll discover that it could be even older than us!

This section rewinds the tape and recaps the most important milestones of containers in OS history, from Unix to GNU/Linux machines. It is a useful glance at the past to understand how the underlying idea evolved through the years.

Chroot and Unix V7

If we want to build a timeline for our time travel through container history, the first and oldest destination is 1979 – the year of Unix V7. Back then, an important system call was introduced in the Unix kernel – the chroot system call.

Important Note

A system call (or syscall) is a method used by an application to request something from the OS's kernel.

This system call allows an application to change the root directory of its running copy and of its children, confining them to that jail. This prevents the running application from accessing any file or directory outside the given subtree, which was a real game changer at the time.
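To make the idea more concrete, here is a minimal Python sketch built on os.chroot, the standard library wrapper around the chroot system call. The helper name run_in_chroot and its exit-code convention are assumptions made for this example; the call needs elevated privileges (typically root), so the sketch reports a failure instead of crashing when they are missing.

```python
import os


def run_in_chroot(new_root, func):
    """Fork a child, confine it to new_root with chroot, then run func.

    Returns 0 if func ran inside the jail, 1 if the jail could not be
    entered (for example, missing privileges), 2 if func itself failed.
    """
    pid = os.fork()
    if pid == 0:  # child process: the only one that changes its root
        try:
            os.chroot(new_root)  # "/" now refers to new_root
            os.chdir("/")        # move the working directory into the jail
        except OSError:
            os._exit(1)
        status = 2
        try:
            func()
            status = 0
        finally:
            os._exit(status)     # never return into the parent's code path
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

Because the root directory is changed only in the forked child, the parent process keeps its original view of the filesystem. Only the filesystem view is restricted here; processes, users, and networking remain shared with the host.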

A few years later, in 1982, this system call was also introduced in BSD systems.

Unfortunately, this feature was not built with security in mind, and over the years, OS documentation and security literature strongly discouraged the use of chroot jails as a security mechanism to achieve isolation.

Chroot was only the first milestone in the journey towards complete process isolation in *nix systems. The next was, from a historic point of view, the introduction of FreeBSD jails.

FreeBSD jails

Taking a few steps forward in our history trip, we jump back (or forward, depending on where we're looking from) to 2000, when FreeBSD released a new concept that extends the good old chroot system call – FreeBSD jails.

Important Note

FreeBSD is a free and open source Unix-like operating system first released in 1993, born from the Berkeley Software Distribution, which was originally based on Research Unix.

As we briefly reported previously, chroot was a great feature back in the '80s, but the jail it creates can easily be escaped and has many limitations, so it was not adequate for complex scenarios. For that reason, FreeBSD jails were built on top of the chroot syscall with the goal of extending and enlarging its feature set.

In a standard chroot environment, a running process has limitations and isolation only at the filesystem level; all the other stuff, such as running processes, system resources, the networking subsystem, and system users, is shared by the processes inside the chroot and the host system's processes.

The main feature of FreeBSD jails is the virtualization of the networking subsystem, system users, and processes; as you can imagine, this greatly improves the flexibility and the overall security of the solution.

Let's outline the four key features of a FreeBSD jail:

  • A directory subtree: This is what we already saw for the chroot jail. Once a subtree is defined, the running process is limited to it and cannot escape from it.
  • An IP address: This is a great revolution; finally, we can define an independent IP address for our jail and let our running process be isolated even from the host system.
  • A hostname: Used inside the jail, this is, of course, different from that of the host system.
  • A command: This is the executable to run inside the jail. Its path is relative to the jail's root directory.

One plus of this kind of jail is that every instance also has its own users and its own root account, which has no privileges or permissions over the other jails or the underlying host system.

Another interesting feature of FreeBSD jails is that we have two ways of installing/creating a jail:

  • From binaries, reflecting the ones we might install with the underlying OS
  • From source, building from scratch what's needed by the final application

Solaris Containers (also known as Solaris Zones)

Getting back into our time machine, we must jump forward only a few years, to 2004 to be exact, to finally meet the first term we can recognize – Solaris Containers.

Important Note

Solaris is a proprietary Unix OS born from SunOS in 1993, originally developed by Sun Microsystems.

To be honest, Solaris Containers was only a transitional name for Solaris Zones, a virtualization technology built into the Solaris OS, with help also from a special filesystem, ZFS, that allows storage snapshots and cloning.

A zone is a virtualized application environment, built from the underlying operating system, that allows complete isolation between the base host system and any other applications running inside other zones.

The cool feature that Solaris Zones introduced is the concept of a branded zone. A branded zone is a completely different environment compared to the underlying OS, and can contain different binaries, toolkits, or even a different OS!

Finally, to ensure isolation, a Solaris zone can have its own networking, its own users, and even its own time zone.

Linux Containers (LXC)

Let's jump forward four more years and meet Linux Containers (LXC). We're in 2008, when Linux's first complete container management solution was released.

LXC cannot be dismissed as merely a manager for one of the first implementations of Linux containers, because its authors developed a lot of the kernel features that are now also used by other container runtimes in Linux.

LXC has its own low-level container runtime, and its authors made it with the goal of offering an isolated environment as close as possible to VMs but without the overhead needed for simulating the hardware and running a brand-new kernel instance. LXC achieves this goal and isolation thanks to the following kernel functionalities:

  • Namespaces
  • Mandatory access control
  • Control groups (also known as cgroups)

Let's recap the kernel functionalities that we saw earlier in the chapter.

Linux namespaces

A namespace abstracts a global system resource so that the processes inside the namespace appear to have their own isolated instance of it. If a process makes changes to a system resource in a namespace, these changes are visible only to other processes within the same namespace. The common use of the namespaces feature is to implement containers.
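On a Linux host, the namespaces a process belongs to are exposed as symbolic links under /proc/<pid>/ns, each pointing to a target such as pid:[4026531836]. The following Python sketch reads and parses those links; the function names and the sample inode number are assumptions made for this example.

```python
import os


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    ns_type, _, inode = target.partition(":")
    return ns_type, int(inode.strip("[]"))


def list_namespaces(pid="self"):
    """Return {entry_name: (type, inode)} for one process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    return {
        name: parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        for name in sorted(os.listdir(ns_dir))
    }


def same_namespace(first, second):
    """Two processes share a namespace when type and inode both match."""
    return first == second
```

Comparing the output of list_namespaces() for a process on the host and for a process inside a container shows which resources are actually isolated: entries with different inode numbers belong to different namespaces.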

Mandatory access control

In the Linux ecosystem, there are several MAC implementations available; the most well-known project is Security Enhanced Linux (SELinux), developed by the USA's National Security Agency (NSA).

Important Note

SELinux is a mandatory access control architecture implementation used in Linux operating systems. It provides role-based access control and multi-level security through a labeling mechanism. Every file, device, and directory has an associated label (often described as a security context) that extends the common filesystem's attributes.
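A security context is usually written as four colon-separated fields: user, role, type, and level. The short Python sketch below splits such a label into its parts; the class and function names are hypothetical, and the sample label is only an illustrative value of the kind found on container files.

```python
from typing import NamedTuple


class SecurityContext(NamedTuple):
    user: str
    role: str
    type_: str
    level: str


def parse_context(label):
    """Split 'user:role:type:level' into its four fields.

    The level may itself contain a colon (for example 's0:c1,c2'),
    so only the first three separators are used for splitting.
    """
    user, role, type_, level = label.split(":", 3)
    return SecurityContext(user, role, type_, level)
```

For example, parse_context("system_u:object_r:container_file_t:s0:c1,c2") keeps s0:c1,c2 together as the level field.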

Control groups

Control groups (cgroups) is a built-in Linux kernel feature that helps organize various types of resources, including processes, into hierarchical groups. These resources can then be limited and monitored. The common interface used for interacting with cgroups is a pseudo-filesystem called cgroupfs. This kernel feature is really useful for tracking and limiting processes' resources, such as memory, CPU, and so on.
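The cgroup membership of a process can be read from /proc/<pid>/cgroup, where every line has the form hierarchy-ID:controller-list:cgroup-path. The Python sketch below parses that format; the function names and the sample paths used with it are assumptions made for this example.

```python
def parse_cgroup_line(line):
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' line."""
    hierarchy_id, controllers, path = line.strip().split(":", 2)
    return {
        "hierarchy": int(hierarchy_id),
        # On cgroup v2 the controller list is empty ("0::/some/path").
        "controllers": controllers.split(",") if controllers else [],
        "path": path,
    }


def read_cgroups(pid="self"):
    """Return the parsed cgroup entries of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return [parse_cgroup_line(line) for line in handle if line.strip()]
```

The path field is relative to the mount point of cgroupfs, which is where the files used to limit and monitor the group can be found.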

The greatest LXC feature built on these three kernel functionalities is, for sure, unprivileged containers.

Thanks to namespaces, MAC, and cgroups, in fact, LXC can isolate a certain number of UIDs and GIDs, mapping them to IDs on the underlying operating system. This ensures that a UID of 0 in the container is (in reality) mapped to a higher UID on the base host system.
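The kernel exposes this mapping in /proc/<pid>/uid_map (and gid_map), where each line holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. The Python sketch below translates a container UID into a host UID; the function names and the sample mapping are assumptions made for this example.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    return [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]


def to_host_id(container_id, id_map):
    """Translate an ID seen inside the container to the host ID."""
    for inside, outside, length in id_map:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None  # the ID is not mapped at all


# Illustrative mapping: container root is host UID 1000, and the
# following 65536 IDs are taken from a range starting at 100000.
SAMPLE_MAP = parse_id_map("0 1000 1\n1 100000 65536\n")
```

With this sample mapping, to_host_id(0, SAMPLE_MAP) gives 1000 and to_host_id(33, SAMPLE_MAP) gives 100032, so a process that looks like root inside the container is an ordinary user on the host.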

Depending on the privileges and the feature set we want to assign to our container, we can choose from a vast set of pre-built namespace types, such as the following:

  • Network: Offering access to network devices, stacks, ports, and so on
  • Mount: Offering access to mount points
  • PID: Offering access to PIDs

The next main evolution from LXC (and, without doubt, the one that triggered the success of container adoption) was certainly Docker.

Docker

Just 5 years later, in 2013, Docker arrived in the container landscape and rapidly became very popular. But what features did it use back in those days? Well, we can easily discover that one of the first Docker container engines was LXC!

After just one year of development, Docker's team introduced libcontainer and finally replaced the LXC container engine with their own implementation. Docker, similar to its predecessor, LXC, requires a daemon running on the base host system to keep the containers running and working properly.

One of the most notable features (apart from the use of namespaces, MAC, and cgroups) was, for sure, OverlayFS, an overlay filesystem that helps combine multiple filesystems into a single filesystem.

Important Note

OverlayFS is a Linux union filesystem. It can combine multiple mount points into one, creating a single directory structure that contains all the underlying files and subdirectories from sources.
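The layering idea can be modeled in a few lines of Python. This is a simplified sketch, not how the kernel implements it: layers are plain dictionaries mapping a path to its content, deletions (whiteouts) and directory merging are ignored, and the function names are hypothetical. The second helper only formats the lowerdir, upperdir, and workdir mount options that an overlay mount expects.

```python
def merged_view(lower_layers, upper):
    """Return the merged path -> content view of a stack of layers.

    lower_layers is ordered topmost first, like the lowerdir option;
    upper is the writable layer and wins over every lower layer.
    """
    view = {}
    for layer in reversed(lower_layers):  # apply the bottom layer first
        view.update(layer)
    view.update(upper)
    return view


def overlay_mount_options(lowerdirs, upperdir, workdir):
    """Format the option string for an overlay mount (not executed)."""
    return f"lowerdir={':'.join(lowerdirs)},upperdir={upperdir},workdir={workdir}"
```

A file present in several layers is always read from the topmost one, which is what lets many containers share the same read-only image layers while each keeps its own writable layer.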

At a high level, the Docker team introduced the concept of container images and container registries, which was the real game changer. The registry and image concepts enabled the creation of a whole ecosystem in which every developer, sysadmin, or tech enthusiast could collaborate and contribute with their own custom container images. They also created a special file format for creating brand-new container images (Dockerfile) to easily automate the steps needed for building the container images from scratch.

Along with Docker, there is another engine/runtime project that caught the interest of the communities – rkt.

rkt

Shortly after Docker's rise, between 2014 and 2015, CoreOS (later acquired by Red Hat) launched its own implementation of a container engine with a very particular main feature – it was daemon-less.

This choice had an important impact: instead of having a central daemon administering a bunch of containers, every container was on its own, like any other standard process we may start on our base host system.

The rkt (pronounced rocket) project became very popular in 2017 when the young Cloud Native Computing Foundation (CNCF), which aims to help and coordinate container and cloud-related projects, decided to adopt the project under its umbrella, together with another project donated by Docker itself – containerd.

In a few words, the Docker team extracted the project's core runtime from its daemon and donated it to the CNCF, which was a great step that motivated and enabled a great community around the topic of containers, as well as helping to develop and improve rising container orchestration tools, such as Kubernetes.

Important Note

Kubernetes (from the Greek term κυβερνήτης, meaning "helmsman"), also abbreviated as K8s, is an open source container-orchestration system for simplifying application deployment and management in a multi-host environment. It was released as an open source project by Google, but it is now maintained by the CNCF.

Even if this book's main topic is Podman, we cannot avoid mentioning, now and in the following chapters, the rising need for orchestrating complex projects made of many containers in multi-machine environments; that's the scenario where Kubernetes rose as the ecosystem leader.

After Red Hat's acquisition of CoreOS, the rkt project was discontinued, but its legacy was not lost and influenced the development of the Podman project. But before introducing the main topic of this book, let's dive into the OCI specifications.

OCI and CRI-O

As mentioned earlier, the extraction of containerd from Docker and the consequent donation to the CNCF motivated the open source community to start working seriously on container engines that could be injected under an orchestration layer, such as Kubernetes.

On the same wave, in 2015, Docker, with the help of many other companies (Red Hat, AWS, Google, Microsoft, IBM, and so on), started a governance committee under the umbrella of the Linux Foundation, the Open Container Initiative (OCI).

Under this initiative, the working team developed the runtime specification (runtime spec) and the image specification (image spec) to describe how the API and the architecture of new container engines should be designed in the future.
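To give a feeling for what the runtime spec describes, the Python sketch below builds a partial, illustrative fragment of the JSON configuration that an OCI runtime reads for a container. It is not a complete or validated configuration; the helper name, the rootfs path, and the chosen values are assumptions made for this example.

```python
import json


def minimal_runtime_config(rootfs, args):
    """Return a partial config.json-like structure (illustrative only)."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "user": {"uid": 0, "gid": 0},  # IDs as seen inside the container
            "args": list(args),             # command to execute
            "cwd": "/",
        },
        "root": {"path": rootfs, "readonly": True},
        "linux": {
            # One entry per namespace the container should get.
            "namespaces": [
                {"type": ns_type}
                for ns_type in ("pid", "network", "mount", "ipc", "uts")
            ]
        },
    }


def to_json(config):
    """Serialize the structure the way it would be stored on disk."""
    return json.dumps(config, indent=2)
```

The point of the fragment is that the command, the root filesystem, and the namespaces discussed earlier in the chapter are all declared as data, which is what allows different engines and runtimes to interoperate.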

The same year, the OCI team also released the first implementation of a container runtime adhering to the OCI specifications; the project was named runc.

The OCI defined not only a specification for running standalone containers but also provided the base for linking the Kubernetes layer with the underlying container engine more easily. At the same time, the Kubernetes community released the Container Runtime Interface (CRI), a plugin interface to enable the adoption of a wide variety of container runtimes.

That's where CRI-O jumps in: released in 2017 as an open source project by Red Hat, it was one of the first implementations of the Kubernetes Container Runtime Interface, enabling the use of OCI-compatible runtimes. CRI-O represents a lightweight alternative to using Docker, rkt, or any other engines as Kubernetes' runtime.

As the ecosystem continues to grow, standards and specifications become more and more adopted, leading to a wider container ecosystem. The OCI specifications described previously were crucial to the development of the runc container runtime, adopted by the Podman project.

Podman

We finally arrive at the end of our time travel; we reached 2017 in the previous paragraph and, in the same year, the first commit of the Podman project was made on GitHub.

The project's name reveals a lot about its purpose – PODMAN = POD MANager. We are now ready to look at the basic definition of a pod in the container world.

A pod is the smallest deployable computing unit that can be handled by Kubernetes; it can be made of one or more containers. In the case of multiple containers in the same pod, they are scheduled and run side by side in a shared context.

Podman manages containers and containers' images, their storage volumes, and pods made of one or multiple containers, and it was built from scratch to adhere to the OCI standards.

Podman, like its predecessor, rkt, has no central daemon managing the containers but starts them as standard system processes. It also defines a Docker-compatible CLI interface to ease the transition from Docker.

One of the great features introduced by Podman is rootless containers. Usually, when we think about Linux containers, we immediately think about a system administrator who has to set up some prerequisites at the OS level to prepare the environment that lets our container get up and running.

Rootless containers can easily run as a normal user, without requiring root. Using Podman as a non-privileged user starts restricted containers that have no more privileges than the user running them.

Without a doubt, Podman introduced greater flexibility and is a highly active project whose adoption grows constantly. Every major release brings many new features; for example, the 3.0 release introduced support for Docker Compose, which was a highly requested feature. This is also a good indicator of the health of its community support.

Let's close the chapter with an overview of the most common container adoption use cases.