The Evolution of Software Development
Along with the evolution of virtualization technology, it's common for companies to use virtual machines (VMs) to manage their software products, either in the public cloud or an on-premises environment. This brings huge benefits such as automatic machine provisioning, better hardware resource utilization, resource abstraction, and more. More critically, for the first time, it employs the separation of computing, network, and storage resources to unleash the power of software development from the tediousness of hardware management. Virtualization also brings in the ability to manipulate the underlying infrastructure programmatically. So, from a system administrator and developer's perspective, they can better streamline the workflow of software maintenance and development. This is a big move in the history of software development.
However, in the past decade, the scope and life cycle of software development have changed vastly. Earlier, it was not uncommon for software to be developed in big monolithic chunks with a slow-release cycle. Nowadays, to catch up with the rapid changes of business requirements, a piece of software may need to be broken down into individual fine-grained subcomponents, and each component may need to have its release cycle so that it can be released as often as possible to get feedback from the market earlier. Moreover, we may want each component to be scalable and cost-effective.
So, how does this impact application development and deployment? In comparison to the bare-metal era, adopting VMs doesn't help much since VMs don't change the granularity of how different components are managed; the entire software is still deployed on a single machine, only it is a virtual one instead of a physical one. Making a number of interdependent components work together is still not an easy task.
A straightforward idea here is to add an abstraction layer to connect the machines with the applications running on them. This is so that application developers would only need to focus on the business logic to build the applications. Some examples of this are Google App Engine (GAE) and Cloud Foundry.
The first issue with these solutions is the lack of consistent development experience among different environments. Developers develop and test applications on their machines with their local dependencies (both at the programming language and operating system level); while in a production environment, the application has to rely on another set of dependencies underneath. And we still haven't talked about the software components that need the cooperation of different developers in different teams.
The second issue is that the hard boundary between applications and the underlying infrastructure would limit the applications from being highly performant, especially if the application is sensitive to the storage, compute, or network resources. For instance, you may want the application to be deployed across multiple availability zones (isolated geographic locations within data centers where cloud resources are managed), or you may want some applications to coexist, or not to coexist, with other particular applications. Alternatively, you may want some applications to adhere to particular hardware (for example, solid-state drives). In such cases, it becomes hard to focus on the functionality of the app without exposing the topological characteristics of the infrastructure to upper applications.
In fact, in the life cycle of software development, there is no clear boundary between the infrastructure and applications. What we want to achieve is to manage the applications automatically, while making optimal use of the infrastructure.
So, how could we achieve this? Docker (which we will introduce later in this chapter) solves the first issue by leveraging Linux containerization technologies to encapsulate the application and its dependencies. It also introduces the concept of Docker images to make the software aspect of the application runtime environment lightweight, reproducible, and portable.
The second issue is more complicated. That's where Kubernetes comes in. Kubernetes leverages a battle-tested design rationale called the Declarative API to abstract the infrastructure as well as each phase of application delivery such as deployment, upgrades, redundancy, scaling, and more. It also offers a series of building blocks for users to choose, orchestrate, and compose into the eventual application. We will gradually move on to study Kubernetes, which is the core of this book, toward the end of this chapter.
Note
If not specified particularly, the term "container" might be used interchangeably with "Linux container" throughout this book.