This section gives an overview of the Docker Swarm architecture. The internal structure of Swarm is shown in Figure 3.
Starting with the MANAGER part, you will see a block labeled Docker Swarm API on the left side of the diagram. As mentioned previously, Swarm exposes a set of remote APIs similar to Docker's, which allows you to use any Docker client to connect to Swarm. However, the Swarm APIs are slightly different from the standard Docker Remote APIs, as the Swarm APIs contain cluster-related information too. For example, running docker info against a single Docker Engine gives you information about that Engine only, but calling docker info against a Swarm cluster also returns the number of nodes in the cluster, as well as each node's information and health.
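To make the difference concrete, here is a hedged sketch of such a session; the manager address, port (3376), and node names are hypothetical, and the output excerpt is abbreviated and illustrative rather than copied from a real cluster:

```shell
# Against a single Engine, `docker info` describes only that Engine:
#   docker -H tcp://192.168.99.101:2376 info
# Pointed at a Swarm v1 manager instead, it also reports the cluster:
#   docker -H tcp://192.168.99.100:3376 info
# An abbreviated, illustrative excerpt of the extra cluster fields:
info=$(cat <<'EOF'
Nodes: 2
 node-1: 192.168.99.101:2376
  - Status: Healthy
 node-2: 192.168.99.102:2376
  - Status: Healthy
EOF
)
printf '%s\n' "$info"
```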
The block next to Docker Swarm API is Cluster Abstraction. It is an abstraction layer that allows different kinds of clusters to be implemented as the backend of Swarm while sharing the same set of Docker Remote APIs. Currently there are two cluster backends: the built-in Swarm cluster implementation and the Mesos cluster implementation. The Swarm Cluster and Built-in Scheduler blocks represent the built-in Swarm cluster implementation, while the blocks denoted by Mesos Cluster represent the Mesos cluster implementation.
The Built-in Scheduler of the Swarm backend comes with a number of scheduling strategies. Two of them, Spread and BinPack, will be explained in later chapters. If you're familiar with Swarm, you will notice that the Random strategy is missing here. The Random strategy is excluded from the explanation as it is for testing purposes only.
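A tiny, hedged sketch of the intuition behind the two strategies, in plain shell with made-up node loads (the swarm manage flag shown in the comments is how Swarm v1 selects a strategy; it is not executed here):

```shell
# The strategy is chosen when the Swarm v1 manager starts, e.g.:
#   swarm manage --strategy spread  token://<cluster_id>
#   swarm manage --strategy binpack token://<cluster_id>
# Toy illustration: node-1 already runs 3 containers, node-2 runs 1.
loads="node-1 3
node-2 1"
# spread places the new container on the least-loaded node:
echo "$loads" | sort -k2 -n  | head -n1 | cut -d' ' -f1
# binpack fills the most-loaded node first, minimizing hosts used:
echo "$loads" | sort -k2 -rn | head -n1 | cut -d' ' -f1
```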
Along with scheduling strategies, Swarm employs a set of scheduling filters to screen out nodes that do not meet the scheduling criteria. There are currently six kinds of filters, namely Health, Port, Container Slots, Dependency, Affinity, and Constraint. When a newly created container is scheduled, the filters are applied in exactly this order.
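The pipeline can be pictured as successive sieves over the node list. A hedged, toy sketch (the node table and the two checks below are made up; real Swarm evaluates all six filters against live Engine state):

```shell
# Three hypothetical nodes: name, health, and whether port 80 is free.
cat > nodes.txt <<'EOF'
node-1 healthy port-free
node-2 unhealthy port-free
node-3 healthy port-taken
EOF
# Apply the Health filter first, then the Port filter, in order;
# only nodes surviving every sieve remain candidates for scheduling.
awk '$2 == "healthy"' nodes.txt | awk '$3 == "port-free" { print $1 }'
```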
On the AGENTS part, there are Swarm agents that register the addresses of their Engines with the discovery service.
Finally, the centralized piece, DISCOVERY, coordinates the addresses of the Engines between the AGENTS and the MANAGER. The agent-based discovery service currently uses LibKV, which delegates the discovery function to the key-value store of your choice: Consul, Etcd, or ZooKeeper. In contrast, we can also use the Docker Swarm manager alone, without any key-value store. This mode is called agent-less discovery, and it comes in two forms: File and Nodes (addresses specified on the command line).
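As a hedged sketch of the agent-less forms: for File discovery we write the Engine addresses (hypothetical IPs here) to a plain text file and point the manager at it; the swarm manage invocations in the comments are shown for illustration and are not executed:

```shell
# Agent-less discovery, File form: one Engine address per line.
cat > cluster.txt <<'EOF'
192.168.99.101:2376
192.168.99.102:2376
EOF
# The manager would then read the file (not run in this sketch):
#   swarm manage file://$PWD/cluster.txt
# Nodes form: the same addresses inline on the command line:
#   swarm manage nodes://192.168.99.101:2376,192.168.99.102:2376
cat cluster.txt
```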
We will use the agent-less model later in this chapter to create a minimal local Swarm cluster. We'll meet the other discovery services starting in Chapter 2, Discover the Discovery Services, and the Swarm Mode architecture in Chapter 3, Meeting Docker Swarm Mode.
Before continuing to other sections, let's review some Docker-related terminology to recall Docker concepts and introduce Swarm keywords.
A Docker Engine is a Docker daemon running on a host machine. Sometimes in this book we'll refer to it simply as an Engine. We usually start an Engine by calling docker daemon via systemd or another startup service.
Docker Compose is a tool to describe in YAML how multi-container services should be architected.
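For instance, a minimal, hypothetical Compose file describing a two-container application (the service names and images below are illustrative only) might look like this:

```yaml
version: "2"
services:
  web:
    image: nginx
    ports:
      - "80:80"
  cache:
    image: redis
```

Running docker-compose up against such a file starts both containers and attaches them to the same network.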
A Docker stack is the bundled, deployable form of a multi-container application (described by Compose), as opposed to a single container.
Docker daemon is a term used interchangeably with Docker Engine.
A Docker client is the client program packed into the same docker executable. For example, when we run docker run, we are using the Docker client.
Docker networking is a software-defined network that links a set of containers on the same network together. By default, Docker uses the libnetwork (https://github.com/docker/libnetwork) implementation that comes with Docker Engine, but you can optionally deploy third-party network drivers of your choice using plugins.
Docker Machine is a tool used to create hosts capable of running Docker Engines, called machines.
A Swarm node in Swarm v1 is a machine with Docker Engine pre-installed and a Swarm agent program running alongside it. A Swarm node registers itself with a Discovery service.
A Swarm master in Swarm v1 is a machine running a Swarm manager program. A Swarm master reads the addresses of Swarm nodes from its Discovery service.
A Discovery service is a token-based service offered by Docker or a self-hosted one. For the self-hosted ones, you can run HashiCorp Consul, CoreOS Etcd, or Apache ZooKeeper as key-value stores to serve as the discovery service.
Leader election is a mechanism performed by Swarm masters to find the primary node. The other master nodes remain in the replica role until the primary node goes down, at which point the leader election process starts again. As we'll see, the number of Swarm masters should be an odd number.
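The odd-number advice follows from majority quorum: a cluster of N masters can keep electing a leader only while a majority (floor(N/2)+1) of them is up, so it tolerates floor((N-1)/2) failures. A small, hedged arithmetic sketch in plain shell:

```shell
# Failure tolerance of an N-master cluster under majority quorum.
# Note that 4 masters tolerate no more failures than 3 do, which is
# why even cluster sizes add cost without adding resilience.
for n in 1 2 3 4 5; do
  echo "$n master(s): quorum $(( n / 2 + 1 )), tolerates $(( (n - 1) / 2 )) failure(s)"
done
```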
SwarmKit is a toolkit released by Docker to abstract orchestration. In theory, it should be able to run any kind of service, but in practice, so far, it orchestrates only containers and sets of containers.
Swarm Mode is the new Swarm, available since Docker 1.12, that integrates SwarmKit into the Docker Engine.
Swarm Master (in Swarm Mode) is a node that manages the cluster: it schedules services, keeps the cluster configuration (nodes, roles, and labels), and ensures that there is a cluster leader.
Swarm Worker (in Swarm Mode) is a node that runs tasks; in other words, it hosts containers.
Services are abstractions of workloads. For example, we can have a service nginx replicated 10 times, meaning that we will have 10 tasks (10 nginx containers) distributed across the cluster and load balanced by Swarm itself.
Tasks are the unit of work in Swarm. A task is a container.
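To tie services and tasks together, a hedged sketch: in Swarm Mode, a service with N replicas expands into N task slots, each backed by one container. The docker service commands in the comments are illustrative and are not executed here:

```shell
# On a Docker 1.12+ swarm, one would create and inspect the service:
#   docker service create --name nginx --replicas 3 nginx
#   docker service ps nginx
# Sketch of the resulting task slots, one container per task:
service=nginx
replicas=3
tasks=$(for i in $(seq 1 "$replicas"); do echo "task: $service.$i"; done)
printf '%s\n' "$tasks"
```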