Native Docker Clustering with Swarm

By: Fabrizio Soppelsa, Chanwit Kaewkasi

Overview of this book

Docker Swarm is one of the crucial components of the Docker ecosystem and offers a native solution for orchestrating containers. Thanks to its recent improvements, it is becoming one of the preferred choices for Docker clustering. This book covers Swarm, Swarm Mode, and SwarmKit. It gives you a guided tour of how Swarm works and how to work with it, starting with local test installations and then moving to huge distributed infrastructures. You will be shown how Swarm works internally, what's new in SwarmKit, how to automate big Swarm deployments, and how to configure and operate a Swarm cluster on public and private clouds. This book will teach you how to meet the challenge of deploying massive production-ready applications and a huge number of containers on Swarm. You'll also cover advanced topics that include volumes, scheduling, a Libnetwork deep dive, security, and platform scalability.

Swarm2k and Swarm3k lessons learned


Here's a summary of what you learned from these experiments:

  • For a large set of workers, managers require a lot of CPU. CPU usage spikes whenever the Raft recovery process kicks in.

  • If the leading manager dies, it's better to stop Docker on that node and wait until the cluster becomes stable again with n-1 managers.

  • Keep snapshot reservation as small as possible. The default Docker Swarm configuration will do. Persisting Raft snapshots uses extra CPU.

  • Thousands of nodes require a huge set of resources to manage, both in terms of CPU and network bandwidth. Try to keep services and the managers' topology geographically compact.

  • Hundreds of thousands of tasks require high-memory nodes.

  • At present, a maximum of 500-1000 nodes is recommended for stable production setups.

  • If managers seem to be stuck, wait; they'll recover eventually.

  • The --advertise-addr parameter is mandatory for the routing mesh to work.

  • Put your compute nodes as close to your data nodes as possible. The overlay...
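The advice about running with n-1 managers after the leader dies can be put in numbers. The following is a minimal sketch, assuming only the standard Raft rule that a cluster needs a strict majority of its managers to make progress. The function names raft_quorum and tolerated_failures are hypothetical helpers introduced for illustration, not part of Docker or SwarmKit.

```python
def raft_quorum(managers: int) -> int:
    """Smallest strict majority of a manager set of the given size."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be lost while a majority is still reachable."""
    return managers - raft_quorum(managers)


if __name__ == "__main__":
    # Compare a manager set of size n with the same set after one failure.
    for n in (3, 5, 7):
        remaining = n - 1
        still_has_quorum = remaining >= raft_quorum(n)
        print(
            f"{n} managers: quorum={raft_quorum(n)}, "
            f"tolerates {tolerated_failures(n)} failure(s), "
            f"quorum kept with {remaining} left: {still_has_quorum}"
        )
```

Under this rule, an odd number of managers gives the best fault tolerance per node: a fourth manager raises the quorum size without letting the cluster survive any additional failure.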