Book Image

Mastering Ceph

By : Nick Fisk
Book Image

Mastering Ceph

By: Nick Fisk

Overview of this book

Mastering Ceph covers all that you need to know to use Ceph effectively. Starting with design goals and planning steps that should be undertaken to ensure successful deployments, you will be guided through to setting up and deploying the Ceph cluster, with the help of orchestration tools. Key areas of Ceph including Bluestore, Erasure coding and cache tiering will be covered with help of examples. Development of applications which use Librados and Distributed computations with shared object classes are also covered. A section on tuning will take you through the process of optimisizing both Ceph and its supporting infrastructure. Finally, you will learn to troubleshoot issues and handle various scenarios where Ceph is likely not to recover on its own. By the end of the book, you will be able to successfully deploy and operate a resilient high performance Ceph cluster.
Table of Contents (12 chapters)

What is Ceph?

Ceph is an open source, distributed, scale-out, software-defined storage (SDS) system, which can provide block, object, and file storage. Through the use of the Controlled Replication Under Scalable Hashing (CRUSH) algorithm, Ceph eliminates the need for centralized metadata and can distribute the load across all the nodes in the cluster. Since CRUSH is an algorithm, data placement is calculated rather than being based on table lookups and can scale to hundreds of petabytes without the risk of bottlenecks and the associated single points of failure. Clients also form direct connections with the server storing the requested data and so their is no centralised bottleneck in the data path.

Ceph provides the three main types of storage, being block via RADOS Block Devices (RBD), file via Ceph Filesystem (CephFS), and object via the Reliable Autonomous Distributed Object Store (RADOS) gateway, which provides Simple Storage Service (S3) and Swift compatible storage.

Ceph is a pure SDS solution and as such means that you are free to run it on commodity hardware as long as it provides the correct guarantees around data consistency. More information on the recommended types of hardware can be found later on in this chapter. This is a major development in the storage industry which has typically suffered from strict vendor lock-in. Although there have been numerous open source projects to provide storage services. Very few of them have been able to offer the scale and high resilience of Ceph, without requiring propriety hardware.

It should be noted that Ceph prefers consistency as per the CAP theorem and will try at all costs to make protecting your data the biggest priority over availability in the event of a partition.