Mastering Ceph - Second Edition

By: Nick Fisk

Overview of this book

Ceph is an open source distributed storage system that scales to exabyte deployments. This second edition of Mastering Ceph takes you a step closer to becoming an expert on Ceph. You’ll start by understanding the design goals and planning steps that should be undertaken to ensure successful deployments. In the next sections, you’ll be guided through setting up and deploying a Ceph cluster with the help of orchestration tools. This will allow you to witness Ceph’s scalability, its erasure coding (data protection) mechanism, and its automated data backup features across multiple servers. You’ll then discover more about the key areas of Ceph, including BlueStore, erasure coding, and cache tiering, with the help of examples. Next, you’ll learn some of the ways to export Ceph into non-native environments and understand some of the pitfalls that you may encounter. The book features a section on tuning that will take you through the process of optimizing both Ceph and its supporting infrastructure. You’ll also learn to develop applications that use librados and to perform distributed computations with shared object classes. Toward the concluding chapters, you’ll learn to troubleshoot issues and handle various scenarios where Ceph is not likely to recover on its own. By the end of this book, you’ll be able to master storage management with Ceph and generate solutions for managing your infrastructure.
Table of Contents (18 chapters)
Section 1: Planning And Deployment
Section 2: Operating and Tuning
Section 3: Troubleshooting and Recovery

What this book covers

Chapter 1, Planning for Ceph, covers the fundamentals of how Ceph works and its basic architecture, and examines some good use cases. It also discusses the planning steps that you should take before implementing Ceph, including setting design goals, developing proofs of concept, and working on infrastructure design.

Chapter 2, Deploying Ceph with Containers, is a no-nonsense, step-by-step instructional chapter on how to set up a Ceph cluster. This chapter covers ceph-deploy for testing and goes on to cover Ansible. Finally, we take a look at the Rook project to deploy Ceph clusters running atop a Kubernetes cluster. A section on change management is also included, and explains how this is essential for the stability of large Ceph clusters. This chapter also provides you with a common platform on which you can build the examples that we use later in the book.

Chapter 3, BlueStore, explains that Ceph has to be able to provide atomic operations around data and metadata, and how FileStore was built to provide these guarantees on top of standard filesystems. We will also cover the problems with this approach.

The chapter then introduces BlueStore, and explains how it works and the problems that it solves. This will cover its components and how they interact with different types of storage devices. We'll give an overview of key-value stores, including RocksDB, which is used by BlueStore. Some of the BlueStore settings will be discussed, along with how they interact with different hardware configurations.

The chapter finishes by discussing the methods available to upgrade a cluster to BlueStore and walks the reader through an example upgrade.

Chapter 4, Ceph and Non-Native Protocols, discusses the storage abilities that Ceph provides to native Ceph clients and then highlights the issues in more legacy storage deployments where it currently doesn't enjoy widespread adoption. The chapter continues to explore the ways in which Ceph can be exported via NFS and iSCSI to clients that don't natively speak Ceph, and provides examples to demonstrate configuration.

Chapter 5, RADOS Pools and Client Access, explores how Ceph provides storage via the three main protocols of block, file, and object. This chapter discusses the use cases of each and the Ceph components used to provide each protocol. The chapter also covers the difference between replicated and erasure-coded pools, and takes a deeper look into the operation of erasure-coded pools.
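
The difference between replicated and erasure-coded pools mentioned above shows up most directly in storage overhead. The following is a minimal sketch, not taken from the book: it assumes a replicated pool keeps `size` full copies of each object and an erasure-coded pool splits each object into `k` data chunks plus `m` coding chunks. The function names are hypothetical and the figures ignore any per-object metadata overhead.

```python
from fractions import Fraction


def replicated_overhead(size: int) -> Fraction:
    """Raw bytes stored per usable byte in a replicated pool with `size` copies."""
    return Fraction(size)


def erasure_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per usable byte with k data chunks and m coding chunks."""
    return Fraction(k + m, k)


def usable_capacity(raw_tb: float, overhead: Fraction) -> float:
    """Usable capacity (same unit as raw_tb) for a given overhead ratio."""
    return raw_tb / float(overhead)


if __name__ == "__main__":
    raw = 300.0  # hypothetical raw capacity in TB
    print("3x replication:", usable_capacity(raw, replicated_overhead(3)), "TB usable")
    print("EC k=4, m=2:   ", usable_capacity(raw, erasure_overhead(4, 2)), "TB usable")
```

Under these assumptions, a k=4, m=2 profile tolerates the loss of two chunks while storing 1.5 raw bytes per usable byte, compared with 3 raw bytes per usable byte for three-way replication.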

Chapter 6, Developing with Librados, explains how librados is used to build applications that can interact directly with a Ceph cluster. It then moves on to several examples of using librados in different languages to give the reader an idea of how librados can be used, including in atomic transactions.

Chapter 7, Distributed Computation with Ceph RADOS Classes, discusses the benefits of moving processing directly into the OSD to effectively perform distributed computing. It then covers how to get started with RADOS classes by building simple ones with Lua. Finally, it examines how to build your own C++ RADOS class into the Ceph source tree and how to benchmark processing on the client against processing on the OSD.

Chapter 8, Monitoring Ceph, starts with a description of why monitoring is important, and discusses the difference between alerting and monitoring. The chapter then covers how to obtain performance counters from all the Ceph components, explains what some of the key counters mean, and how to convert them into usable values.

An example using Graphite shows the value of being able to manipulate captured data to provide more meaningful output in graph form. The chapter also looks at the new Ceph Dashboard, introduced in the Ceph Mimic release, with a step-by-step example of enabling it on a running Ceph cluster.
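
Converting raw counters into usable values, as described in the Chapter 8 summary, can be illustrated with a small sketch. It is not taken from the book. It assumes a latency-style counter exposed as a cumulative pair of `avgcount` (number of operations) and `sum` (total time in seconds), sampled at two points in time; the function name and the sample values are hypothetical.

```python
from typing import Optional


def interval_average(prev: dict, curr: dict) -> Optional[float]:
    """Average latency (seconds per op) between two cumulative samples.

    Each sample is assumed to look like {"avgcount": <ops>, "sum": <seconds>}.
    Returns None when no operations completed in the interval or when the
    counters went backwards (for example, after a daemon restart).
    """
    ops = curr["avgcount"] - prev["avgcount"]
    if ops <= 0:
        return None
    return (curr["sum"] - prev["sum"]) / ops


if __name__ == "__main__":
    # Hypothetical samples taken some seconds apart.
    earlier = {"avgcount": 100, "sum": 2.0}
    later = {"avgcount": 150, "sum": 3.0}
    avg = interval_average(earlier, later)
    print(f"average latency over interval: {avg * 1000:.1f} ms")
```

Working on the difference between two samples, rather than dividing the lifetime totals, gives a value that reflects recent behavior instead of an average since the daemon started.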

Chapter 9, Tuning Ceph, starts with a brief overview of how to tune Ceph and the operating system. It stresses the basic principle of not trying to tune something that is not a bottleneck, covers the areas that you may wish to tune, and establishes how to gauge the success of tuning attempts. It then shows how to benchmark Ceph and take baseline measurements, so that any results achieved are meaningful. Finally, it discusses different tools and how benchmarks might relate to real-life performance.

Chapter 10, Tiering with Ceph, explains how RADOS tiering works in Ceph, where it should be used, and its pitfalls. The chapter goes on to take the reader through a step-by-step guide to configuring tiering on a Ceph cluster, and finally covers the tuning options for extracting the best performance from tiering.

Chapter 11, Troubleshooting, outlines how, although Ceph is largely autonomous in taking care of itself and recovering from failure scenarios, in some cases, human intervention is required. This chapter will look at common errors and failure scenarios, and how to bring Ceph back to full health by troubleshooting them.

Chapter 12, Disaster Recovery, details how, when Ceph is in such a state that a complete loss of service or data loss has occurred, less familiar recovery techniques are required to restore access to the cluster and hopefully recover data. This chapter arms you with the knowledge to attempt recovery in these scenarios.