Mastering Ceph - Second Edition

By: Nick Fisk

Overview of this book

Ceph is an open source distributed storage system that scales to exabyte deployments. This second edition of Mastering Ceph takes you a step closer to becoming an expert on Ceph. You’ll get started by understanding the design goals and planning steps that should be undertaken to ensure successful deployments. In the next sections, you’ll be guided through setting up and deploying a Ceph cluster with the help of orchestration tools. This will allow you to see Ceph’s scalability, its erasure coding (data protection) mechanism, and its automated data backup features across multiple servers. You’ll then discover more about key areas of Ceph, including BlueStore, erasure coding, and cache tiering, with the help of examples. Next, you’ll learn some of the ways to export Ceph into non-native environments and understand some of the pitfalls that you may encounter. The book features a section on tuning that takes you through the process of optimizing both Ceph and its supporting infrastructure. You’ll also learn to develop applications that use librados and to run distributed computations with shared object classes. Toward the concluding chapters, you’ll learn to troubleshoot issues and handle various scenarios where Ceph is not likely to recover on its own. By the end of this book, you’ll be able to master storage management with Ceph and generate solutions for managing your infrastructure.
Table of Contents (18 chapters)
Section 1: Planning And Deployment
Section 2: Operating and Tuning
Section 3: Troubleshooting and Recovery

Investigating asserts

Assertions are used in Ceph to ensure that, during the execution of the code, any assumptions that have been made about the operating environment remain true. These assertions are scattered throughout the Ceph code and are designed to catch any conditions that may go on to cause further problems if the code is not stopped.
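
To make the idea concrete, here is a minimal sketch of an assertion guarding an assumption. It is written in Python purely for illustration and is not taken from the Ceph source; the function name `read_object_size` and the `metadata` dictionary layout are hypothetical.

```python
def read_object_size(metadata: dict) -> int:
    """Return an object's recorded size, stopping if the record looks wrong.

    Hypothetical example: `metadata` stands in for whatever on-disk record
    a storage daemon might read back. It is not a real Ceph structure.
    """
    size = metadata.get("size")

    # Assumption being checked: every record carries a non-negative integer
    # size. If that is not true, continuing could spread the bad value
    # further, so execution is stopped here instead.
    assert isinstance(size, int) and size >= 0, f"unexpected object size: {size!r}"

    return size
```

Calling `read_object_size({"size": 4096})` returns the size as normal, whereas a record such as `{"size": -1}` raises `AssertionError` at the point where the bad value is first seen, rather than letting it propagate into later code.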

If you trigger an assertion in Ceph, it's likely that some piece of data has an unexpected value. This may be caused by some form of corruption or by an unhandled bug.

If an OSD hits an assert and refuses to restart, the usual recommendation is to destroy the OSD, recreate it, and then let Ceph backfill objects to it. If you have a reproducible failure scenario, it is probably also worth filing a bug in the Ceph bug tracker.

As mentioned, OSDs can fail either due to hardware or software faults in either the stored...