Book Image

Ceph: Designing and Implementing Scalable Storage Systems

By : Michael Hackett, Vikhyat Umrao, Karan Singh, Nick Fisk, Anthony D'Atri, Vaibhav Bhembre
Book Image

Ceph: Designing and Implementing Scalable Storage Systems

By: Michael Hackett, Vikhyat Umrao, Karan Singh, Nick Fisk, Anthony D'Atri, Vaibhav Bhembre

Overview of this book

This Learning Path takes you through the basics of Ceph all the way to gaining in-depth understanding of its advanced features. You’ll gather skills to plan, deploy, and manage your Ceph cluster. After an introduction to the Ceph architecture and its core projects, you’ll be able to set up a Ceph cluster and learn how to monitor its health, improve its performance, and troubleshoot any issues. By following the step-by-step approach of this Learning Path, you’ll learn how Ceph integrates with OpenStack, Glance, Manila, Swift, and Cinder. With knowledge of federated architecture and CephFS, you’ll use Calamari and VSM to monitor the Ceph environment. In the upcoming chapters, you’ll study the key areas of Ceph, including BlueStore, erasure coding, and cache tiering. More specifically, you’ll discover what they can do for your storage system. In the concluding chapters, you will develop applications that use Librados and distributed computations with shared object classes, and see how Ceph and its supporting infrastructure can be optimized. By the end of this Learning Path, you'll have the practical knowledge of operating Ceph in a production environment. This Learning Path includes content from the following Packt products: • Ceph Cookbook by Michael Hackett, Vikhyat Umrao and Karan Singh • Mastering Ceph by Nick Fisk • Learning Ceph, Second Edition by Anthony D'Atri, Vaibhav Bhembre and Karan Singh
Table of Contents (27 chapters)
Title Page
About Packt
Contributors
Preface
Index

Lost objects and inactive PGs


This section of the chapter will cover the scenario where a number of OSDs may have gone offline in a short period of time, leaving some objects with no valid replica copies. It's important to note that there is a difference between an object that has no remaining copies and an object that has a remaining copy, but it is known that another copy has had more recent writes. The latter is normally seen when running the cluster with min_size set to 1.

To demonstrate how to recover an object that has an out-of-date copy of data, let's perform a series of steps to break the cluster:

  1. First, let's set min_size to 1; hopefully, by the end of this example, you will see why you don't ever want to do this in real life:
sudo ceph osd pool set rbd min_size 1
  1. Create a test object that we will make later make Ceph believe is lost:
sudo rados -p rbd put lost_object logo.png
sudo ceph osd set norecover
sudo ceph osd set nobackfill

These two flags make sure that when the OSDS come back...