Book Image

Ceph Cookbook - Second Edition

By : Vikhyat Umrao, Karan Singh, Michael Hackett
Book Image

Ceph Cookbook - Second Edition

By: Vikhyat Umrao, Karan Singh, Michael Hackett

Overview of this book

Ceph is a unified distributed storage system designed for reliability and scalability. This technology has been transforming the software-defined storage industry and is evolving rapidly as a leader with its wide range of support for popular cloud platforms such as OpenStack, and CloudStack, and also for virtualized platforms. Ceph is backed by Red Hat and has been developed by community of developers which has gained immense traction in recent years. This book will guide you right from the basics of Ceph , such as creating blocks, object storage, and filesystem access, to advanced concepts such as cloud integration solutions. The book will also cover practical and easy to implement recipes on CephFS, RGW, and RBD with respect to the major stable release of Ceph Jewel. Towards the end of the book, recipes based on troubleshooting and best practices will help you get to grips with managing Ceph storage in a production environment. By the end of this book, you will have practical, hands-on experience of using Ceph efficiently for your storage requirements.
Table of Contents (15 chapters)

Ceph – the architectural overview

The Ceph internal architecture is pretty straightforward, and we will learn about it with the help of the following diagram:

  • Ceph monitors (MON): Ceph monitors track the health of the entire cluster by keeping a map of the cluster state. They maintain a separate map of information for each component, which includes an OSD map, MON map, PG map (discussed in later chapters), and CRUSH map. All the cluster nodes report to monitor nodes and share information about every change in their state. The monitor does not store actual data; this is the job of the OSD.
  • Ceph object storage device (OSD): As soon as your application issues a write operation to the Ceph cluster, data gets stored in the OSD in the form of objects.

This is the only component of the Ceph cluster where actual user data is stored, and the same data is retrieved when the client issues a read operation. Usually, one OSD daemon is tied to one physical disk in your cluster. So in general, the total number of physical disks in your Ceph cluster is the same as the number of OSD daemons working underneath to store user data on each physical disk.

  • Ceph metadata server (MDS): The MDS keeps track of file hierarchy and stores metadata only for the CephFS filesystem. The Ceph block device and RADOS gateway do not require metadata; hence, they do not need the Ceph MDS daemon. The MDS does not serve data directly to clients, thus removing the single point of failure from the system.
  • RADOS: The Reliable Autonomic Distributed Object Store (RADOS) is the foundation of the Ceph storage cluster. Everything in Ceph is stored in the form of objects, and the RADOS object store is responsible for storing these objects irrespective of their data types. The RADOS layer makes sure that data always remains consistent. To do this, it performs data replication, failure detection, and recovery, as well as data migration and rebalancing across cluster nodes.
  • librados: The librados library is a convenient way to gain access to RADOS with support to the PHP, Ruby, Java, Python, C, and C++ programming languages. It provides a native interface for the Ceph storage cluster (RADOS) as well as a base for other services, such as RBD, RGW, and CephFS, which are built on top of librados. librados also supports direct access to RADOS from applications with no HTTP overhead.
  • RADOS block devices (RBDs): RBDs, which are now known as the Ceph block device, provide persistent block storage, which is thin-provisioned, resizable, and stores data striped over multiple OSDs. The RBD service has been built as a native interface on top of librados.
  • RADOS gateway interface (RGW): RGW provides object storage service. It uses librgw (the Rados Gateway Library) and librados, allowing applications to establish connections with the Ceph object storage. The RGW provides RESTful APIs with interfaces that are compatible with Amazon S3 and OpenStack Swift.
  • CephFS: The Ceph filesystem provides a POSIX-compliant filesystem that uses the Ceph storage cluster to store user data on a filesystem. Like RBD and RGW, the CephFS service is also implemented as a native interface to librados.
  • Ceph manager: The Ceph manager daemon (ceph-mgr) was introduced in the Kraken release, and it runs alongside monitor daemons to provide additional monitoring and interfaces to external monitoring and management systems.