Ceph: Designing and Implementing Scalable Storage Systems

By: Michael Hackett, Vikhyat Umrao, Karan Singh, Nick Fisk, Anthony D'Atri, Vaibhav Bhembre

Overview of this book

This Learning Path takes you through the basics of Ceph all the way to gaining an in-depth understanding of its advanced features. You’ll gather skills to plan, deploy, and manage your Ceph cluster. After an introduction to the Ceph architecture and its core projects, you’ll be able to set up a Ceph cluster and learn how to monitor its health, improve its performance, and troubleshoot any issues. By following the step-by-step approach of this Learning Path, you’ll learn how Ceph integrates with OpenStack components such as Glance, Manila, Swift, and Cinder. With knowledge of federated architecture and CephFS, you’ll use Calamari and VSM to monitor the Ceph environment. In the upcoming chapters, you’ll study the key areas of Ceph, including BlueStore, erasure coding, and cache tiering. More specifically, you’ll discover what they can do for your storage system. In the concluding chapters, you will develop applications that use Librados and distributed computations with shared object classes, and see how Ceph and its supporting infrastructure can be optimized. By the end of this Learning Path, you'll have the practical knowledge of operating Ceph in a production environment.

This Learning Path includes content from the following Packt products:

  • Ceph Cookbook by Michael Hackett, Vikhyat Umrao, and Karan Singh
  • Mastering Ceph by Nick Fisk
  • Learning Ceph, Second Edition by Anthony D'Atri, Vaibhav Bhembre, and Karan Singh

Preface

This Learning Path takes you through the basics of Ceph all the way to gaining an in-depth understanding of its advanced features. You'll gather skills to plan, deploy, and manage your Ceph cluster. After an introduction to the Ceph architecture and its core projects, you'll be able to set up a Ceph cluster and learn how to monitor its health, improve its performance, and troubleshoot any issues.

By following the step-by-step approach of this Learning Path, you'll learn how Ceph integrates with OpenStack components such as Glance, Manila, Swift, and Cinder. With knowledge of federated architecture and CephFS, you'll use Calamari and VSM to monitor the Ceph environment. In the upcoming chapters, you'll study the key areas of Ceph, including BlueStore, erasure coding, and cache tiering. More specifically, you'll discover what they can do for your storage system. In the concluding chapters, you will develop applications that use Librados and distributed computations with shared object classes, and see how Ceph and its supporting infrastructure can be optimized. By the end of this Learning Path, you'll have the practical knowledge of operating Ceph in a production environment.

Who This Book Is For

If you are a developer, system administrator, storage professional, or cloud engineer who wants to understand how to deploy a Ceph cluster, this Learning Path is ideal for you. It will help you discover ways in which Ceph features can solve your data storage problems. Basic knowledge of storage systems and GNU/Linux will be beneficial.

What This Book Covers

Chapter 1, Ceph - Introduction and Beyond, introduces Ceph, then moves on to RAID and its challenges and an overview of the Ceph architecture. Finally, we will go through Ceph installation and configuration.

Chapter 2, Working with Ceph Block Device, introduces the Ceph Block Device and covers how to provision it. We will also go through RBD snapshots and clones, as well as implementing a disaster-recovery solution with RBD mirroring.

Chapter 3, Working with Ceph and OpenStack, covers configuring OpenStack clients for use with Ceph, as well as storage options for OpenStack using Cinder, Glance, and Nova.

Chapter 4, Working with Ceph Object Storage, takes a deep dive into Ceph object storage, including RGW setup and configuration, S3, and OpenStack Swift access. Finally, we will set up RGW with the Hadoop S3A plugin.

Chapter 5, Working with Ceph Object Storage Multi-Site v2, takes a deep dive into the new Multi-site v2, configuring two Ceph clusters to mirror objects between them in an object disaster recovery solution.

Chapter 6, Working with the Ceph Filesystem, introduces CephFS and covers deploying MDS and accessing CephFS via the kernel, FUSE, and NFS-Ganesha.

Chapter 7, Operating and Managing a Ceph Cluster, covers Ceph service management with systemd, and scaling up and scaling down a Ceph cluster. This chapter also includes failed disk replacement and upgrading Ceph infrastructures.

Chapter 8, Ceph under the Hood, explores the Ceph CRUSH map, including its internals and CRUSH tunables, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and Ceph placement groups (PGs). Finally, we create the specifics required for specific hardware.

Chapter 9, The Virtual Storage Manager for Ceph, introduces Virtual Storage Manager (VSM) and its architecture. We will also go through the deployment of VSM and then the creation of a Ceph cluster, using VSM to manage it.

Chapter 10, More on Ceph, covers Ceph benchmarking and Ceph troubleshooting using the admin socket, the API, and ceph-objectstore-tool. This chapter also covers the deployment of Ceph using Ansible and Ceph memory profiling. Furthermore, it covers health checking your Ceph cluster using Ceph Medic and the new experimental backend Ceph BlueStore.

Chapter 11, Deploying Ceph, is a no-nonsense step-by-step instructional chapter on how to set up a Ceph cluster. This chapter covers the ceph-deploy tool for testing and goes on to cover Ansible. A section on change management is also included, and it explains how this is essential for the stability of large Ceph clusters.

Chapter 12, BlueStore, explains that Ceph has to be able to provide atomic operations around data and metadata and how filestore was built to provide these guarantees on top of standard filesystems. We will also cover the problems around this approach. The chapter then introduces BlueStore and explains how it works and the problems that it solves. This will include the components and how they interact with different types of storage devices. We will also have an overview of key-value stores, including RocksDB, which is used by BlueStore. Some of the BlueStore settings and how they interact with different hardware configurations will be discussed.

Chapter 13, Erasure Coding for Better Storage Efficiency, covers how erasure coding works and how it's implemented in Ceph, including explanations of RADOS pool parameters and erasure coding profiles. A reference to the changes in the Kraken release will highlight the possibility of append-overwrites to erasure pools, which will allow RBDs to directly function on erasure-coded pools. Performance considerations will also be explained. This will include references to BlueStore, as it is required for sufficient performance. Finally, we have step-by-step instructions on actually setting up erasure coding on a pool, which can be used as a mechanical reference for sysadmins.

Chapter 14, Developing with Librados, explains how Librados is used to build applications that can interact directly with a Ceph cluster. It then moves on to several examples of using Librados in different languages to give you an idea of how it can be used, including atomic transactions.

Chapter 15, Distributed Computation with Ceph RADOS Classes, discusses the benefits of moving processing directly into the OSD to effectively perform distributed computing. It then covers how to get started with RADOS classes by building simple ones with Lua. Finally, it covers how to build your own C++ RADOS class into the Ceph source tree and conduct benchmarks comparing processing on the client with processing on the OSD.

Chapter 16, Tiering with Ceph, explains how RADOS tiering works in Ceph, where it should be used, and its pitfalls. It takes you step-by-step through configuring tiering on a Ceph cluster and finally covers the tuning options to extract the best performance for tiering. An example using Graphite will demonstrate the value of being able to manipulate captured data to provide more meaningful output in graph form.

Chapter 17, Troubleshooting, explains that although Ceph is largely autonomous in taking care of itself and recovering from failure scenarios, human intervention is sometimes required. We'll look at common errors and failure scenarios and how to bring Ceph back to full health by troubleshooting them.

Chapter 18, Disaster Recovery, covers situations when Ceph is in such a state that there is a complete loss of service or data loss has occurred. Less familiar recovery techniques are required to restore access to the cluster and, hopefully, recover data. This chapter arms you with the knowledge to attempt recovery in these scenarios.

Chapter 19, Operations and Maintenance, is a deep and wide inventory of day-to-day operations. We cover management of Ceph topologies, services, and configuration settings, as well as maintenance and debugging.

Chapter 20, Monitoring Ceph, is a comprehensive collection of commands, practices, and dashboard software to help keep a close eye on the health of Ceph clusters.

Chapter 21, Performance and Stability Tuning, provides a collection of Ceph, network, filesystem, and underlying operating system settings to optimize cluster performance and stability. Benchmarking of cluster performance is also explored.

To Get the Most out of This Book

This book requires that you have enough resources to run the whole Ceph lab environment. The minimum hardware or virtual requirements are as follows:

  • CPU: 2 cores
  • Memory: 8 GB RAM
  • Disk space: 40 GB
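
As a rough pre-flight check, the minimums listed above can be compared against the machine you plan to use. The following is only a minimal sketch, not part of the book's tooling: the function names check_requirements and probe_host are hypothetical, the memory probe assumes a Unix-like host where os.sysconf is available, and 1 GB is treated as 1024^3 bytes.

```python
import os
import shutil

# Minimums quoted in the list above.
MIN_CPU_CORES = 2
MIN_MEMORY_GB = 8
MIN_DISK_GB = 40

GIB = 1024 ** 3  # assumption: 1 GB is counted as 1024^3 bytes


def check_requirements(cpu_cores, memory_gb, disk_gb):
    """Return a list of shortfall messages (empty if every minimum is met)."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores, need {MIN_CPU_CORES}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"Memory: {memory_gb:.1f} GB, need {MIN_MEMORY_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {disk_gb:.1f} GB free, need {MIN_DISK_GB} GB")
    return problems


def probe_host(path="/"):
    """Best-effort probe of the local machine (hypothetical helper)."""
    cpu_cores = os.cpu_count() or 0
    try:
        # Physical memory = page size * number of physical pages (Unix-like only).
        memory_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        memory_gb = 0.0  # unknown; this will be reported as a shortfall
    disk_gb = shutil.disk_usage(path).free / GIB
    return cpu_cores, memory_gb, disk_gb


if __name__ == "__main__":
    shortfalls = check_requirements(*probe_host())
    if shortfalls:
        print("This host is below the suggested minimums:")
        for line in shortfalls:
            print("  " + line)
    else:
        print("This host meets the suggested minimums.")
```

A machine sold with 8 GB of RAM usually reports slightly less than that to the operating system, so a small memory shortfall from this check is not necessarily a problem.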

The various software components required to follow the instructions in the chapters are as follows:

 

Download the Example Code Files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Ceph-Designing-and-Implementing-Scalable-Storage-Systems. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

 

Conventions Used

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Verify the installrc file"

A block of code is set as follows:

AGENT_ADDRESS_LIST="192.168.123.101 192.168.123.102 192.168.123.103"
CONTROLLER_ADDRESS="192.168.123.100"

Any command-line input or output is written as follows:

vagrant plugin install vagrant-hostmanager

New terms and important words are shown in bold.

Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "It has probed OSDs 1 and 2 for the data, which means that it didn’t find anything it needed. It wants to try and poll OSD 0, but it can’t because the OSD is down, hence the message 'starting or marking this osd lost may let us proceed' appeared."

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Get in Touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report it to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.