Learning Ceph

By: Karan Singh

Overview of this book

Ceph is an open source, software-defined storage solution, which runs on commodity hardware to provide exabyte-level scalability. It is well known to be a highly reliable storage system that has no single point of failure.

This book will give you all the skills you need to plan, deploy, and effectively manage your Ceph cluster, guiding you through an overview of Ceph's technology, architecture, and components. With a step-by-step, tutorial-style explanation of the deployment of each Ceph component, the book will take you through Ceph storage provisioning and integration with OpenStack.

You will then discover how to deploy and set up your Ceph cluster, exploring the various components and why we need them. This book takes you from a basic level of knowledge in Ceph to an expert understanding of its most advanced features.
Table of Contents (18 chapters)
Learning Ceph
Credits
Foreword
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Foreword

We like to call Ceph the "future of storage", a message that resonates with people at a number of different levels. For system designers, the Ceph system architecture captures the requirements for the types of systems everyone is trying to build; it is horizontally scalable, fault-tolerant by design, modular, and extensible. For users, Ceph provides a range of storage interfaces for both legacy and emerging workloads and can run on a broad range of commodity hardware, allowing production clusters to be deployed with a modest capital investment. For free software enthusiasts, Ceph pushes this technical envelope with a code base that is completely open source and free for all to inspect, modify, and improve in an industry still dominated by expensive and proprietary options.

The Ceph project began as a research initiative at the University of California, Santa Cruz, funded by several Department of Energy laboratories (Los Alamos, Lawrence Livermore, and Sandia). The goal was to further enhance the design of petabyte-scale, object-based storage systems. When I joined the group in 2005, my initial focus was on scalable metadata management for the filesystem—how to distribute management of the file and directory hierarchy across many servers so that the system could cope with a million processors in a supercomputer, dumping files into the filesystem, often in the same directory and at the same time. Over the course of the next 3 years, we incorporated the key ideas from years of research and built a complete architecture and working implementation of the system.

When we published the original academic paper describing Ceph in 2006 and the code was open sourced and posted online, I thought my work was largely complete. The system "worked", and now the magic of open source communities and collaborative development could kick in and quickly transform Ceph into the free software I'd always wanted to exist to run in my own data center. It took time for me to realize that there is a huge gap between prototype and production code, and effective free software communities are built over time. As we continued to develop Ceph over the next several years, the motivation remained the same. We built a cutting-edge distributed storage system that was completely free (as in beer and speech) and could do to the storage industry what Linux did to the server market.

Building a vibrant user and developer community around the Ceph project has been the most rewarding part of this experience. While building the Inktank business to productize Ceph in 2012 and 2013, the community was a common topic of conversation and scrutiny. The question at that point in time was: how do we invest and hire to build a community of experts and contributors who do not work for us? I believe it was a keen attention to and understanding of the open source model that ultimately made Inktank and Ceph a success. We sought to build an ecosystem of users, partners, and competitors that we could lead, not dominate.

Karan Singh has been one such member of the community who materialized around Ceph over the last several years. He is an early and active member of the e-mail- and IRC-based discussion forums, where Ceph users and developers meet online to conduct their business, whether it is finding help to get started with Ceph, discussing optimal hardware or software configuration options, sharing crash reports and tracking down bugs, or collaborating in the development of new features.

Although we have known each other online for several years now, I recently had the opportunity to meet Karan in person and only then discovered that he has been hard at work writing a book on Ceph. I find it fitting and a testament to the diversity and success of the community we have built that this book, the first published about Ceph, is written by someone with no direct ties to the original Ceph research team or the Inktank business that helped push it into the limelight. Karan's long background with Ceph and deep roots in the community gave him an ideal perspective on the technology, its impact, and the all-important user experience.

Sage Weil

Ceph Principal Architect, Red Hat