Learning Ceph

By: Karan Singh

Overview of this book

Ceph is an open source, software-defined storage solution that runs on commodity hardware and provides exabyte-level scalability. It is well known as a highly reliable storage system with no single point of failure.

This book will give you the skills you need to plan, deploy, and effectively manage your Ceph cluster, guiding you through an overview of Ceph's technology, architecture, and components. With a step-by-step, tutorial-style explanation of the deployment of each Ceph component, the book will take you through Ceph storage provisioning and integration with OpenStack.

You will then learn how to deploy and set up your Ceph cluster, exploring its various components and why each one is needed. This book takes you from a basic level of knowledge in Ceph to an expert understanding of its most advanced features.

The Ceph filesystem


The Ceph filesystem, also known as CephFS, is a POSIX-compliant filesystem that uses the Ceph storage cluster to store user data. CephFS is supported by a native Linux kernel driver, which makes it usable across virtually any flavor of the Linux OS. CephFS stores data and metadata separately, which improves performance and reliability for the applications hosted on top of it.

Inside a Ceph cluster, the Ceph filesystem library (libcephfs) works on top of the RADOS library (librados), which implements the Ceph storage cluster protocol and is common to file, block, and object storage. To use CephFS, you need at least one Ceph metadata server (MDS) configured on one of your cluster nodes. Keep in mind, however, that a lone MDS is a single point of failure for the Ceph filesystem. Once an MDS is configured, clients can use CephFS in multiple ways. To mount Ceph as a filesystem, clients can use the native Linux kernel client or the ceph-fuse (filesystem in user space) driver provided by the Ceph community.
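
The two mount paths described above can be sketched as follows. This is a minimal illustration, not a deployment recipe: it only builds the command lines for the kernel client (`mount -t ceph`) and for `ceph-fuse`, and does not run them. The helper names, the monitor address `192.168.0.1:6789`, the mount point `/mnt/cephfs`, the `admin` user, and the secret file path are all assumptions chosen for the example.

```python
def kernel_mount_command(monitor, mountpoint, user="admin",
                         secretfile="/etc/ceph/admin.secret"):
    """Build the argv for mounting CephFS with the in-kernel client.

    monitor is a "host:port" string for a Ceph monitor (hypothetical here).
    """
    source = f"{monitor}:/"  # mount the root of the filesystem
    options = f"name={user},secretfile={secretfile}"
    return ["mount", "-t", "ceph", source, mountpoint, "-o", options]


def fuse_mount_command(monitor, mountpoint):
    """Build the argv for mounting CephFS in user space with ceph-fuse."""
    return ["ceph-fuse", "-m", monitor, mountpoint]


if __name__ == "__main__":
    mon = "192.168.0.1:6789"  # assumed monitor address, replace with your own
    print(" ".join(kernel_mount_command(mon, "/mnt/cephfs")))
    print(" ".join(fuse_mount_command(mon, "/mnt/cephfs")))
```

Either list could be passed to `subprocess.run` as root on a client that has the Ceph packages and keys installed. Both approaches still depend on a running MDS in the cluster.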

In addition to this, clients can use third-party open source programs such as Ganesha for NFS and Samba for SMB/CIFS. These programs interact with libcephfs to store users' data in a reliable, distributed Ceph storage cluster. CephFS can also be used as a replacement for the Hadoop Distributed File System (HDFS). In that case, Hadoop likewise relies on the libcephfs component to store data in the Ceph cluster. To make this integration seamless, the Ceph community provides the required CephFS Java interface for Hadoop, along with Hadoop plugins. The libcephfs and librados components are very flexible, and you can even build a custom program that interacts with them and stores data in the underlying Ceph storage cluster.
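
As a rough sketch of what such a custom program might look like, the following uses the Python bindings for libcephfs (the `cephfs` module). It is written under several assumptions: that the bindings are installed, that `/etc/ceph/ceph.conf` and a usable keyring exist on the host, and that the cluster has an active MDS. The helper name `store_bytes` and the path `/hello.txt` are hypothetical, and the binding's call signatures may differ between Ceph releases, so treat this as an outline rather than a reference.

```python
def store_bytes(fs, path, data):
    """Write data to path through an already mounted libcephfs-style handle.

    fs is expected to offer open(path, flags, mode), write(fd, buf, offset)
    and close(fd), as the cephfs Python binding is assumed to do.
    """
    fd = fs.open(path, "w", 0o644)  # "w": create or truncate for writing
    try:
        return fs.write(fd, data, 0)  # write the buffer at offset 0
    finally:
        fs.close(fd)  # always release the file descriptor


def main():
    # Imported lazily so the module loads on hosts without the bindings.
    import cephfs

    fs = cephfs.LibCephFS(conffile="/etc/ceph/ceph.conf")  # assumed path
    fs.mount()
    try:
        store_bytes(fs, "/hello.txt", b"stored via libcephfs\n")
    finally:
        fs.unmount()
        fs.shutdown()
```

Calling `main()` on a configured client would write a small file directly into CephFS without mounting it through the kernel or FUSE. Keeping `store_bytes` separate from the connection setup makes it possible to exercise the write logic with a stand-in object when no cluster is available.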

At the time of writing this book, CephFS is the only component of the Ceph storage system that is not production-ready. It has been improving at a rapid pace and is expected to be production-ready soon. Currently, it is quite popular in testing and development environments, and it has evolved to include enterprise-oriented features such as dynamic rebalancing and subdirectory snapshots. The following diagram shows various ways in which CephFS can be used: