Mastering RethinkDB

By: Shahid Shaikh

Overview of this book

RethinkDB has a lot to be excited about: ReQL (its readable, highly functional syntax), cluster management, primitives for 21st-century applications, and changefeeds. This book starts with a brief overview of the RethinkDB architecture and data modeling, followed by coverage of the advanced ReQL queries used to work with JSON documents. You will then quickly move on to implementing these concepts in real-world scenarios by building real-time applications for polling, data synchronization, the share market, and the geospatial domain using RethinkDB and Node.js. You will also see how to tune RethinkDB for faster data processing by exploring the sharding and replication techniques in depth. After that, we will take you through the more advanced administration tasks and show you various deployment techniques using PaaS, Docker, and Compose. By the time you have finished reading this book, you will have taken your knowledge of RethinkDB to the next level and will be able to use its concepts to develop efficient, real-time applications with ease.

Sharding the table to scale the database


Replication puts a copy of the same table on different RethinkDB instances in the cluster, while sharding splits the table's data and distributes the pieces across different instances in the cluster. As we studied in Chapter 1, RethinkDB Architecture and Data Model, RethinkDB uses the range sharding algorithm to split the records.

You can refer to that chapter for more details on the algorithm; in this section, we will shard a table in our cluster.

So let's take our cluster again and shard a table. Again, we have two options: do it via the web console or via ReQL. I am going to use the web console here.
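For readers who prefer the ReQL route, the same operation can be sketched with the `reconfigure` command of the official Node.js driver. This is a minimal sketch and not a listing from the book: the helper name `shardTable`, the table name, and the shard and replica counts are assumptions made for illustration, and the driver module and an open connection are passed in by the caller.

```javascript
// Hypothetical helper (not from the book): reshard a table using ReQL's
// reconfigure command. `r` is the rethinkdb driver module and `conn` an
// open connection; both are supplied by the caller.
function shardTable(r, conn, tableName, shards, replicas) {
  if (!Number.isInteger(shards) || shards < 1) {
    throw new RangeError('shards must be a positive integer');
  }
  if (!Number.isInteger(replicas) || replicas < 1) {
    throw new RangeError('replicas must be a positive integer');
  }
  // reconfigure() asks the cluster to redistribute the table into the
  // requested number of shards and replicas; run() returns whatever the
  // driver returns (a promise when no callback is given).
  return r.table(tableName).reconfigure({ shards, replicas }).run(conn);
}
```

With the driver installed, a call inside an async function might look like `const r = require('rethinkdb'); const conn = await r.connect({ host: 'localhost', port: 28015 }); await shardTable(r, conn, 'users', 2, 1);`, where `users` is a placeholder table name.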

As you can see in the following image, we have about 900 documents in the table with random IDs (remember the mock data we generated in Chapter 3, Data Exploration Using RethinkDB?). I mention the IDs because the range sharding algorithm is going to partition our data on the basis of these IDs.
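To make partitioning by ID more concrete, here is a toy illustration in JavaScript. It is not RethinkDB's internal implementation: it assumes, purely for illustration, that IDs are strings and that split points are chosen from the sorted IDs so that each shard receives a similar number of documents. The function names `pickSplitPoints` and `shardFor` are hypothetical.

```javascript
// Toy illustration of range sharding by primary key (not RethinkDB internals).

// Pick shardCount - 1 split points from the sorted ids so that each key
// range holds a similar number of documents. Assumes ids are strings.
function pickSplitPoints(ids, shardCount) {
  const sorted = [...ids].sort(); // default sort compares as strings
  const splits = [];
  for (let i = 1; i < shardCount; i++) {
    splits.push(sorted[Math.floor((i * sorted.length) / shardCount)]);
  }
  return splits;
}

// Return the index of the shard whose key range contains the id.
// Shard k holds the ids with splits[k - 1] <= id < splits[k].
function shardFor(id, splits) {
  let shard = 0;
  while (shard < splits.length && id >= splits[shard]) {
    shard++;
  }
  return shard;
}
```

The point of the sketch is that a document's shard is decided only by where its ID falls between the split points, which is why the distribution of IDs matters when the table is sharded.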

For ease of understanding, let...