Getting Started with RethinkDB

By: Gianluca Tiepolo

Overview of this book

RethinkDB is a high-performance document-oriented database with a unique set of features. This increasingly popular NoSQL database is used to develop real-time web applications and, together with Node.js, makes it easy to deploy them to the cloud. Getting Started with RethinkDB is designed to get you working with RethinkDB as quickly as possible. Starting with the installation and configuration process, you will learn how to import data into the database and run simple queries using the intuitive ReQL query language. You will then be introduced to other topics such as clustering and sharding: you will learn how to set up a cluster of RethinkDB nodes and spread database load across multiple machines. We will then move on to advanced queries and optimization techniques. You will discover how to work with RethinkDB from a Node.js environment and learn about deployment techniques. Finally, we’ll finish by working on a fully fledged example that uses Node.js and advanced features such as Changefeeds to develop a real-time web application.
Table of Contents (15 chapters)
Getting Started with RethinkDB
Credits
About the Author
Acknowledgement
About the Reviewer
www.PacktPub.com
Preface
Index

Clustering RethinkDB


Clustering, in RethinkDB's context, refers to the ability of several servers or instances to support a single database. An instance can be a database process running on the same machine as other RethinkDB instances, or it can run on a completely separate server.

Clustering offers three major advantages, especially for databases with large datasets:

  • Data scalability

  • Fault tolerance

  • Load balancing

As we have seen in the previous sections, clustering addresses the problem of growing data volume: because a single machine has limited capacity, spreading the data across multiple instances allows us to store more than any one machine could hold.
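To make the idea of spreading data across instances concrete, here is a minimal sketch in plain Python. It is a toy model rather than RethinkDB's actual implementation: the instance names and split points are hypothetical, and the sketch simply assumes that documents are assigned to an instance according to the range their primary key falls into.

```python
from bisect import bisect_right

# Hypothetical split points on the primary key (assumptions for illustration):
#   keys below "h"              -> first instance
#   keys from "h" up to "p"     -> second instance ("p" itself excluded)
#   keys from "p" upwards       -> third instance
SPLIT_POINTS = ["h", "p"]
INSTANCES = ["instance-a", "instance-b", "instance-c"]  # made-up names


def instance_for(key: str) -> str:
    """Return the instance that would hold the document with this primary key."""
    return INSTANCES[bisect_right(SPLIT_POINTS, key)]


def distribute(keys):
    """Group a collection of primary keys by the instance that would store them."""
    placement = {name: [] for name in INSTANCES}
    for key in keys:
        placement[instance_for(key)].append(key)
    return placement


if __name__ == "__main__":
    for name, keys in distribute(["alice", "henry", "maria", "zoe"]).items():
        print(name, keys)
```

Each instance ends up holding only a slice of the key space, so adding instances (and split points) raises the total amount of data the cluster can store.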

Clustering also provides us with additional fault tolerance; that is, in the event that a software component fails, a backup component or procedure can immediately take its place. In fact, in a clustered environment, because there is more than one instance for the client to connect to, there will always be an alternative endpoint for the client in the event of an individual...
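The notion of an alternative endpoint can be sketched from the client's side as follows. This is only an illustration under stated assumptions: the helper `connect_with_failover` and the host names are hypothetical, and `connect` stands for whatever function your driver provides to open a connection. The sketch assumes that a failed attempt raises `OSError`; a real driver may raise its own exception type, which you would catch instead.

```python
def connect_with_failover(endpoints, connect):
    """Try each (host, port) in order; return the first endpoint that accepts a connection.

    `connect` is any callable taking (host, port) and returning a connection
    object. It is assumed to raise OSError when the instance is unreachable.
    """
    errors = []
    for host, port in endpoints:
        try:
            return (host, port), connect(host, port)
        except OSError as exc:
            # Remember why this instance failed, then fall through to the next one.
            errors.append(f"{host}:{port} -> {exc}")
    raise ConnectionError("no instance reachable: " + "; ".join(errors))


if __name__ == "__main__":
    # Stand-in for a real driver call: pretend the first instance is down.
    def fake_connect(host, port):
        if host == "db1.example.com":
            raise OSError("connection refused")
        return f"connection to {host}:{port}"

    # Hypothetical hosts; 28015 is RethinkDB's default client driver port.
    cluster = [("db1.example.com", 28015), ("db2.example.com", 28015)]
    print(connect_with_failover(cluster, fake_connect))
```

Passing `connect` in as a parameter keeps the sketch independent of any particular driver and makes the failover logic easy to exercise without a running cluster.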