
Getting Started with RethinkDB

By: Gianluca Tiepolo

Overview of this book

RethinkDB is a high-performance document-oriented database with a unique set of features. This increasingly popular NoSQL database is used to develop real-time web applications, and together with Node.js it can be deployed to the cloud with very little difficulty. Getting Started with RethinkDB is designed to get you working with RethinkDB as quickly as possible. Starting with the installation and configuration process, you will learn how to import data into the database and run simple queries using the intuitive ReQL query language. After running a few simple queries, you will be introduced to other topics such as clustering and sharding: you will learn how to set up a cluster of RethinkDB nodes and spread database load across multiple machines. We will then move on to advanced queries and optimization techniques. You will discover how to work with RethinkDB from a Node.js environment and find out all about deployment techniques. Finally, we'll finish by working on a fully-fledged example that uses the Node.js framework and advanced features such as Changefeeds to develop a real-time web application.
Table of Contents (15 chapters)
Getting Started with RethinkDB
Credits
About the Author
Acknowledgement
About the Reviewer
www.PacktPub.com
Preface
Index

Replication


Replication is a way of keeping copies of your data on multiple servers. Why would we need replication? Because maintaining multiple copies of the same data provides redundancy and increases data availability.

There are actually countless reasons why replicating your data is a good idea, but they generally boil down to two major ones:

  • Availability

  • Scalability

If, for any reason, your main server goes down, then you don't want to lose any data, and you probably want another server to immediately start serving data to the clients. If your dataset is reasonably small, then in the event of a failure, you could just spin up a new server and restore data from a backup. However, if the dataset is large, the restore process could take hours! To avoid this downtime, it's a good idea to replicate your data.
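As a sketch of what this looks like in practice, RethinkDB lets you configure replication per table with the `reconfigure` command. The example below assumes a cluster with at least three servers already joined and a hypothetical `users` table, using the official `rethinkdb` Node.js driver:

```javascript
// Sketch: keep three copies of the 'users' table (hypothetical table name).
// Assumes a RethinkDB cluster of at least 3 servers and the official
// 'rethinkdb' driver installed via npm.
var r = require('rethinkdb');

r.connect({host: 'localhost', port: 28015}, function(err, conn) {
  if (err) throw err;
  // Keep a single shard but store 3 replicas of the data,
  // so the table survives the loss of any one server.
  r.table('users').reconfigure({shards: 1, replicas: 3})
    .run(conn, function(err, result) {
      if (err) throw err;
      console.log(result.config_changes);
      conn.close();
    });
});
```

If a server holding a replica fails, the remaining replicas can continue serving the table without a lengthy restore from backup.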

The other main reason is scalability. If your database has lots of traffic, it may become slow and unresponsive, degrading the clients' experience...
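Replicas can also help absorb read traffic. As a hedged sketch (again assuming a replicated `users` table), the driver's `readMode` option lets any up-to-date or even out-of-date replica answer a query, trading strict consistency for lower latency under load:

```javascript
// Sketch: spread read load across replicas of the 'users' table
// (hypothetical table name) using the official 'rethinkdb' driver.
var r = require('rethinkdb');

r.connect({host: 'localhost', port: 28015}, function(err, conn) {
  if (err) throw err;
  // readMode: 'outdated' allows any replica to answer the query,
  // which may return slightly stale data but reduces pressure
  // on the primary under heavy read traffic.
  r.table('users', {readMode: 'outdated'}).count()
    .run(conn, function(err, total) {
      if (err) throw err;
      console.log('documents:', total);
      conn.close();
    });
});
```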