Book Image

Getting Started with RethinkDB

By : Gianluca Tiepolo
Book Image

Getting Started with RethinkDB

By: Gianluca Tiepolo

Overview of this book

RethinkDB is a high-performance document-oriented database with a unique set of features. This increasingly popular NoSQL database is used to develop real-time web applications and, together with Node.js, it can be used to easily deploy them to the cloud with very little difficulty. Getting Started with RethinkDB is designed to get you working with RethinkDB as quickly as possible. Starting with the installation and configuration process, you will learn how to start importing data into the database and run simple queries using the intuitive ReQL query language. After successfully running a few simple queries, you will be introduced to other topics such as clustering and sharding. You will get to know how to set up a cluster of RethinkDB nodes and spread database load across multiple machines. We will then move on to advanced queries and optimization techniques. You will discover how to work with RethinkDB from a Node.js environment and find out all about deployment techniques. Finally, we’ll finish by working on a fully-fledged example that uses the Node.js framework and advanced features such as Changefeeds to develop a real-time web application.
Table of Contents (15 chapters)
Getting Started with RethinkDB
Credits
About the Author
Acknowledgement
About the Reviewer
www.PacktPub.com
Preface
Index

Configuring RethinkDB


Before you start the database server, there is a bit of fiddling to be done with the configuration file; as of now, if the database is correctly installed, you can run the rethinkdb command and RethinkDB will start up and create a data file in your current directory. The problem is that RethinkDB does not start up on boot by default and is not configured properly for long-term use. We will go over this procedure in the following section.

Running as a daemon

Once you've got RethinkDB installed, you'll probably want to run it as a daemon—a daemon is a software application or script that runs continuously in the background waiting to handle requests; this is how most production database servers run. You can configure RethinkDB to run like this too.

The default RethinkDB package includes various control scripts including the init script /etc/init.d/rethinkdb. These scripts are used to start, stop, and restart daemon processes. Depending on how you've installed the database, you probably already have the init script installed as packet managers such as apt and yum.

You can check if the control script is installed by running the following command in the terminal:

sudo /etc/init.d/rethinkdb status

If the init script is installed correctly, you will receive an output similar to the following:

rethinkdb: No instances defined in /etc/rethinkdb/instances.d/
rethinkdb: See http://www.rethinkdb.com/docs/guides/startup/ for more information

This message is normal and indicates that we have not yet created a configuration file for our database. You can now skip to the following section.

Tip

Depending on your operating system, the RethinkDB daemon script will be installed into a directory called init.d if you're using a SysV-style OS or a directory called rc.d for BSD-style systems. The preceding command uses init.d, but you must replace it with the correct directory for your system before actually running the command.

If, however, the preceding command results in an error such as Command not found, it means that the control script is not installed, and we must proceed to a manual installation. Thankfully, this is as easy as running two commands from the terminal:

sudo wget –O /etc/init.d/ https://raw.githubusercontent.com/rethinkdb/rethinkdb/next/packaging/assets/init/rethinkdb
sudo chmod +x /etc/init.d/rethinkdb

The first command will download the init script to the correct directory, whereas the second command will give it execution permissions. We are now ready to proceed with the creation of a configuration file for our database.

Creating a configuration file

RethinkDB is installed with a generic configuration file sample, suitable for light and casual use. This is perfect for giving RethinkDB a try but is hardly suitable for a production database application. So, we will now see how to edit the configuration for our database instance.

Note

On some platforms including OS X, a configuration file is not provided; however, you can specify all desired options by passing them as command-line arguments. Running RethinkDB followed by the help command will list all the available command-line options.

Instead of rewriting the full settings file, we will use the provided sample configuration as a starting point and proceed to edit it to customize the settings. The following commands copy the sample conf file into the correct directory and open the nano editor to edit it:

sudo cp /etc/rethinkdb/default.conf.sample /etc/rethinkdb/instances.d/instance1.conf
sudo nano /etc/rethinkdb/instances.d/instance1.conf

Here, I am using nano to edit the file, but you may use whatever text editor you prefer.

Note

If you built the database by compiling the source code, you may not have the sample configuration file. If this is the case, you can download it from https://github.com/rethinkdb/rethinkdb/blob/next/packaging/assets/config/default.conf.sample.

As you can see, the configuration file is extremely well-commented and very intuitive; however, there are a couple of important entries we need to look at in the following subsections.

  • bind: By default, RethinkDB will only bind on the local IP address 127.0.0.1. What this means is that only the server which hosts the database will be able to access the web interface, and no other machine will be able to access the data or join the cluster. This configuration can be useful for testing, but in a production environment where the database is probably running on a separate physical server than the application code, you will need to change this setting. Another reason why you may want to change this setting is if you're running RethinkDB on a cloud server as EC2, and you're accessing the server via SSH. We're going to change this setting so that the database will bind to all IP addresses including public IPs; to set this, just set the bind setting to all:

    bind=all

    Make sure to remove the leading hash symbol (#) as doing this will uncomment the line and make the configuration active.

    Note

    Note that there are security implications of exposing your database to the internet. We'll address these issues is Chapter 6, RethinkDB Administration and Deployment.

  • driver-port, cluster-port: These settings let you change the default ports on which RethinkDB will accept connections from client drivers and other servers in the cluster. Generally, they should not be changed unless these ports conflict with any other service that is being executed on the server. You may think that changing the default values could prevent someone from just guessing which ports you're using for your database; however, this doesn't really add any layer of security. We will discuss how to secure the database in Chapter 6, RethinkDB Administration and Deployment.

  • http-port: This setting controls which port the HTTP web interface will be accessible on. As with the previous options, change this value only if this port is already in use by another service.

  • join: The join setting allows you to connect your RethinkDB instance to another existing server to form a cluster. Suppose we have another RethinkDB instance running on a different server that has the IP address 192.168.1.100, we could connect this database to the existing cluster by editing the join setting as in the following:

    join=192.168.1.100:29015
  • Always remember to activate the setting by uncommenting the line (removing the hash).

Once you've configured all of the options appropriately, save the configuration file and exit the editor. Now that we have our database configured, we're ready to start the database server!