MongoDB Cookbook - Second Edition

By: Amol Nayak

Overview of this book

MongoDB is a high-performance and feature-rich NoSQL database that forms the backbone of the systems that power many different organizations – it’s easy to see why it’s the most popular NoSQL database on the market. Packed with many features that have become essential for many different types of software professionals and incredibly easy to use, this cookbook contains many solutions to the everyday challenges of MongoDB, as well as guidance on effective techniques to extend your skills and capabilities. This book starts with how to initialize the server in three different modes with various configurations. You will then be introduced to programming language drivers in both Java and Python. A new feature in MongoDB 3 is that you can connect to a single node using Python, set to make MongoDB even more popular with anyone working with Python. You will then learn a range of further topics including advanced query operations, monitoring and backup using MMS, as well as some very useful administration recipes including SCRAM-SHA-1 Authentication. Beyond that, you will also find recipes on cloud deployment, including guidance on how to work with Docker containers alongside MongoDB, integrating the database with Hadoop, and tips for improving developer productivity. Created as both an accessible tutorial and an easy to use resource, on hand whenever you need to solve a problem, MongoDB Cookbook will help you handle everything from administration to automation with MongoDB more effectively than ever before.
Table of Contents (17 chapters)
MongoDB Cookbook Second Edition
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Starting a single node instance using command-line options


In this recipe, we will see how to start a standalone single node server with some command-line options. We will see an example where we want to do the following:

  • Start the server listening on port 27000

  • Write the logs to /logs/mongo.log

  • Use /data/mongo/db as the database directory

As the server has been started for development purposes, we don't want to preallocate full-size database files. (We will soon see what this means.)

Getting ready

If you have already worked through the Installing single node MongoDB recipe, you need not do anything different; the prerequisites for this recipe are the same.

How to do it…

  1. The /data/mongo/db directory for the database and the /logs/ directory for the logs must exist on your filesystem, with appropriate permissions to write to them.

  2. Execute the following command:

    > mongod --port 27000 --dbpath /data/mongo/db --logpath /logs/mongo.log --smallfiles
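
The two steps above can also be scripted. The following is a minimal Python sketch and not part of the original recipe: it creates the directories and assembles the same mongod arguments. The helper name build_mongod_command and its base parameter (a path prefix, so that the sketch can run without root permissions) are assumptions made for illustration. Starting the process is left to the caller.

```python
import os
import tempfile


def build_mongod_command(base="/", port=27000):
    """Create the data and log directories under `base` and return the
    mongod argument list used in this recipe (hypothetical helper)."""
    db_dir = os.path.join(base, "data", "mongo", "db")
    log_dir = os.path.join(base, "logs")
    # Step 1: make sure both directories exist before mongod starts.
    os.makedirs(db_dir, exist_ok=True)
    os.makedirs(log_dir, exist_ok=True)
    # Step 2: the same options as the command line shown above.
    return [
        "mongod",
        "--port", str(port),
        "--dbpath", db_dir,
        "--logpath", os.path.join(log_dir, "mongo.log"),
        "--smallfiles",
    ]


if __name__ == "__main__":
    # Use a scratch directory so the sketch runs without special permissions.
    scratch = tempfile.mkdtemp()
    print(" ".join(build_mongod_command(base=scratch)))
```

If mongod is installed, the returned list can be handed to subprocess.Popen to start the server. With the default base="/", the paths match those used in this recipe.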
    

How it works…

This wasn't too difficult and is similar to the previous recipe, but we have some additional command-line options this time around. MongoDB supports quite a few options at startup; the following table lists the most common and important ones:

Option

Description

--help or -h

This prints information about the available startup options.

--config or -f

This specifies the location of the configuration file that contains all the configuration options. We will see more on this option in a later recipe. It is a convenient way of specifying the configuration in a file rather than on the command prompt, especially when many options are specified. Sharing one configuration file across different MongoDB instances also ensures that all the instances run with identical configurations.

--verbose or -v

This makes the logs more verbose; we can add more v's to make the output even more verbose, for example, -vvvvv.

--quiet

This gives a quieter output; it is the opposite of the --verbose or -v option. It keeps the logs clean and less chatty.

--port

This option is used to start the server listening on a port other than the default, 27017. We will use this option frequently whenever we start multiple mongo servers on the same machine. For example, --port 27018 will start the server listening on port 27018 for new connections.

--logpath

This provides a path to a log file where the logs will be written. The value defaults to STDOUT. For example, --logpath /logs/server.out will use /logs/server.out as the log file for the server. Remember that the value provided should be a file and not a directory where the logs will be written.

--logappend

This option appends to the existing log file, if any. The default behavior is to rename the existing log file and then create a new file for the logs of the newly started mongo instance. Suppose that the log file is named server.out and it already exists on startup; by default, it will be renamed to server.out.<timestamp>, where <timestamp> is the current time in GMT rather than local time. For example, if the current date is October 28, 2013 and the time is 12:02:15, the timestamp will be 2013-10-28T12-02-15.

--dbpath

This provides the directory where a new database will be created or where an existing database is present. The value defaults to /data/db. We will start the server using /data/mongo/db as the database directory. Note that the value should be a directory rather than the name of a file.

--smallfiles

This is used frequently for development purposes when we plan to start more than one mongo instance on our local machine. Mongo, on startup, creates a database file of size 64 MB (on 64-bit machines). This preallocation happens for performance reasons, and the file is created with zeros written to it to fill out space on the disk. Adding this option on startup creates a preallocated file of only 16 MB (again, on a 64-bit machine). This option also reduces the maximum size of the database and journal files: by default, each new data file is double the size of the previous one, up to a maximum of 2 GB, whereas with the --smallfiles option the maximum is 512 MB. Avoid using this option for production deployments.

--replSet

This option is used to start the server as a member of a replica set. The value of this argument is the name of the replica set, for example, --replSet repl1. You will learn more about this option in a later recipe where we will start a simple mongo replica set.

--configsvr

This option is used to start the server as a configuration server. The role of the configuration server will be made clearer when we set up a simple sharded environment in a later recipe in this chapter.

--shardsvr

This informs the mongod process that it is being started as a shard server. With this option, the server also listens on port 27018 instead of the default 27017. We will learn more about this option when we start a simple sharded server.

--oplogSize

This option sets the size of the oplog in megabytes. The oplog is the backbone of replication. It is a capped collection in which the data being written to the primary instance is stored in order to be replicated to the secondary instances. This collection resides in a database named local. On initialization of the replica set, the disk space for the oplog is preallocated, and the database file (for the local database) is filled with zeros as placeholders. The default value is 5% of the disk space, which should be good enough for most cases.

The size of the oplog is crucial because capped collections are of a fixed size and, on exceeding that size, discard their oldest documents to make space for new ones. A very small oplog can result in data being discarded before it has been replicated to the secondary nodes. A very large oplog can result in unnecessary disk space utilization and a longer replica set initialization.

For development purposes, when we start multiple server processes on the same host, we might want to keep the oplog size to a minimum so that the replica set initiates quickly and uses little disk space.

--storageEngine

MongoDB 3.0 introduced a new storage engine called WiredTiger. The previous storage engine, which remains the default, is now called mmapv1. To start MongoDB with WiredTiger instead of mmapv1, use the wiredTiger value with this option.

--directoryperdb

By default, MongoDB's database files are stored in a common directory (as provided in --dbpath). This option allows you to store each database in its own subdirectory in the aforementioned data directory. Having such granular control allows you to have separate disks for each database.
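
The capped-collection behavior described for --oplogSize in the table above can be illustrated with a small Python simulation. This is a simplified sketch and not MongoDB's implementation: it caps the oplog by number of entries rather than by size on disk, and the names TinyOplog and can_catch_up are hypothetical.

```python
from collections import deque


class TinyOplog:
    """Toy model of a capped oplog: holds at most `capacity` operations
    and drops the oldest one when full (assumption: capacity is counted
    in entries, whereas the real oplog is sized in megabytes)."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)
        self.next_seq = 1

    def write(self, op):
        # Each write on the primary is recorded with an increasing sequence number.
        self.entries.append((self.next_seq, op))
        self.next_seq += 1

    def can_catch_up(self, last_applied_seq):
        """True if a secondary that has applied everything up to
        `last_applied_seq` can still find the next operation it needs."""
        if not self.entries:
            return True
        oldest_seq = self.entries[0][0]
        return last_applied_seq + 1 >= oldest_seq


if __name__ == "__main__":
    oplog = TinyOplog(capacity=3)
    for i in range(5):
        oplog.write({"insert": i})
    # Operations 1 and 2 have been discarded; only 3, 4 and 5 remain.
    print(oplog.can_catch_up(last_applied_seq=1))  # False: operation 2 is gone
    print(oplog.can_catch_up(last_applied_seq=2))  # True: operation 3 is still present
```

A secondary that falls further behind than the oplog can hold cannot resume from where it stopped, which is why a very small oplog is risky outside development setups.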

There's more…

The preceding list of options is not exhaustive, and we will see more options in later recipes as and when we need them. For an exhaustive list of the available options, use the --help or -h option. In the next recipe, we will see how to use a configuration file instead of command-line arguments.

See also

  • The Single node installation of MongoDB with options from config file recipe, for using configuration files to provide startup options

  • The Starting multiple instances as part of a replica set recipe, to start a replica set

  • The Starting a simple sharded environment of two shards recipe, to set up a sharded environment