MongoDB Administrator's Guide

By: Cyrus Dasadia

Overview of this book

MongoDB is a high-performance, feature-rich NoSQL database that forms the backbone of the systems powering many different organizations. It is easy to use and packed with features that have become essential for many types of software professionals. This cookbook contains more than 100 recipes that address the everyday challenges of working with MongoDB. Starting with database configuration, you will move on to the indexing aspects of MongoDB. The book also includes practical recipes for optimizing database query performance, performing diagnostics, and debugging queries. You will learn how to implement the core administration tasks required for high availability and scalability, achieved through replica sets and sharding, respectively. You will then implement server security concepts such as authentication, user management, role-based access models, and TLS configuration, and learn how to back up and recover your database efficiently and monitor server performance. By the end of this book, you will have all the information you need, along with tips, tricks, and best practices, to implement a high-performance MongoDB solution.

Choosing the right MongoDB storage engine

WiredTiger was introduced as a new storage engine in MongoDB 3.0 and became the default in version 3.2. Until then, MMAPv1 was the default storage engine. I will give you a brief rundown of the main features of both storage engines, which should give you enough information to decide which one suits your application's requirements.

WiredTiger

WiredTiger allows multiple clients to perform write operations on the same collection at the same time. This is achieved through document-level concurrency: during a write operation, the database locks only the affected document, whereas its predecessors would lock the entire collection. This drastically improves performance for write-heavy applications. Additionally, WiredTiger compresses the data in both indexes and collections. The compression algorithms currently used by WiredTiger are Google's Snappy and zlib. Although compression can be disabled, you should not jump the gun and do so unless you have load-tested the change while planning your storage strategy.
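
As a minimal sketch of how the compression choices above are expressed, the following Python snippet renders a mongod YAML configuration fragment. The keys under `storage.wiredTiger` are MongoDB configuration file options; the helper name `render_wiredtiger_config` and the `/data/db` path are assumptions made for illustration only.

```python
def render_wiredtiger_config(db_path="/data/db", block_compressor="snappy",
                             prefix_compression=True):
    """Return a mongod YAML fragment selecting WiredTiger and its compression.

    Hypothetical helper: it only builds text, it does not talk to MongoDB.
    """
    # Compressors named in the text, plus "none" to disable compression.
    allowed = {"snappy", "zlib", "none"}
    if block_compressor not in allowed:
        raise ValueError(f"block_compressor must be one of {sorted(allowed)}")
    lines = [
        "storage:",
        f"  dbPath: {db_path}",
        "  engine: wiredTiger",
        "  wiredTiger:",
        "    collectionConfig:",
        f"      blockCompressor: {block_compressor}",
        "    indexConfig:",
        f"      prefixCompression: {str(prefix_compression).lower()}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Trade more CPU for a better compression ratio by choosing zlib.
    print(render_wiredtiger_config(block_compressor="zlib"))
```

Rendering the fragment for each candidate compressor gives you one configuration per load-test run, which makes it easier to compare disk usage and throughput before settling on a setting.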

WiredTiger uses Multi-Version Concurrency Control (MVCC), which lets it take point-in-time snapshots of transactions. These finalized snapshots are written to disk as checkpoints. The checkpoints determine the last good state of the data files and help recover data after an abnormal shutdown. WiredTiger also supports journaling, in which write-ahead transaction logs are maintained. The combination of journaling and checkpoints increases the chance of data recovery after a failure. WiredTiger uses its internal cache as well as the filesystem cache to respond to queries faster. It was designed with high concurrency in mind, so its architecture makes better use of multi-core systems.
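
To make the interplay between checkpoints and the journal more concrete, here is a toy model in Python. It is only a sketch of the idea described above and not WiredTiger's actual implementation or on-disk format; the `ToyStore` class and all of its method names are hypothetical.

```python
import copy


class ToyStore:
    """Toy key-value store with a write-ahead journal and checkpoints.

    Illustrative only: it models the recovery idea, not WiredTiger itself.
    """

    def __init__(self):
        self.data = {}          # in-memory state (lost on a crash)
        self.checkpoint = {}    # last consistent snapshot "on disk"
        self.journal = []       # writes logged since the last checkpoint

    def write(self, key, value):
        # Write-ahead: record the change in the journal before applying it.
        self.journal.append((key, value))
        self.data[key] = value

    def take_checkpoint(self):
        # Persist a consistent snapshot; older journal entries are no longer needed.
        self.checkpoint = copy.deepcopy(self.data)
        self.journal.clear()

    def crash(self):
        # Simulate an abnormal shutdown: the in-memory state disappears.
        self.data = {}

    def recover(self):
        # Start from the last checkpoint, then replay the journaled writes.
        self.data = copy.deepcopy(self.checkpoint)
        for key, value in self.journal:
            self.data[key] = value
        return self.data


if __name__ == "__main__":
    store = ToyStore()
    store.write("a", 1)
    store.take_checkpoint()
    store.write("b", 2)      # journaled, but not yet part of a checkpoint
    store.crash()
    print(store.recover())   # prints {'a': 1, 'b': 2}
```

Without the journal, recovery would stop at the checkpoint and the write to `b` would be lost, which is why the two mechanisms together improve the chance of recovering data.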

MMAPv1

MMAPv1 is quite mature and has proven to be stable over the years. One of the storage allocation strategies used with this engine is the power-of-two allocation strategy. Each document is allocated a record whose size is rounded up to a power of two, so that the spare space makes in-place updates highly likely and documents rarely have to be moved when they grow. Another storage strategy used with this engine is fixed sizing. Here, documents are padded (for example, with zeros) so that each document is allocated its maximum size up front. This strategy is usually followed by applications that have fewer updates.
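
The arithmetic behind power-of-two allocation can be sketched in a few lines of Python. This is a simplified model rather than MMAPv1's code: the function name `power_of_two_allocation` is hypothetical, the 32-byte minimum is an assumption, and any special handling of very large documents is ignored.

```python
def power_of_two_allocation(document_size, minimum=32):
    """Round a document size in bytes up to the next power of two.

    Simplified model: `minimum` is an assumed smallest record size, and
    special handling of very large documents is ignored.
    """
    if document_size <= 0:
        raise ValueError("document_size must be positive")
    allocated = minimum
    while allocated < document_size:
        allocated *= 2
    return allocated


if __name__ == "__main__":
    for size in (20, 100, 128, 700):
        allocated = power_of_two_allocation(size)
        print(f"{size:>4} bytes -> {allocated:>4} allocated, "
              f"{allocated - size} bytes of room to grow")
```

A 100-byte document, for example, lands in a 128-byte record, so it can grow by up to 28 bytes before it has to be moved to a larger record.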

Consistency in MMAPv1 is achieved by journaling. Writes are first applied to a private view in memory and then written to the on-disk journal, after which the changes are written to a shared view, that is, the data files. MMAPv1 does not support data compression. Lastly, MMAPv1 relies heavily on the page cache and therefore uses available memory to keep the working dataset cached, which provides good performance. MongoDB does, however, yield (free up) memory used for the cache if another process demands it. Some production deployments avoid enabling swap space so that these caches are not written to disk, which may degrade performance.

The verdict

So which storage engine should you choose? Given the points above, I personally feel that you should go with WiredTiger, as document-level concurrency alone is a good indicator of better performance. However, as with all engineering decisions, you should not shy away from load testing your application against both storage engines.
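
One way to prepare such a comparison is to run one mongod instance per engine and point the same load test at each. The Python sketch below only builds the command lines; `--storageEngine`, `--dbpath`, and `--port` are mongod options, while the helper name, the data directories, and the ports are assumptions for illustration. It also assumes a MongoDB release that still ships MMAPv1.

```python
def mongod_command(engine, db_path, port):
    """Build an argv list that starts mongod with the given storage engine."""
    if engine not in ("wiredTiger", "mmapv1"):
        raise ValueError("engine must be 'wiredTiger' or 'mmapv1'")
    return ["mongod", "--storageEngine", engine,
            "--dbpath", db_path, "--port", str(port)]


# Hypothetical test layout: each engine gets its own data directory and port,
# because data files written by one engine cannot be opened by the other.
test_instances = {
    "wiredTiger": mongod_command("wiredTiger", "/data/wt-test", 27018),
    "mmapv1": mongod_command("mmapv1", "/data/mmap-test", 27019),
}

if __name__ == "__main__":
    for engine, argv in test_instances.items():
        print(engine, "->", " ".join(argv))
```

Each argv list can be passed to `subprocess.Popen` or pasted into a shell, after which the same workload can be replayed against both ports and the results compared.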

Note

The enterprise version of MongoDB also provides an in-memory storage engine and supports encryption at rest. These are good features to have, depending on your application's requirements.
