MongoDB 4 Quick Start Guide

By: Doug Bierer

Overview of this book

MongoDB has grown to become the de facto NoSQL database with millions of users, from small start-ups to Fortune 500 companies. It can solve problems that are considered difficult, if not impossible, for aging RDBMS technologies. Written for version 4 of MongoDB, this book is the easiest way to get started with MongoDB. You will start by getting a MongoDB installation up and running in a safe and secure manner. You will learn how to perform mission-critical create, read, update, and delete operations, and set up database security. You will also learn about advanced features of MongoDB such as the aggregation pipeline, replication, and sharding. You will learn how to build a simple web application that uses MongoDB to respond to AJAX queries, and see how to make use of the MongoDB programming language driver for PHP. The examples incorporate new features available in MongoDB version 4 where appropriate.
Table of Contents (11 chapters)

Overview of MongoDB

MongoDB represents a radical and much-needed departure from relational database technology. Dr. Edgar F. Codd (https://en.wikipedia.org/wiki/Edgar_F._Codd), an English computer scientist working for IBM, published his seminal paper, "A Relational Model of Data for Large Shared Data Banks", in 1970. It formed the basis for what we now know as the RDBMS (Relational Database Management System), using SQL (Structured Query Language), adopted by IBM, Relational Software (later Oracle), and Ingres (https://en.wikipedia.org/wiki/Ingres_(database)), a research project at the University of California, Berkeley. Ingres, in turn, spawned Postgres, Sybase, Microsoft SQL Server, and others.

The first version of MongoDB was introduced in 2009 by 10gen (https://en.wikipedia.org/wiki/MongoDB_Inc.#History) (later MongoDB Inc.) to address two pressing needs not met by the existing stable of RDBMS products, which were, for the most part, based on almost 50-year-old technology: handling big data and modeling objects. Initially proprietary, MongoDB was later released as open source.

DB-Engines (https://db-engines.com/en/ranking) provides up-to-date rankings of competing database systems. At the time of writing, MongoDB is in the Top 10, ranked fifth. Note, however, that MongoDB's score is 343.79, compared with 1,311.25 for the top-ranked system, Oracle.

Handling big data

One major problem faced by legacy RDBMS products is the difficulty of managing Big Data (https://en.wikipedia.org/wiki/Big_data). Examples include data produced by the NASA Center for Climate Change, the Human Genome Project, which analyzes strands of DNA, and the Sloan Digital Sky Survey, which collects astronomical data. Relational systems were designed to minimize the amount of storage used, because storage was an expensive resource 50 years ago. In the 21st century, storage costs have dropped dramatically, making this a secondary consideration. Relational systems also provide flexibility by creating relations between tables, which by its very nature introduces overhead, and that overhead is compounded when handling big data.

MongoDB addresses the needs of big data by incorporating modern algorithms such as MapReduce (https://en.wikipedia.org/wiki/MapReduce), which allows for parallel distributed processing on a cluster of servers. In addition, MongoDB has a feature referred to as sharding, which allows fragments of a database to be stored and processed on multiple servers.
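
To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not MongoDB's API and does not talk to a server: the sample readings, the `shard_for` hashing rule, and the two-shard setup are all assumptions made up for illustration. It only shows the general shape of the technique: documents are spread across shards by a key, each shard is mapped and reduced independently (the part that could run in parallel on separate servers), and the partial results are then merged.

```python
from collections import defaultdict

# Hypothetical sample data: each dict stands in for one stored document.
readings = [
    {"station": "A", "temp": 10},
    {"station": "B", "temp": 20},
    {"station": "A", "temp": 30},
    {"station": "C", "temp": 5},
    {"station": "B", "temp": 40},
]

NUM_SHARDS = 2  # assumed number of servers, chosen only for this sketch


def shard_for(doc, num_shards=NUM_SHARDS):
    """Pick a shard from the document's shard key (here, the station name).

    This is a deterministic toy rule, not how any real database hashes keys.
    """
    return sum(ord(ch) for ch in doc["station"]) % num_shards


def distribute(docs, num_shards=NUM_SHARDS):
    """Split the documents into per-shard lists."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc, num_shards)].append(doc)
    return shards


def map_phase(shard):
    """Emit (key, value) pairs for the documents held by one shard."""
    return [(doc["station"], doc["temp"]) for doc in shard]


def reduce_phase(pairs):
    """Sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


shards = distribute(readings)

# Each shard is processed on its own; on a cluster this step could run in parallel.
partials = [reduce_phase(map_phase(shard)) for shard in shards]

# Merge the per-shard results by reducing them once more.
totals = reduce_phase(
    [(key, value) for partial in partials for key, value in partial.items()]
)
print(totals)
```

Because all documents with the same station land on the same shard in this sketch, each shard could answer questions about its own stations without consulting the others; only the final merge needs to see every partial result.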

Although MongoDB is designed to handle big data, it is actually more of a general-purpose platform. If your only need is to handle big data, it might be worth your while to investigate Apache Cassandra (https://cassandra.apache.org/) with Hadoop (http://hadoop.apache.org/), which are expressly designed to handle massive amounts of data.

Modeling objects without SQL

A classic problem in object-oriented programming (OOP) code that requires database access is caused by the two-dimensional architecture of the traditional RDBMS. The two dimensions, rows and columns, are in turn grouped into tables, much like a legacy spreadsheet. To achieve a third dimension, you need to perform resource-intensive joins and form relationships between tables. Mapping programming object classes to such a database requires considerable programmatic gymnastics.

With MongoDB, there is no rigid database schema you must adhere to. Instead of rows you insert documents. A set of documents is referred to as a collection. Each document can directly model an object class, which in turn greatly facilitates the work of storing and retrieving from the database.
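
As a rough illustration of how a document can mirror an object, the following Python sketch converts an object with a nested object and a list into a single nested dictionary, which is the shape a document takes. The `Customer` and `Address` classes and all sample values are hypothetical, and the "collection" here is just a Python list standing in for the real thing; no database is involved.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Customer:
    name: str
    email: str
    address: Address                      # nested object, no join table needed
    tags: list = field(default_factory=list)


customer = Customer("Ada", "ada@example.com", Address("London", "UK"), ["vip"])

# asdict() recursively turns the object, including the nested Address,
# into plain dictionaries and lists: one self-contained document.
document = asdict(customer)

# A stand-in for a collection: documents need not share the same fields.
collection = [
    document,
    {"name": "Bob", "phone": "555-0100"},
]

print(document)
```

With a language driver such as PyMongo, a dictionary like `document` is the kind of value you would pass to a collection's `insert_one` method; that step is omitted here so the sketch runs without a server.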

MongoDB has its own rich query language, which can perform tasks similar to what the developer might expect from a legacy RDBMS using SQL. Because MongoDB does not use SQL, it is often referred to as a NoSQL database.
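
To give a feel for the difference, a SQL query such as SELECT * FROM products WHERE category = 'book' AND price > 20 would be expressed in MongoDB as a filter document along the lines of {category: "book", price: {$gt: 20}}. The Python sketch below is a toy matcher written for illustration only, not MongoDB's implementation: it supports just equality and the `$gt` and `$lt` operators, treats every dictionary-valued condition as a set of operators, and the `matches` and `find` helpers and the sample products are hypothetical.

```python
def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field_name, condition in query.items():
        value = document.get(field_name)
        if isinstance(condition, dict):
            # Operator form, for example {"$gt": 20}.
            for operator, operand in condition.items():
                if operator == "$gt":
                    if value is None or not value > operand:
                        return False
                elif operator == "$lt":
                    if value is None or not value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {operator}")
        elif value != condition:
            # Plain form: the field must equal the given value.
            return False
    return True


def find(collection, query):
    """Return the documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample collection.
products = [
    {"title": "Quick Start", "category": "book", "price": 25},
    {"title": "Pocket Guide", "category": "book", "price": 12},
    {"title": "Mug", "category": "gift", "price": 30},
    {"title": "Sticker", "category": "gift"},  # no price field at all
]

expensive_books = find(products, {"category": "book", "price": {"$gt": 20}})
print([doc["title"] for doc in expensive_books])
```

Note that the query is itself a document, so it can be built, stored, and passed around like any other data structure rather than assembled as a string.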

For an excellent introduction to NoSQL, its underlying philosophy, and its ramifications, see the NoSQL Guide (https://martinfowler.com/nosql.html) on Martin Fowler's website.