Ruby and MongoDB Web Development Beginner's Guide

By : Gautam Rege

Overview of this book

MongoDB is a high-performance, open source, schema-free, document-oriented database. Ruby is an object-oriented scripting language. Ruby and MongoDB are an ideal partnership for building scalable web applications. Ruby and MongoDB Web Development Beginner's Guide is a fast-paced, hands-on guide to get started with web application development using Ruby and MongoDB. The book follows a practical approach, using clear, step-by-step instructions and examples in Ruby to demonstrate application development using MongoDB. The book starts by introducing the concepts of MongoDB. It then covers everything from installation to creating objects, MongoDB internals, queries, and Ruby data mappers. You will learn how to use Ruby data mappers such as Mongoid and MongoMapper to map Ruby objects to MongoDB documents. You will also learn MongoDB features such as geospatial indexing and how to scale MongoDB. With its coverage of concepts and practical examples, Ruby and MongoDB Web Development Beginner's Guide is the right choice for Ruby developers to get started with developing websites with MongoDB as the database.

Credits

About the Author

Acknowledgement

About the Reviewers

www.PacktPub.com

Preface

Installing MongoDB and Ruby

Installing Ruby

Installing MongoDB

Configuring the MongoDB server

Starting MongoDB

Stopping MongoDB

The MongoDB CLI

Installing Rails/Sinatra

Summary

Diving Deep into MongoDB

Creating documents

Time for action — creating our first document

Using MongoDB embedded documents

Time for action — embedding reviews and votes

Using MongoDB document relationships

Time for action — creating document relations

Comparing MongoDB versus SQL syntax

Using Map/Reduce instead of join

Time for action — writing the map function for calculating vote statistics

Time for action — writing the reduce function to process emitted information

Understanding the Ruby perspective

Time for action — creating the project

Time for action — start your engines

Time for action — configuring Mongoid

Time for action — planning the object schema

Time for action — putting it all together

Time for action — adding reviews to books

Time for action — embedding Lease and Purchase models

Time for action — writing the map function to calculate ratings

Time for action — writing the reduce function to process the emitted results

Time for action — working with Map/Reduce using Ruby

Summary

MongoDB Internals

Understanding Binary JSON

What is ObjectId?

Documents and collections

JavaScript and MongoDB

Time for action — writing our own custom functions in MongoDB

Ensuring write consistency or "read your writes"

Global write lock

Transactional support in MongoDB

Time for action — implementing optimistic locking

Why are there no joins in MongoDB?

Summary

Working Out Your Way with Queries

Searching by fields in a document

Time for action — searching by a string value

Time for action — fetching only for specific fields

Time for action — skipping documents and limiting our search results

Time for action — finding books by name or publisher

Time for action — finding the highly ranked books

Searching inside arrays

Time for action — searching inside reviews

Searching inside hashes

Searching inside embedded documents

Searching with regular expressions

Time for action — using regular expression searches

Summary

Ruby DataMappers: Ruby and MongoDB Go Hand in Hand

Why do we need Ruby DataMappers

Time for action — using mongo gem

The Ruby DataMappers for MongoDB

Setting up DataMappers

Time for action — configuring MongoMapper

Time for action — setting up Mongoid

Creating, updating, and destroying documents

Time for action — creating and updating objects

Using finder methods

Using MongoDB criteria

Time for action — fetching using the where criterion

Understanding model relationships

Time for action — relating models

Time for action — categorizing books

Time for action — adding book details

Time for action — managing the driver entities

Time for action — creating vehicles using basic polymorphism

Using embedded objects

Time for action — creating embedded objects

Reverse embedded relations in Mongoid

Time for action — using embeds_one without specifying embedded_in

Time for action — using embeds_many without specifying embedded_in

Understanding embedded polymorphism

Time for action — adding licenses to drivers

Time for action — insuring drivers

Choosing whether to embed or to associate documents

Mongoid or MongoMapper — the verdict

Summary

Modeling Ruby with Mongoid

Developing a web application with Mongoid

Time for action — setting up a Rails project

Time for action — using Sinatra professionally

Defining attributes in models

Time for action — adding dynamic fields

Time for action — localizing fields

Using arrays and hashes in models

Defining relations in models

Time for action — configuring the many-to-many relation

Time for action — setting up the following and followers relationship

Time for action — setting up cyclic relations

Managing changes in models

Time for action — changing models

Mixing in Mongoid modules

Time for action — getting paranoid

Time for action — including a version

Summary

Achieving High Performance on Your Ruby Application with MongoDB

Profiling MongoDB

Time for action — enabling profiling for MongoDB

Using the explain function

Time for action — explaining a query

Using covered indexes

Time for action — using covered indexes

Other MongoDB performance tuning techniques

Understanding web application performance

Optimizing our code for performance

Optimizing and tuning the web application stack

Summary

Rack, Sinatra, Rails, and MongoDB — Making Use of them All

Revisiting Sodibee

The Rails way

Time for action — modeling the Author class

Time for action — writing the Book, Category and Address models

Time for action — modeling the Order class

Time for action — configuring routes

Time for action — writing the AuthorsController

Time for action — designing the layout

Time for action — listing authors

Time for action — adding new authors and books

The Sinatra way

Time for action — setting up Sinatra and Rack

Testing and automation using RSpec

Time for action — installing RSpec

Time for action — sporking it

Documenting code using YARD

Summary

Going Everywhere — Geospatial Indexing with MongoDB

What is geolocation

Identifying the exact geolocation

Storing coordinates in MongoDB

Time for action — geocoding the Address model

Time for action — saving geolocation coordinates

Time for action — using geocoder for storing coordinates

Firing geolocation queries

Time for action — finding nearby addresses

Time for action — firing near queries in Mongoid

Summary

Scaling MongoDB

High availability and failover via replication

Time for action — setting up the master/slave replication

Time for action — implementing replica sets

Implementing replica sets for Sodibee

Time for action — configuring replica sets for Sodibee

Implementing sharding

Time for action — setting up the shards

Time for action — starting the config server

Time for action — setting up mongos

Implementing Map/Reduce

Time for action — planning the Map/Reduce functionality

Time for action — Map/Reduce via the mongo console

Time for action — Map/Reduce via Ruby

Time for action — iterating Ruby objects

Summary

Pop quiz — Answers

Chapter 2: Diving Deep into MongoDB

Chapter 3: MongoDB Internals

Chapter 4: Working out your Way with Queries

Chapter 5: Ruby DataMappers: Ruby and MongoDB Go Hand in Hand

Chapter 6: Modeling Ruby with Mongoid

Chapter 8: Rack, Sinatra, Rails and MongoDB - Making use of them all

Chapter 10: Scaling MongoDB

Preface

And then there was light: a lightweight database! How often have we all wanted some database that was "just a data store"? Sure, you can use it in many complex ways but in the end, it's just a plain simple data store. Welcome MongoDB!

And then there was light: a lightweight language that is fun to program in and supports all the constructs of a pure object-oriented language. Welcome Ruby!

Both MongoDB and Ruby are the fruits of people who wanted to simplify things in a complex world. Ruby, written by Yukihiro Matsumoto, picks the best constructs from Perl, Smalltalk, and Scheme. They say Matz (as he is lovingly called) "writes in C so that you don't have to". Ruby is an object-oriented programming language that can be summarized in one word: fun!

Note

It's interesting to know that Ruby was created as an "object-oriented scripting language". However, today Ruby can be compiled using JRuby or Rubinius, so we could call it a programming language.

MongoDB takes its name from the word "humongous", and its primary goal is to manage humongous data! As a NoSQL database, it stores data as documents built from key-value pairs.

Wait! Did we hear NoSQL (pronounced either "No Sequel" or "No S-Q-L")? Yes! The roots of MongoDB lie in its data not having a rigid, predefined structure! Before we dive into Ruby and MongoDB, it makes sense to understand some of these basic premises:

NoSQL
Brewer's CAP theorem
Basically Available, Soft-state, Eventually-consistent (BASE)
ACID or BASE

Understanding NoSQL

When the world was living in an age of SQL gurus and Database Administrators with expertise in stored procedures and triggers, a few brave men dared to rebel. The reason was "simplicity". SQL was good to use when there was a structure and a fixed set of rules. The common databases such as Oracle, SQL Server, MySQL, DB2, and PostgreSQL all promoted SQL along with referential integrity, consistency, and atomic transactions. One of the SQL-based rebels, SQLite, decided to be really "lite" and either ignored most of these constructs or did not enforce them, based on the premise: "Know what you are doing or beware".

Similarly, NoSQL is all about using simple keys to store data. Looking up a key relies on one of various hashing algorithms, but at the end of the day all we have is a simple data store!
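To make the "simple data store" idea concrete, here is a minimal sketch of an in-memory key-value store built on Ruby's Hash. The class name TinyStore and its methods are made up for illustration; this is not how MongoDB or any real NoSQL database is implemented.

```ruby
# A toy in-memory key-value store (illustrative only).
class TinyStore
  def initialize
    @data = {} # Ruby's Hash hashes each key to locate its value
  end

  # Store a value under a key, overwriting any previous value.
  def put(key, value)
    @data[key] = value
  end

  # Fetch the value for a key, or nil when the key is absent.
  def get(key)
    @data[key]
  end

  # Remove a key and return the value it held (nil if absent).
  def delete(key)
    @data.delete(key)
  end
end

store = TinyStore.new
store.put("book:1", { name: "Oliver Twist", author: "Charles Dickens" })
store.get("book:1")[:author] # "Charles Dickens"
store.get("book:2")          # nil, no such key
```

Notice that the store knows nothing about the shape of the values it holds. That is the trade being described: no schema and no joins, just fast lookups by key.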

With the advent of web applications and crowd sourcing web portals, the mantra was "more scalable than highly available" and "more speed instead of consistency". Some web applications may be okay with these and others may not. What is important is that there is now a choice and developers can choose wisely!

It's interesting to note that "key-value pair" databases have existed for decades. The dbm library dates from 1979, and Berkeley DB, which followed in the early 1990s, is blazingly fast, lightweight, and a very simple library to use.

Brewer's CAP theorem

Brewer's CAP theorem states that any distributed computer system can guarantee only two of the following three properties at the same time: consistency, availability, and partition tolerance.

Consistency means that every node returns the same, most recent data for a given read
Availability means that every request receives a response, even when some nodes are down
Partition tolerance means that the system keeps working when the network splits the nodes into groups that cannot talk to each other

It is widely accepted that a distributed database can guarantee any two of the above. However, the essence of the CAP theorem is not to find a solution to have all three behaviors, but to allow us to look at designing databases differently based on the application we want to build!

For example, if you are building a Core Banking System (CBS), consistency and availability are extremely important. The CBS must guarantee these two at the cost of partition tolerance. Of course, a CBS has its failover systems, backup, and live replication to guarantee zero downtime, but at the cost of additional infrastructure and usually a single large instance of the database.

A heavily accessed information web portal with a large amount of data requires speed and scale, not strict consistency. Does the order of comments submitted at the same time really matter? What matters is how quickly and reliably the data is delivered. This is a clear case of availability and partition tolerance at the cost of consistency.

Note

An excellent article on the CAP theorem is at http://www.julianbrowne.com/article/viewer/brewers-cap-theorem.

What are BASE databases?

"Basically Available, Soft-state, Eventually-consistent"!!

Just as the name suggests a trade-off, BASE databases (yes, they are called BASE databases intentionally to mock ACID databases) use some tactics to achieve consistency, availability, and partition tolerance "eventually". They do not really defy the CAP theorem but work around it.

Simply put: I can afford my database to be consistent over time by synchronizing information between different database nodes. I can cache data (also called "soft-state") and persist it later to improve the response time of my database. I can have a number of database nodes with distributed data (partition tolerance) to be highly available, and any loss of connectivity to any node prompts other nodes to take over!
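The idea of becoming "consistent over time" can be sketched with a toy model. The Node class and its sync method below are hypothetical names invented for this illustration; real databases replicate over the network and must also resolve conflicting writes, which this sketch ignores.

```ruby
# Toy model of eventual consistency: a write lands on one node and
# reaches the others only when sync runs.
class Node
  attr_reader :data, :pending

  def initialize
    @data = {}    # what this node currently serves to readers
    @pending = [] # writes accepted here but not yet sent to peers
  end

  # Accept a write locally and queue it for later replication.
  def write(key, value)
    @data[key] = value
    @pending << [key, value]
  end

  def read(key)
    @data[key]
  end

  # Push queued writes to the given peers, then clear the queue.
  def sync(peers)
    @pending.each do |key, value|
      peers.each { |peer| peer.data[key] = value }
    end
    @pending.clear
  end
end

a = Node.new
b = Node.new
a.write(:votes, 42)
stale = b.read(:votes) # nil: b has not heard about the write yet
a.sync([b])
fresh = b.read(:votes) # 42: the two nodes now agree
```

Between the write and the sync, a reader hitting node b sees old data. That window is the price paid for answering writes quickly on a single node.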

This does not mean that BASE databases are not prone to failure. It does imply however, that they can recover quickly and consistently. They usually reside on standard commodity hardware, thus making them affordable for most businesses!

A lot of databases on websites prefer speed, performance, and scalability instead of pure consistency and integrity of data. However, as the next topic will cover, it is important to know what to choose!

Using ACID or BASE?

"Atomic, Consistent, Isolated, and Durable" (ACID) is a cliched term used for transactional databases. ACID databases are still very popular today but BASE databases are catching up.

ACID databases are good to use when you have heavy transactions at the core of your business processes. But most applications can live without this complexity. This does not imply that BASE databases do not support transactions; it's just that ACID databases are better suited for them.

Choose a database wisely — an old man said rightly! A choice of a database can decide the future of your product. There are many databases today that we can choose from. Here are some basic rules to help choose between databases for web applications:

A large number of small writes (vote up/down) — Redis
Auto-completion, caching — Redis, memcached
Data mining, trending — MongoDB, Hadoop, and Bigtable
Content based web portals — MongoDB, Cassandra, and sharded ACID databases
Financial Portals — ACID database

Using Ruby

So, if you are now convinced (or at least interested enough to read on about MongoDB), you might wonder where Ruby fits in. Ruby is one of the most rapidly adopted of the new-age object-oriented languages. But the big differentiator is that it is a language that can be used, tweaked, and cranked in any way that you want, from writing sweet-smelling code to writing a domain-specific language (DSL)!

Ruby metaprogramming lets us easily adapt to any new technology, framework, API, or library. In fact, most new services today bundle a Ruby gem for easy integration.
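As a small taste of what metaprogramming means in practice, the sketch below generates methods at class-definition time with define_method. The Order class and its status names are invented for this example.

```ruby
# Generate one predicate method per status instead of writing each by hand.
class Order
  STATUSES = %w[pending shipped delivered].freeze

  attr_accessor :status

  def initialize(status)
    @status = status
  end

  STATUSES.each do |name|
    # Defines pending?, shipped? and delivered? on Order.
    define_method("#{name}?") { status == name }
  end
end

order = Order.new("shipped")
order.shipped?   # true
order.pending?   # false
```

Adding a new status to the STATUSES list is enough to get a matching predicate method, which is the kind of adaptability that gems and frameworks lean on.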

There are many Ruby implementations available today (sometimes called Rubies), such as the original MRI, JRuby, Rubinius, MacRuby, MagLev, and Ruby Enterprise Edition. Each of them has a slightly different flavor, much like the different flavors of Linux.

I often have to "sell" Ruby to nontechnical or technically biased people. This simple experiment never fails:

When I code in Ruby, I can guarantee, "My grandmother can read my code". Can any other language guarantee that? The following is a simple piece of code in C:

/* A simple snippet of code in C */
int i;
for (i = 0; i < 10; i++) {
    printf("hi");
}

And now the same code in Ruby:

# The same snippet of code in Ruby
10.times do
  print "hi"
end

There is no way that the Ruby code can be misinterpreted. Yes, I am not saying that you cannot write complex and complicated code in Ruby, but most code is simple to read and understand. Frameworks, such as Rails and Sinatra, use this feature to ensure that the code we see is readable! There is a lot of code under the cover which enables this though. For example, take a look at the following Ruby code:

# library.rb
class Library
  has_many :books
end

# book.rb
class Book
  belongs_to :library
end

It's quite understandable that "A library has many books" and that "A book belongs to a library".

The really fun part of working in Ruby (and Rails) is the finesse in the language. For example, in the small Rails code snippet we just saw, books is plural and library is singular. The framework infers the Book model from the symbol :books and the Library model from the symbol :library. It goes the distance to make code readable.
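How might a framework get from a symbol to a class? The following is a deliberately naive sketch in plain Ruby. The TinyRelations module and the generated books_class and library_class methods are hypothetical, and this is not the code Rails or Mongoid actually uses; real inflection handles far more than stripping a trailing "s".

```ruby
# Naive sketch: turn a relation symbol into the class it names.
module TinyRelations
  def has_many(symbol)
    class_name = symbol.to_s.sub(/s\z/, "").capitalize # :books -> "Book"
    # Resolve the constant lazily, so Book may be defined after Library.
    define_singleton_method("#{symbol}_class") { Object.const_get(class_name) }
  end

  def belongs_to(symbol)
    class_name = symbol.to_s.capitalize # :library -> "Library"
    define_singleton_method("#{symbol}_class") { Object.const_get(class_name) }
  end
end

class Library
  extend TinyRelations
  has_many :books
end

class Book
  extend TinyRelations
  belongs_to :library
end

Library.books_class # the Book class
Book.library_class  # the Library class
```

The lookup is done when the generated method is called rather than when has_many runs, which is why Library can refer to Book before Book has been defined.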

As a language, Ruby is free-flowing with relaxed rules: you can define a method called true in your class that returns false! Ruby is a language where you do whatever you want as long as you know its impact. It's a human language and you can do the same thing in many different ways! There is no right or wrong way; there is only a more efficient way. Here is a simple example to demonstrate the power of Ruby! How do you calculate the sum of all the numbers in the array [1, 2, 3, 4, 5]?

The non-Ruby way of doing this in Ruby is:

sum = 0
for element in [1, 2, 3, 4, 5] do
  sum += element
end

The not-so-much-fun way of doing this in Ruby could be:

sum = 0
[1, 2, 3, 4, 5].each do |element|
  sum += element
end

The normal-fun way of doing this in Ruby is:

[1, 2, 3, 4, 5].inject(0) { |sum, element| sum + element }

Finally, the kick-ass way of doing this in Ruby is either one of the following:

[1, 2, 3, 4, 5].inject(&:+)
[1, 2, 3, 4, 5].reduce(:+)

There you have it! So many different ways of doing the same thing in Ruby, but notice how most Ruby code gets done in one line.
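If you want to convince yourself that these approaches agree, a quick check along the following lines should do. The variable names are ours, and the last entry uses Array#sum, which is only available from Ruby 2.4 onwards and so is newer than the Ruby 1.9 this book targets.

```ruby
numbers = [1, 2, 3, 4, 5]

loop_sum = 0
numbers.each { |element| loop_sum += element }

results = [
  loop_sum,
  numbers.inject(0) { |sum, element| sum + element },
  numbers.inject(&:+),
  numbers.reduce(:+),
  numbers.sum # Array#sum, Ruby 2.4 and later
]

results.uniq # [15], every approach gives the same answer

# One difference worth knowing: without a starting value,
# inject returns nil for an empty array, while sum returns 0.
empty_inject = [].inject(:+) # nil
empty_sum = [].sum           # 0
```

The empty-array case is a reason to prefer inject(0) or sum over the shortest form when the input might be empty.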

Enjoy Ruby!

What this book covers

Chapter 1, Installing MongoDB and Ruby, describes how to install MongoDB on Linux and Mac OS. We shall learn about the various MongoDB utilities and their usage. We then install Ruby using RVM and also get a brief introduction to rbenv.

Chapter 2, Diving Deep into MongoDB, explains the various concepts of MongoDB and how it differs from relational databases. We learn various techniques, such as inserting and updating documents and searching for documents. We even get a brief introduction to Map/Reduce.

Chapter 3, MongoDB Internals, shares some details about what BSON is, usage of JavaScript, the global write lock, and why there are no joins or transactions supported in MongoDB. If you are a person in the fast lane, you can skip this chapter.

Chapter 4, Working Out Your Way with Queries, explains how we can query MongoDB documents and search inside different data types such as arrays, hashes, and embedded documents. We learn about the various query options and even regular expression based searching.

Chapter 5, Ruby DataMappers: Ruby and MongoDB Go Hand in Hand, provides details on how to use Ruby data mappers to query MongoDB. This is our first introduction to MongoMapper and Mongoid. We learn how to configure both of them, query using these data mappers, and even see some basic comparison between them.

Chapter 6, Modeling Ruby with Mongoid, introduces us to data models, Rails, Sinatra, and how we can model data using MongoDB data mappers. This is the core of the web application and we see various ways to model data, organize our code, and query using Mongoid.

Chapter 7, Achieving High Performance on Your Ruby Application with MongoDB, explains the importance of profiling and ensuring better performance right from the start of developing web applications using Ruby and MongoDB. We learn some best practices and concepts concerning the performance of web applications, tools, and methods which monitor the performance of our web application.

Chapter 8, Rack, Sinatra, Rails, and MongoDB — Making Use of them All, describes in detail how to build the full web application in Rails and Sinatra using Mongoid. We design the logical flow, the views, and even learn how to test our code and document it.

Chapter 9, Going Everywhere — Geospatial Indexing with MongoDB, helps us understand geolocation concepts. We learn how to set up geospatial indexes, get introduced to geocoding, and learn about geolocation spherical queries.

Chapter 10, Scaling MongoDB, provides details on how we scale MongoDB using replica sets. We learn about sharding, replication, and how we can improve performance using MongoDB map/reduce.

Appendix, Pop Quiz Answers, provides answers to the quizzes present at the end of chapters.

What you need for this book

This book requires the following:

MongoDB version 2.0.2 or later
Ruby version 1.9 or later
RVM (for Linux and Mac OS only)
DevKit (for Windows only)
MongoMapper
Mongoid

And other gems, of which I will inform you as we need them!

Who this book is for

This book assumes that you are experienced in Ruby and have web development skills in HTML and CSS. Having knowledge of NoSQL will help you get through the concepts quicker, but it is not mandatory. No prior knowledge of MongoDB is required.

Conventions

In this book, you will find several headings appearing frequently.

To give clear instructions of how to complete a procedure or task, we use:

Time for action — heading

1. Action 1
2. Action 2
3. Action 3

Instructions often need some extra explanation so that they make sense, so they are followed with:

What just happened?

This heading explains the working of tasks or instructions that you have just completed.

You will also find some other learning aids in the book, including:

Pop quiz — heading

These are short multiple choice questions intended to help you test your own understanding.

Have a go hero — heading

These set practical challenges and give you ideas for experimenting with what you have learned.

You will also find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

book = {
  name: "Oliver Twist",
  author: "Charles Dickens",
  publisher: "Dover Publications",
  published_on: "December 30, 2002",
  category: ['Classics', 'Drama']
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

function(key, values) {
  var result = {votes: 0};
  values.forEach(function(value) {
    result.votes += value.votes;
  });
  return result;
}

Any command-line input or output is written as follows:

$ curl -L get.rvm.io | bash -s stable

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "clicking the Next button moves you to the next screen".

Note

Warnings or important notes appear in a box like this.

Note

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website, or added to any list of existing errata, under the Errata section of that title.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
