Ruby and MongoDB Web Development Beginner's Guide

By: Gautam Rege

Overview of this book

MongoDB is a high-performance, open source, schema-free, document-oriented database. Ruby is an object-oriented scripting language. Ruby and MongoDB are an ideal partnership for building scalable web applications.

Ruby and MongoDB Web Development Beginner's Guide is a fast-paced, hands-on guide to get started with web application development using Ruby and MongoDB. The book follows a practical approach, using clear, step-by-step instructions and examples in Ruby to demonstrate application development using MongoDB.

The book starts by introducing the concepts of MongoDB. It teaches everything from installation to creating objects, MongoDB internals, queries, and Ruby data mappers.

You will learn how to use various Ruby data mappers like Mongoid and MongoMapper to map Ruby objects to MongoDB documents.

You will learn MongoDB features and deal with geo-spatial indexing and scaling MongoDB.

With its coverage of concepts and practical examples, Ruby and MongoDB Web Development Beginner's Guide is the right choice for Ruby developers to get started with developing websites with MongoDB as the database.

Preface

And then there was light — a lightweight database! How often have we all wanted some database that was "just a data store"? Sure, you can use it in many complex ways but in the end, it's just a plain simple data store. Welcome MongoDB!

And then there was light: a lightweight language that was fun to program in. It supports all the constructs of a pure object-oriented language. Welcome Ruby!

Both MongoDB and Ruby are the fruits of people who wanted to simplify things in a complex world. Ruby, written by Yukihiro Matsumoto, picks the best constructs from Perl, Smalltalk, and Scheme. They say Matz (as he is lovingly called) "writes in C so that you don't have to". Ruby is an object-oriented programming language that can be summarized in one word: fun!

Note

It's interesting to know that Ruby was created as an "object-oriented scripting language". However, today Ruby can be compiled using JRuby or Rubinius, so we could call it a programming language.

MongoDB takes its name from the word "humongous" and has the primary goal of managing humongous data! As a NoSQL database, it relies heavily on data stored as key-value pairs.

Wait! Did we hear NoSQL (pronounced "No Sequel" or "No S-Q-L")? Yes! The roots of MongoDB lie in its data not having a structured format! Even before we dive into Ruby and MongoDB, it makes sense to understand some of these basic premises:

  • NoSQL

  • Brewer's CAP theorem

  • Basically Available, Soft-state, Eventually-consistent (BASE)

  • ACID or BASE

Understanding NoSQL

When the world was living in an age of SQL gurus and database administrators with expertise in stored procedures and triggers, a few brave men dared to rebel. The reason was "simplicity". SQL was good to use when there was a structure and a fixed set of rules. The common databases such as Oracle, SQL Server, MySQL, DB2, and PostgreSQL all promoted SQL along with referential integrity, consistency, and atomic transactions. One of the SQL-based rebels, SQLite, decided to be really "lite" and either ignored most of these constructs or did not enforce them, based on the premise: "Know what you are doing or beware".

Similarly, NoSQL is all about using simple keys to store data. Searching keys uses various hashing algorithms, but at the end of the day all we have is a simple data store!
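The idea of "simple keys to store data" can be sketched in a few lines of Ruby. This is only a toy illustration, not how MongoDB or any real NoSQL database is implemented; the class name TinyStore and its put and get methods are hypothetical names chosen for this sketch, and Ruby's built-in Hash does the key hashing.

```ruby
# A toy in-memory key-value store (hypothetical sketch, not a real database).
class TinyStore
  def initialize
    @data = {} # Ruby's Hash takes care of hashing the keys
  end

  # Store any value under a key and return the value.
  def put(key, value)
    @data[key] = value
  end

  # Fetch the value for a key, or nil when the key is absent.
  def get(key)
    @data[key]
  end
end

store = TinyStore.new
store.put("book:1", { name: "Oliver Twist", author: "Charles Dickens" })
store.put("votes:book:1", 42)

puts store.get("book:1")[:author]  # prints "Charles Dickens"
puts store.get("missing").inspect  # prints "nil"
```

Notice that the store neither knows nor cares about the shape of the values: one key holds a hash and another holds a number. That absence of a fixed structure is the trade-off described above.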

With the advent of web applications and crowd sourcing web portals, the mantra was "more scalable than highly available" and "more speed instead of consistency". Some web applications may be okay with these and others may not. What is important is that there is now a choice and developers can choose wisely!

It's interesting to note that "key-value pair" databases have existed from the early 80's — the earliest to my knowledge being Berkeley DB — blazingly fast, light-weight, and a very simple library to use.

Brewer's CAP theorem

Brewer's CAP theorem states that any distributed computer system can guarantee only two of consistency, availability, and partition tolerance at the same time.

  • Consistency means that every node sees the same data at the same time

  • Availability means that every request receives a response, even when some nodes have failed

  • Partition tolerance means that the system keeps working when the network splits the nodes into groups that cannot talk to each other; it is what makes distributed data, scaling, and replication possible

It is widely believed that any database can guarantee any two of the above. However, the essence of the CAP theorem is not to find a solution to have all three behaviors, but to allow us to look at designing databases differently based on the application we want to build!

For example, if you are building a Core Banking System (CBS), consistency and availability are extremely important. The CBS must guarantee these two at the cost of partition tolerance. Of course, a CBS has its failover systems, backup, and live replication to guarantee zero downtime, but at the cost of additional infrastructure and usually a single large instance of the database.

A heavily accessed information web portal with a large amount of data requires speed and scale, not strict consistency. Does the order of comments submitted at the same time really matter? What matters is how quickly and reliably the data is delivered. This is a clear case of availability and partition tolerance at the cost of consistency.

Note

An excellent article on the CAP theorem is at http://www.julianbrowne.com/article/viewer/brewers-cap-theorem.

What are BASE databases?

"Basically Available, Soft-state, Eventually-consistent"!!

Just as the name suggests a trade-off, BASE databases (yes, they are called BASE databases intentionally to mock ACID databases) use some tactics to achieve all three CAP properties "eventually". They do not really defy the CAP theorem but work around it.

Simply put: I can afford to let my database become consistent over time by synchronizing information between different database nodes. I can cache data (also called "soft-state") and persist it later to improve the response time of my database. I can have a number of database nodes with distributed data (partition tolerance) to be highly available, and any loss of connectivity to any node prompts other nodes to take over!
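The following Ruby sketch models this behavior in miniature. It is a simplified illustration under stated assumptions, not how MongoDB replicates data: the Node class and its write and sync_to methods are hypothetical names, and "synchronization" is just copying pending writes between in-memory hashes.

```ruby
# A toy model of soft state and eventual consistency (hypothetical sketch).
class Node
  attr_reader :data

  def initialize
    @data = {}    # values this node currently knows about
    @pending = [] # soft state: writes accepted but not yet propagated
  end

  # Accept a write immediately and remember it for later synchronization.
  def write(key, value)
    @data[key] = value
    @pending << [key, value]
  end

  # Copy the pending writes to the other nodes, then clear the queue.
  def sync_to(*others)
    @pending.each do |key, value|
      others.each { |node| node.data[key] = value }
    end
    @pending.clear
  end
end

primary = Node.new
replica = Node.new

primary.write("comments:42", "Nice post!")
stale = replica.data["comments:42"] # the replica has not caught up yet

primary.sync_to(replica)
fresh = replica.data["comments:42"] # consistent after synchronization

puts stale.inspect # prints "nil"
puts fresh         # prints "Nice post!"
```

Between the write and the sync, a reader hitting the replica sees stale data. A BASE database accepts that window in exchange for fast writes and availability.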

This does not mean that BASE databases are not prone to failure. It does imply however, that they can recover quickly and consistently. They usually reside on standard commodity hardware, thus making them affordable for most businesses!

A lot of databases on websites prefer speed, performance, and scalability instead of pure consistency and integrity of data. However, as the next topic will cover, it is important to know what to choose!

Using ACID or BASE?

"Atomic, Consistent, Isolated, and Durable" (ACID) is a cliched term used for transactional databases. ACID databases are still very popular today but BASE databases are catching up.

ACID databases are good to use when you have heavy transactions at the core of your business processes. But most applications can live without this complexity. This does not imply that BASE databases do not support transactions; it's just that ACID databases are better suited for them.

Choose a database wisely — an old man said rightly! A choice of a database can decide the future of your product. There are many databases today that we can choose from. Here are some basic rules to help choose between databases for web applications:

  • A large number of small writes (vote up/down) — Redis

  • Auto-completion, caching — Redis, memcached

  • Data mining, trending — MongoDB, Hadoop, and Bigtable

  • Content based web portals — MongoDB, Cassandra, and Sharded ACID databases

  • Financial Portals — ACID database

Using Ruby

So, if you are now convinced about MongoDB (or at least interested enough to read on), you might wonder where Ruby fits in. Ruby is one of the fastest-adopted languages among the new-age object-oriented languages. But the big differentiator is that it is a language that can be used, tweaked, and cranked in any way that you want, from writing sweet-smelling code to writing a domain-specific language (DSL)!

Ruby metaprogramming lets us easily adapt to new technologies, frameworks, APIs, and libraries. In fact, most new services today bundle a Ruby gem for easy integration.

There are many Ruby implementations available today (sometimes called Rubies), such as the original MRI, JRuby, Rubinius, MacRuby, MagLev, and Ruby Enterprise Edition. Each of them has a slightly different flavor, much like the different flavors of Linux.

I often have to "sell" Ruby to nontechnical or technically biased people. This simple experiment never fails:

When I code in Ruby, I can guarantee, "My grandmother can read my code". Can any other language guarantee that? The following is a simple code in C:

/* A simple snippet of code in C */
int i;
for (i = 0; i < 10; i++) {
    printf("Hi");
}

And now the same code in Ruby:

# The same snippet of code in Ruby
10.times do
  print "Hi"
end

There is no way that the Ruby code can be misinterpreted. I am not saying that you cannot write complex and complicated code in Ruby, but most code is simple to read and understand. Frameworks, such as Rails and Sinatra, use this feature to ensure that the code we see is readable! There is a lot of code under the cover which enables this though. For example, take a look at the following Ruby code:

# library.rb
class Library
  has_many :books
end

# book.rb
class Book
  belongs_to :library
end

It's quite understandable that "A library has many books" and that "A book belongs to a library".

The really fun part of working in Ruby (and Rails) is the finesse in the language. For example, in the small Rails code snippet we just saw, books is plural and library is singular. The framework infers the Book model from the symbol :books and infers the Library model from the symbol :library. It goes the distance to make code readable.
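To get a feel for the kind of code that sits under the cover, here is a deliberately simplified sketch of such macros in plain Ruby. It is not how Rails or Mongoid implement associations: the Associations module is a hypothetical name, and the rule of dropping a trailing "s" is a naive assumption that would break on plurals such as "libraries".

```ruby
# A simplified sketch of association macros (NOT the Rails implementation).
module Associations
  # has_many :books infers the class name "Book" by dropping the
  # trailing "s" (a naive rule assumed for this sketch).
  def has_many(name)
    associations[name] = name.to_s.chomp("s").capitalize
  end

  # belongs_to :library infers the class name "Library".
  def belongs_to(name)
    associations[name] = name.to_s.capitalize
  end

  # Per-class table of association name => inferred class name.
  def associations
    @associations ||= {}
  end
end

class Library
  extend Associations
  has_many :books
end

class Book
  extend Associations
  belongs_to :library
end

puts Library.associations[:books]  # prints "Book"
puts Book.associations[:library]   # prints "Library"

# The inferred name can be turned back into the actual class.
puts Object.const_get(Library.associations[:books]) == Book # prints "true"
```

Because has_many and belongs_to are ordinary class-level methods, the class body reads like a sentence while the bookkeeping stays hidden in the module.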

As a language, Ruby is free flowing with relaxed rules: you can define a method called true in your class that returns false! Ruby is a language where you do whatever you want as long as you know its impact. It's a human language and you can do the same thing in many different ways! There is no right or wrong way; there is only a more efficient way. Here is a simple example to demonstrate the power of Ruby! How do you calculate the sum of all the numbers in the array [1, 2, 3, 4, 5]?

The non-Ruby way of doing this in Ruby is:

sum = 0
for element in [1, 2, 3, 4, 5] do
  sum += element
end

The not-so-much-fun way of doing this in Ruby could be:

sum = 0
[1, 2, 3, 4, 5].each do |element|
  sum += element
end

The normal-fun way of doing this in Ruby is:

[1, 2, 3, 4, 5].inject(0) { |sum, element| sum + element }

Finally, the kick-ass way of doing this in Ruby is either one of the following:

[1, 2, 3, 4, 5].inject(&:+)
[1, 2, 3, 4, 5].reduce(:+)

There you have it! So many different ways of doing the same thing in Ruby, but notice how most Ruby code gets done in one line.
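As a quick check, the one-line variants can be run side by side. The snippet also uses Array#sum, which is an addition here and only exists in Ruby 2.4 and later, so it will not work on the Ruby 1.9 that this book targets.

```ruby
numbers = [1, 2, 3, 4, 5]

results = [
  numbers.inject(0) { |sum, element| sum + element },
  numbers.inject(&:+),
  numbers.reduce(:+),
  numbers.sum # Ruby 2.4 and later only
]

puts results.inspect # prints "[15, 15, 15, 15]"

# The variants differ on an empty array: without an initial value,
# inject has nothing to start from and returns nil.
empty_without_seed = [].inject(&:+)
empty_with_seed = [].inject(0) { |sum, element| sum + element }

puts empty_without_seed.inspect # prints "nil"
puts empty_with_seed            # prints "0"
```

If the array can ever be empty, passing the initial value 0 is the safer choice.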

Enjoy Ruby!

What this book covers

Chapter 1, Installing MongoDB and Ruby, describes how to install MongoDB on Linux and Mac OS. We shall learn about the various MongoDB utilities and their usage. We then install Ruby using RVM and also get a brief introduction to rbenv.

Chapter 2, Diving Deep into MongoDB, explains the various concepts of MongoDB and how it differs from relational databases. We learn various techniques, such as inserting and updating documents and searching for documents. We even get a brief introduction to Map/Reduce.

Chapter 3, MongoDB Internals, shares some details about what BSON is, usage of JavaScript, the global write lock, and why there are no joins or transactions supported in MongoDB. If you are a person in the fast lane, you can skip this chapter.

Chapter 4, Working Out Your Way with Queries, explains how we can query MongoDB documents and search inside different data types such as arrays, hashes, and embedded documents. We learn about the various query options and even regular expression based searching.

Chapter 5, Ruby DataMappers: Ruby and MongoDB Go Hand in Hand, provides details on how to use Ruby data mappers to query MongoDB. This is our first introduction to MongoMapper and Mongoid. We learn how to configure both of them, query using these data mappers, and even see some basic comparison between them.

Chapter 6, Modeling Ruby with Mongoid, introduces us to data models, Rails, Sinatra, and how we can model data using MongoDB data mappers. This is the core of the web application and we see various ways to model data, organize our code, and query using Mongoid.

Chapter 7, Achieving High Performance on Your Ruby Application with MongoDB, explains the importance of profiling and ensuring better performance right from the start of developing web applications using Ruby and MongoDB. We learn some best practices and concepts concerning the performance of web applications, tools, and methods which monitor the performance of our web application.

Chapter 8, Rack, Sinatra, Rails, and MongoDB — Making Use of them All, describes in detail how to build the full web application in Rails and Sinatra using Mongoid. We design the logical flow, the views, and even learn how to test our code and document it.

Chapter 9, Going Everywhere — Geospatial Indexing with MongoDB, helps us understand geolocation concepts. We learn how to set up geospatial indexes, get introduced to geocoding, and learn about geolocation spherical queries.

Chapter 10, Scaling MongoDB, provides details on how we scale MongoDB using replica sets. We learn about sharding, replication, and how we can improve performance using MongoDB map/reduce.

Appendix, Pop Quiz Answers, provides answers to the quizzes present at the end of chapters.

What you need for this book

This book requires the following:

  • MongoDB version 2.0.2 or later

  • Ruby version 1.9 or later

  • RVM (for Linux and Mac OS only)

  • DevKit (for Windows only)

  • MongoMapper

  • Mongoid

And other gems, of which I will inform you as we need them!

Who this book is for

This book assumes that you are experienced in Ruby and have web development skills such as HTML and CSS. Knowledge of NoSQL will help you get through the concepts quicker, but it is not mandatory. No prior knowledge of MongoDB is required.

Conventions

In this book, you will find several headings appearing frequently.

To give clear instructions of how to complete a procedure or task, we use:

Time for action — heading

  1. Action 1

  2. Action 2

  3. Action 3

Instructions often need some extra explanation so that they make sense, so they are followed with:

What just happened?

This heading explains the working of tasks or instructions that you have just completed.

You will also find some other learning aids in the book, including:

Pop quiz — heading

These are short multiple choice questions intended to help you test your own understanding.

Have a go hero — heading

These set practical challenges and give you ideas for experimenting with what you have learned.

You will also find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

book = {
  name: "Oliver Twist",
  author: "Charles Dickens",
  publisher: "Dover Publications",
  published_on: "December 30, 2002",
  category: ['Classics', 'Drama']
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

function(key, values) {
  var result = {votes: 0};
  values.forEach(function(value) {
    result.votes += value.votes;
  });
  return result;
}

Any command-line input or output is written as follows:

$ curl -L get.rvm.io | bash -s stable

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "clicking the Next button moves you to the next screen".

Note

Warnings or important notes appear in a box like this.

Note

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to , and mention the book title through the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website, or added to any list of existing errata, under the Errata section of that title.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at if you are having a problem with any aspect of the book, and we will do our best to address it.