Cloning Internet Applications with Ruby

By: Chang Sau Sheong

Overview of this book

Most Internet users have a few favorite web applications that they use often and cannot do without. These popular applications provide essential services that we depend on even when we don’t fully understand their features or how they work. Ruby empowers you to develop your own clones of such applications without much ordeal. Learning how these sites work and how they can be implemented enables you to move on to the next step: customizing them and launching your own version of these services.

This book shows the reader how to clone some of the Internet's most popular applications in Ruby by first identifying their main features, and then showing example Ruby code to replicate this functionality.

While we understand that a social network connects us to our friends and to people we want to meet, what are the common features that make it a social network? And how do these features work? This book answers these questions. It provides a step-by-step explanation of how each application is designed and coded, and then how it is deployed to the Heroku cloud platform. The book’s main purpose is to break up popular Internet services such as TinyURL, Twitter, Flickr, and Facebook to understand what makes them tick. Then, using Ruby, the book describes how a minimal set of features for these sites can be modeled, built, and deployed on the Internet.
Table of Contents (13 chapters)

Technologies used


The technology stack used in this book consists mainly of Ruby-only libraries and tools:

  • Sinatra—a Ruby domain-specific language (DSL) with a minimalist approach in building web applications

  • DataMapper—a Ruby object-relational mapping library

  • Haml—a Ruby-friendly markup language that allows us to manipulate XHTML of any web document programmatically

We will go into each of these technologies in some depth. While this may seem like a lot to cover within a single chapter, none of these technologies is essentially complex. Once you have grasped the basics of each, a quick look at the documentation will allow you to do anything you want.

Sinatra

Sinatra is a domain-specific language built with Ruby, used to build web applications. Sinatra was created with a minimalist approach in mind and focuses on the fastest way to get a web application up and running. For example, you can create a simple web application with just the following in a file named hello.rb:

require 'rubygems'
require 'sinatra'

get '/' do
  "Hello world, it's #{Time.now} at the server!"
end

After that just run the following command:

ruby hello.rb

Then go to http://localhost:4567/ and you will see the hello statement with the current time. Up to this stage, writing a web application is almost trivial. Of course, as web applications become more complex you will need to write more code than you would with full-fledged web frameworks such as Ruby on Rails or Merb.

As mentioned earlier, one of the reasons we chose Sinatra is its simplicity and minimalist approach. In a book that teaches how application features can be implemented, more complex frameworks can add clutter because of 'the way the framework works' rather than clarifying the implementation of the feature. As a result, a DSL such as Sinatra, where nothing is taken for granted, is very useful as a teaching tool.

Installing

Sinatra can be easily installed through Rubygems:

$ sudo gem install sinatra

That's all there is to it. You will be able to use Sinatra immediately after that.

Routes

In Sinatra, a route is an HTTP method paired with a URL-matching pattern. For example, this is a route:

get '/' do
  ...
end

And so are these:

post '/some_url' do
  ...
end
put '/another_url' do
  ...
end
delete '/any_url' do
  ...
end

Whenever an HTTP request comes in, it is matched against the routes in the order they are defined, and the first matching route is invoked. For example, if a POST request is made to http://localhost:4567/some_url, the some_url route will be invoked. The route pattern matching includes named parameters, for example:

get '/hello/:name' do
  "Hello #{params[:name]}!"
end
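To see how a pattern such as /hello/:name can capture a value, and why the order of definition matters, consider the following plain-Ruby sketch. It only illustrates the idea and is not how Sinatra is actually implemented; compile_pattern, match_route, and the routes list are hypothetical names invented for this example.

```ruby
# Hypothetical helper: turn '/hello/:name' into a regular expression
# plus the list of parameter names it captures.
def compile_pattern(pattern)
  keys = []
  source = pattern.gsub(/:(\w+)/) do
    keys << $1.to_sym
    '([^/]+)'
  end
  [Regexp.new("\\A#{source}\\z"), keys]
end

# Check the routes in the order they were defined; the first match wins.
def match_route(routes, verb, path)
  routes.each do |route_verb, pattern|
    next unless route_verb == verb
    regex, keys = compile_pattern(pattern)
    match = regex.match(path)
    return [pattern, Hash[keys.zip(match.captures)]] if match
  end
  nil
end

routes = [
  ['GET', '/hello/:name'],
  ['GET', '/hello/world'] # never reached: the route above matches first
]

pattern, params = match_route(routes, 'GET', '/hello/world')
puts "#{pattern} matched with name=#{params[:name]}"
```

Because the more general pattern is listed first, it shadows the more specific one. This is why more specific routes are usually defined before general ones.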

Patterns may also include other matching conditions such as user agents. This is useful if we want to determine the type of device that is accessing the application. For example, an iPhone sends a user agent string such as the following:

Mozilla/5.0 (iPhone; U; CPU iPhone OS 2_0 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML, like Gecko) Version/3.1.1 Mobile/1A543 Safari/525.20

If we create an iPhone web application, we can match on that user agent:

get '/hello', :agent => /iPhone/ do
  "You are using an iPhone!"
end

GET and POST routes are quite simply implemented above, but how about PUT and DELETE? These two methods are not natively supported in HTML forms by most browsers, but this can be worked around using a POST. If you set up an HTML form that sends a POST with a hidden element named '_method' whose value is 'put' or 'delete', Sinatra will interpret the request accordingly and invoke the correct route.

For example:

<form method="post" action="/destroy">
  <input type="hidden" name="_method" value="delete" />
  <button type="submit">Destroy</button>
</form>

The above code will invoke this route:

delete '/destroy' do
  ...
end
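The override rule itself is small enough to sketch in plain Ruby. The following is only a model of the behavior described above, not Sinatra's own code; effective_method and overridable_methods are hypothetical names, and the assumption is that only a POST may be rewritten and only to PUT or DELETE.

```ruby
# Hypothetical list of verbs a form is allowed to switch to.
def overridable_methods
  %w(PUT DELETE)
end

# Decide which route verb a request should be dispatched to.
# Only a POST carrying a recognized '_method' value is rewritten.
def effective_method(request_method, params)
  return request_method unless request_method == 'POST'
  override = params['_method'].to_s.upcase
  overridable_methods.include?(override) ? override : request_method
end

puts effective_method('POST', '_method' => 'delete')
```

Restricting the override to POST requests means that a plain link (a GET) cannot be turned into a destructive request just by adding a query parameter.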

Splitting a route into multiple files

Sinatra looks very good and simple if we're writing simple web applications with only a few routes, but what if the application is much larger? Managing all those routes in a single file becomes unwieldy. Remember that Sinatra is all-Ruby, so you can use load to bring in other files that contain routes. This way you can make your application more modular by placing related routes in the same file.

%w(photos users helpers).each {|feature| load "#{feature}.rb"}

In the example code snippet above, we have three files named photos.rb, users.rb, and helpers.rb in which we place related routes. This helps us to include features that we want and potentially to remove features we do not want by changing the list. The code snippet above would then be placed in the main file such as myapp.rb.

Redirection

Sometimes within a route you want to redirect the user somewhere else, either to another route or to an external site. This can be done using the redirect helper, for example:

redirect '/'  
redirect 'http://www.google.com'

The redirect actually sends back a 302 Found HTTP status code to the browser and tells the browser where to go next. To force Sinatra to send a different status code, just add the status code to the redirection helper.

redirect '/', 303
redirect '/', 307

Note that this sends the browser to another route or site and not to a view.
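A redirect is nothing more than a status code plus a Location header. The following sketch builds such a response as a Rack-style array of status, headers, and body. The name redirect_response is hypothetical and used only for illustration; it is not Sinatra's own helper.

```ruby
# Hypothetical stand-in for the redirect helper: build a Rack-style
# [status, headers, body] triple. 302 is the default, as described above.
def redirect_response(location, status = 302)
  [status, { 'Location' => location }, []]
end

status, headers, _body = redirect_response('/')
puts "#{status} -> #{headers['Location']}"
```

The body is empty because the browser is expected to follow the Location header rather than display anything.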

Filters

Sinatra has a simple filtering mechanism. If you define a before filter, it will be invoked every time before a route is invoked.

before do
  ...
end

This becomes especially useful in securing routes because we can check if the user has access to that route before it is invoked. Any instance variables defined in the before filter will be available to the route and the views subsequently.

Similarly, if you define an after filter, it will be invoked every time after a route is invoked.

after do
  ...
end

As with the before filter, you can modify the instance variables that go to the view. You can also modify the response.
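Filters can share instance variables with routes because they all run against the same object. The following plain-Ruby miniature shows that idea under simplifying assumptions; FilterDemo and its methods are hypothetical and do not reflect Sinatra's real internals.

```ruby
# Hypothetical miniature of a filter chain.
class FilterDemo
  def initialize
    @before_filters = []
    @after_filters  = []
  end

  def before(&block)
    @before_filters << block
  end

  def after(&block)
    @after_filters << block
  end

  # Run every before filter, then the route body, then every after filter,
  # all against one context object so that instance variables are shared.
  def call(&route)
    context = Object.new
    @before_filters.each { |filter| context.instance_eval(&filter) }
    body = context.instance_eval(&route)
    @after_filters.each { |filter| context.instance_eval(&filter) }
    body
  end
end

log  = []
demo = FilterDemo.new
demo.before { @user = 'sausheong' }
demo.after  { log << "served #{@user}" }

result = demo.call { "Hello #{@user}" }
puts result
```

The before filter sets @user, the route body reads it, and the after filter can still see it once the body has been produced.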

Static pages

By default, all files in a folder named public are served as static pages. For example, if you have a page.html file in the public folder, you will be able to access it from http://localhost:4567/page.html. This means that you can also serve JavaScript libraries, CSS stylesheets, and image files from the same folder.

If you want to change the default public folder, just change the settings:

set :public, File.dirname(__FILE__) + '/static'

Views

Similarly, by default Sinatra looks for view templates in a folder named views. You can also change the default directory by changing the settings as follows:

set :views, File.dirname(__FILE__) + '/templates'

View templates are files that are used to display data that is processed by a route. For example, this route will render a Haml view template, which is a file called view_page.haml in the views folder:

get '/page/view' do
  . . .
  haml :view_page
end

Besides Haml, Sinatra also supports a variety of view template types such as Erb, Erubis, Sass, Builder, and so on. We will discuss Haml in a later section in this chapter.

Note that templates always need to be referenced as symbols, even when they are in subdirectories. For example, if the Haml view template is in a file called view.haml in the views/page subfolder, then you should reference it as :'page/view'.

Layouts

While you are not required to use any layouts, if you have a file named layout.haml (or layout.erb and so on) in your views folder, it will be used as a layout template. A layout template is a view template that is re-used for multiple views. For example, this is a Haml layout:

%html
  %head
    %title Cloning Internet Applications with Ruby
  %body
    #container
      =yield

Any Haml view that is rendered will now use this layout, with the output of the view inserted at the point where yield appears.

Helpers

If you have some functions you need repeatedly, you can create helpers. Helpers in Sinatra are methods that can be reused in routes and templates.

helpers do
  def encrypt(data)
    . . .
  end
end

get '/secret/:policy' do
  encrypt(params[:policy])
end

One use of helpers we employ repeatedly in this book is to create partials. Sinatra does not support partials on its own, which can be a bit annoying, but the implementation of partials is easily done.

helpers do
  def snippet(page, options={})
    haml page, options.merge!(:layout => false)
  end
end

Essentially we just render the given page template while declaring that we do not want to use the layout.

Error handling

Sinatra handles errors in a minimalist way. There are two basic handlers. If a resource or route is not found, and a not_found handler is defined, that handler will be invoked.

not_found do   
  'This is nowhere to be found'
end

Any other errors will be caught by error. By default error will catch Sinatra::ServerError and Sinatra will pass you the error through sinatra.error in request.env.

error do   
  'Sorry there was a nasty error - ' + request.env['sinatra.error'].message
end

You can also define handlers for specific error classes, such as the following:

error MyCustomError do   
  'So what happened was...' + request.env['sinatra.error'].message
end

This could happen:

get '/' do   
  raise MyCustomError, 'something bad'
end

In which case, the MyCustomError handler will be called and the message displayed.
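The examples above assume that MyCustomError already exists. A minimal definition, assuming nothing more than a subclass of StandardError is needed, looks like this in plain Ruby. The begin/rescue here only stands in for Sinatra's error handler so that the snippet can run on its own.

```ruby
# Assumed definition: the text uses MyCustomError without defining it.
class MyCustomError < StandardError; end

begin
  raise MyCustomError, 'something bad'
rescue MyCustomError => e
  # e.message is the string passed to raise.
  message = 'So what happened was... ' + e.message
end

puts message
```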

That was a whirlwind tour of Sinatra but it has covered everything you need to know about Sinatra to start writing Sinatra applications. For more information on Sinatra please head on to http://www.sinatrarb.com.

DataMapper

DataMapper is a Ruby object-relational mapping library, one of the three main such libraries as of writing. Object-relational mapping libraries exist to resolve the impedance mismatch between Ruby, an object-oriented programming language, and a relational database. Essentially DataMapper maps database tables to classes, rows to objects, and columns to the properties and values of an object, while mapping relationships as one-to-one, one-to-many, or many-to-many.

Note

Object-oriented programming languages and relational databases are a common match and a large number of applications have been developed with such pairing of technologies. However, the underlying principles of object-oriented programming and relational databases do not match and can potentially cause problems. For example, the basic principles of classes of objects, inheritance, and polymorphism don't exist in relational databases and the expectations of the data types often differ. This mismatch is commonly known as the object-relational impedance mismatch.

One way to overcome this mismatch is to use object-relational mapping or ORM tools such as DataMapper. Such tools map a relational database to a layer of objects that can be manipulated by the application. As a result the application does not interact with the relational database directly. Instead, it manipulates data through the ORM, which in turn controls how the data is finally persisted into the database.

DataMapper and ActiveRecord (the default ORM library in Ruby on Rails) are quite similar. If you have prior experience in ActiveRecord, most of what you read here will be very familiar.

A note on the DataMapper version used in this book: as of writing, the latest version of DataMapper is 0.10.2, but we will be using version 0.9.11. This is because a feature we need for the projects in this book (self-referential many-to-many associations) is not supported in 0.10.2. In fairness, the feature was removed to prepare for a better implementation in a future version. Unfortunately, this means that for this book we will be using a slightly older version.

Installing

DataMapper is broken up into the core library, dm-core, various database adapters and a number of optional libraries collectively known as dm-more. While you can install dm-more as an umbrella library, it is generally more advisable to just install those that you need. For a basic installation, you need to install the core library as well as at least one database adapter:

gem install dm-core

The most popular adapters are probably ones that relate to the DataObjects library. The DataObjects library is an attempt to rewrite existing database drivers to conform to a standard interface and has some of the more popular databases supported. For example to install support for MySQL:

gem install do_mysql

Connecting to the database

The first thing you need to do before you start using DataMapper is to specify the connection to the database. This is easily done by specifying the database connection string:

DataMapper.setup(:default, 'mysql://localhost/database_name')

Creating models

Once you have the connection, you can define your DataMapper models. Unlike ActiveRecord (or Sequel, the other popular ORM library), DataMapper does not need a separate migration step or file to create the database tables. The database tables are created from the definition of the model itself.

An example of a DataMapper model is as follows:

class User
  include DataMapper::Resource
  property :id,         Serial
  property :email,      String, :length => 255
  property :nickname,   String, :length => 255
  property :birth_date, DateTime
  property :education,  Text
  property :work_history, Text
  property :description, Text
end

Let's go through several key elements of this definition. Firstly, all DataMapper models are classes that include the DataMapper::Resource module. This provides them with the methods needed to define the model. Each property of the model is defined with the property method, given a name and a type. Most of the types used are typical. The Serial type, however, is a shortcut for defining an auto-incrementing integer that is a primary key. Otherwise you'll need to define the key yourself like this:

property :some_id, Integer, :key => true

Note that DataMapper supports composite keys, meaning we can make more than one property in the model a primary key.

While dm-core supports the standard set of property types you'll find in any database, DataMapper supports many more types if you include dm-types, including CSV (comma-separated values), IP addresses, JSON, URIs, and so on.

Properties can be configured to be lazily loaded, which means that the value of the property is not requested from the data store by default but only loaded when its accessor is called for the first time. Some property types, such as Text, are lazily loaded by default to improve performance.

Lazily loaded properties can also be grouped so that they are loaded together: when one property in a group is loaded, the other properties in that group are loaded with it. For example, the three properties education, work_history, and description in the User model above are Text properties and are lazily loaded by default. Suppose we define them this way:

property :education,    Text, :lazy => [:show]
property :work_history, Text, :lazy => [:show]
property :description,  Text

When the education property is read, the work_history property will also be loaded from the data store, since both of them are members of the :show group. The description property, however, will only be fetched when it is asked for.
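The grouping behavior can be modeled in a few lines of plain Ruby. This is only an illustration of the idea and not DataMapper's implementation; LazyRecord, its store hash (standing in for the database row), and the sample values are all hypothetical.

```ruby
# Hypothetical illustration of lazy-load groups.
class LazyRecord
  attr_reader :loaded

  # groups maps each lazy property to its group name (nil means no group);
  # store stands in for the row held by the database.
  def initialize(store, groups)
    @store  = store
    @groups = groups
    @loaded = {}
  end

  # Fetch a property on first access, together with the rest of its group.
  def read(name)
    load_with_group(name) unless @loaded.key?(name)
    @loaded[name]
  end

  private

  def load_with_group(name)
    group = @groups[name]
    names = group ? @groups.keys.select { |key| @groups[key] == group } : [name]
    names.each { |key| @loaded[key] = @store[key] }
  end
end

record = LazyRecord.new(
  { :education => 'BSc', :work_history => 'Engineer', :description => 'Hello' },
  { :education => :show, :work_history => :show, :description => nil }
)

record.read(:education)
puts record.loaded.keys.inspect
```

After reading education, both members of the :show group are in memory while description is still untouched.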

Defining associations between models

A major benefit of ORM libraries such as DataMapper is that they provide object-oriented convenience for relationships between rows in different tables. The three main types of relationships or associations between tables are:

  • One-to-one

  • One-to-many

  • Many-to-many

One-to-one

DataMapper's one-to-one association uses the has 1 and belongs_to methods.

class User
  include DataMapper::Resource
  property :id, Serial
  has 1, :account
end
class Account
  include DataMapper::Resource
  property :id, Serial
  belongs_to :user
end

Very simply put, the has 1 method shows the user owning one account while belongs_to defines the two-way relationship back to the user.

The database tables generated from these models are users, with an id column, and accounts, with id and user_id columns.

To use these models, fire up irb.

$ irb -r models.rb
>> user = User.create
=> #<User id=1>
>> account = Account.create
=> #<Account id=1 user_id=nil>

We create a user and an account. Note that when the account is created it's not attached to any users yet.

>> user.account = account
=> #<Account id=1 user_id=nil>
>> user.save
=> true
>> user.account
=> #<Account id=1 user_id=1>

By specifying that a user has only 1 account, we added the User#account and User#account= methods to the User class. This allows us to assign our new account to the user object. Notice that immediately after assigning the account to the user, the user_id column of the account is still unpopulated. This is because we are still manipulating objects in memory. We need to persist the change by saving the object.

One-to-many

The one-to-many association can be defined with the has n and belongs_to methods, shown as follows:

class User
  include DataMapper::Resource
  property :id, Serial
  has n, :comments
end
class Comment
  include DataMapper::Resource
  property :id, Serial
  belongs_to :user
end

The database tables created from these models are users, with an id column, and comments, with id and user_id columns.

The database tables have exactly the same structure as in the one-to-one association. This is because the controls and logic are actually set by the has n method we used in the User class. Let's look at how we use the one-to-many relationship. As before, let's start by creating the user and some comments:

>> user = User.create
=> #<User id=1>
>> comment1 = Comment.create
=> #<Comment id=1 user_id=nil>
>> comment2 = Comment.create
=> #<Comment id=2 user_id=nil>

To add the comments to the user, we treat user.comments as an array and simply stuff the comments in using the << operator:

>> user.comments << comment1 << comment2

Note that user.comments can be treated as an array, and even be converted to one if necessary:

>> user.comments.class
=> DataMapper::Associations::OneToMany::Proxy
>> user.comments.to_a
=> [#<Comment id=1 user_id=1>, #<Comment id=2 user_id=1>]

Many-to-many

The many-to-many association can be defined with the has n and belongs_to methods. There are two ways of defining many-to-many associations. The first is to use a concrete model to represent the relationship between the two models. In this example, we have a user who can borrow many books and books that can be borrowed by many users. To represent the relationship between users and books, we will create a concrete model called Loan.

class User
  include DataMapper::Resource
  property :id, Serial
  has n, :loans
  has n, :books, :through => :loans
end
class Loan
  include DataMapper::Resource
  property :id, Serial
  property :created_at, DateTime

  belongs_to :user
  belongs_to :book
end
class Book
  include DataMapper::Resource
  property :id, Serial
  has n, :loans
  has n, :users, :through => :loans
end

This creates three database tables: users and books, each with an id column, and loans, with id, created_at, user_id, and book_id columns.

To use these models:

>> user1 = User.create
=> #<User id=1>
>> book1 = Book.create
=> #<Book id=1>
>> Loan.create(:book => book1, :user => user1)
=> #<Loan id=1 created_at=nil user_id=1 book_id=1>
>> user1.books.to_a
=> [#<Book id=1>]

Why can't we add the books to the user right away as we did in the one-to-many association? Unfortunately, DataMapper version 0.9.11 has a bug that does not allow this. It has been fixed in version 0.10.2, but as mentioned earlier that is not the version used in this book.

The second way of defining many-to-many associations is through an anonymous resource:

class User
  include DataMapper::Resource
  property :id, Serial
  has n, :books, :through => Resource
end
class Book
  include DataMapper::Resource
  property :id, Serial
  has n, :users, :through => Resource
end

These models generate a users table, a books table, and a join table. Notice that the join table, named books_users, has been created for you with user_id and book_id as its primary keys.

The shorter way of adding books to users works here as in one-to-many:

>> user1 = User.create
=> #<User id=1>
>> book1 = Book.create
=> #<Book id=1>
>> user1.books << book1
=> . . .
>> user1.save
=> true
>> user1.books.to_a
=> [#<Book id=1>]

There are reasons to use one way or the other. Only a concrete model can carry additional attributes, so if the relationship itself needs attributes you cannot avoid one. In the preceding example the Loan model records the date and time when the loan was made, and we can't do this with the anonymous resource. However, the anonymous resource is much shorter and simpler to maintain, and at least at this point in time it works better than the awkward creation of the many-to-many concrete model.
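What the join table does can be shown with plain Ruby structs. This sketch does not use DataMapper at all; UserRow, BookRow, LoanRow, and books_for are hypothetical names standing in for the three tables and the :through lookup.

```ruby
# Plain-Ruby stand-ins for the three tables (not DataMapper models).
UserRow = Struct.new(:id)
BookRow = Struct.new(:id)
LoanRow = Struct.new(:user_id, :book_id, :created_at)

# Follow the join rows from a user to the books that user has borrowed.
def books_for(user, loans, books)
  book_ids = loans.select { |loan| loan.user_id == user.id }.map { |loan| loan.book_id }
  books.select { |book| book_ids.include?(book.id) }
end

user  = UserRow.new(1)
books = [BookRow.new(1), BookRow.new(2)]
loans = [LoanRow.new(1, 2, Time.now)] # the extra attribute lives on the join row

borrowed = books_for(user, loans, books)
puts borrowed.map { |book| book.id }.inspect
```

The created_at value belongs to neither the user nor the book but to the loan itself, which is why a concrete join model is needed to hold it.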

Creating the database tables

Creating the database tables is relatively simple. We just need to start irb with the necessary models loaded and run auto_migrate!. Assuming that the database setup and model definitions are in a file named models.rb:

$ irb -r models.rb
>> DataMapper.auto_migrate!

This will create the necessary tables.

Finding records

One of the most important and frequent actions with DataMapper would be to find and retrieve data from the database. DataMapper provides a few methods of retrieving data. The simplest is to retrieve a record by its key:

>> User.get(1)

We can also find a record by any of the columns using the first method:

>> User.first(:nickname => 'sausheong')

We can get all the records in the table:

>> User.all

Records can also be filtered and the filters can be chained:

>> active_users = User.all(:active => true)
>> male_active_users = active_users.all(:sex => 'male')

The all and first methods can both take more than one filter, and these filters can use certain symbol operators to specify how they work. For example, the filters below indicate that we want to find all users who were born after January 1, 1980, who are not married, and who are male:

>> User.all(:birth_date.gt => '1980-01-01', :marital_status.not => 'married', :sex => 'male')

However, note that these filters are AND filters, meaning that the records retrieved must pass all the filters before they are retrieved. In the later 0.10.2 release, you can combine these queries using OR or more complex filtering conditions.
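The AND semantics and the chaining of filters can be modeled over an array of hashes in plain Ruby. This is a sketch under simplifying assumptions, not DataMapper's query engine; where, operators, and the sample users are hypothetical, and a [field, operator] pair stands in for expressions such as :birth_date.gt.

```ruby
# Hypothetical comparison operators keyed by name.
def operators
  {
    :eql => lambda { |actual, expected| actual == expected },
    :gt  => lambda { |actual, expected| actual > expected },
    :not => lambda { |actual, expected| actual != expected }
  }
end

# Keep only the records that satisfy every condition (AND semantics).
# A key is either a field name or a [field, operator] pair.
def where(records, conditions)
  records.select do |record|
    conditions.all? do |key, expected|
      field, op = key.is_a?(Array) ? key : [key, :eql]
      operators.fetch(op).call(record[field], expected)
    end
  end
end

users = [
  { :nickname => 'alice', :birth_date => '1975-06-01', :marital_status => 'single',  :sex => 'female' },
  { :nickname => 'bob',   :birth_date => '1985-03-02', :marital_status => 'single',  :sex => 'male' },
  { :nickname => 'carl',  :birth_date => '1990-09-09', :marital_status => 'married', :sex => 'male' }
]

# Filters can be chained: each call narrows the previous result.
males  = where(users, { :sex => 'male' })
result = where(males, { [:birth_date, :gt] => '1980-01-01', [:marital_status, :not] => 'married' })
puts result.map { |user| user[:nickname] }.inspect
```

The dates are compared as ISO-formatted strings here, which sort in the same order as the dates they represent.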

DataMapper is very powerful and we have only scratched the surface of its capabilities. DataMapper supports an aspect-oriented approach to callbacks or hooks, chained association calls, single table inheritance, multiple data stores, and many other features that are provided by various optional packages in dm-more. To find out more about DataMapper you should visit http://www.datamapper.org and go through the existing documentation.

Haml

Haml (which stands for XHTML Abstraction Markup Language) is a markup language that cleanly describes XHTML without the use of inline code. Haml was originally written for Ruby but has since been used in many other languages including Python, PHP, Perl, ASP.NET and even Scala.

Installing

Installing Haml is very easy and is done through RubyGems as usual:

$ sudo gem install haml

Using Haml

The easiest way to explain Haml is to do a quick comparison between Haml and HTML. This is a simple HTML snippet:

<div id='content'>
  <div class='left column'>
    <h2>Welcome to our site!</h2>
    <p>Some basic information</p>
  </div>
  <div class="right column">
    Some more information
  </div>
  <div>
    <a href="/some_url">here</a>
  </div>
</div>

And this is the Haml equivalent:

#content
  .left.column
    %h2 Welcome to our site!
    %p Some basic information
  .right.column
    Some more information
  %div
    %a{:href => "/some_url"} here

Note that the Haml template is smaller and easier to read without the opening and closing tags. We can do away with the closing tags because Haml is whitespace-sensitive, meaning that whitespace is significant in Haml. The indentation defines how the tags are nested. While this can be restrictive at times, it actually helps us to write code that is more easily debugged and maintained. Ultimately the Haml template is compiled into the same HTML.

Here are some simple rules to start using Haml:

  • All tags are written with a % prefix. For example, instead of writing <h2> you just need to write %h2. The exception is the DIV tag, which is used so often that %div can simply be omitted when a class or id is given.

  • As mentioned earlier, indentation is important and defines the nesting of the tags. For example, in the snippet above the H2 tag is at the same indentation level as the P tag. This means they are not nested but are sibling tags. If the P tag were indented one level further than the H2 tag, the P tag would be nested within the H2 tag.

  • Curly braces enclose a Ruby hash that specifies the attributes of a tag. For example, %a{:href => '/some_url'} here is compiled to <a href='/some_url'>here</a>.

  • Borrowing from CSS, we can use the . shortcut to indicate a class attribute and the # shortcut to indicate an id attribute. For example, .left.column is compiled to <div class='left column'> since DIV is assumed if no tag is used.

Haml and Ruby

While Haml is interesting and useful as a means to simplify HTML, it is only really powerful as a templating engine when combined with Ruby. Here is the snippet above, rewritten to include some Ruby code:

#content
  .left.column
    %h2 Welcome to our site #{@user.name}!
    %p Some basic information
    %ol
      - @some_array.each do |item|
        %li= item.name
  .right.column
    Some more information
  %div
    %a{:href => "/some_url"} here

There are a few ways Ruby code can be integrated within Haml:

  • To evaluate some Ruby code and insert the output into the compiled document, we use the equals (=) sign. This can be placed after the tag to place the output within the tag.

  • To evaluate some Ruby code but not insert any output into the compiled document, we use the dash (-) sign. We can place the dash sign anywhere. If the evaluated code is a block, we don't need to explicitly close the block because Haml will take care of it.

  • To evaluate some Ruby code and insert the output within some text, you can use #{} and place it within any text just as you would do with a Ruby string.

For more information on Haml please go to http://www.haml-lang.com.

Now that we have wrapped up the quick tour of the technology stack, let's get back to the book and describe how to approach reading it.