Book Image

Amazon SimpleDB Developer Guide

Book Image

Amazon SimpleDB Developer Guide

Overview of this book

SimpleDB is a highly scalable, simple-to-use, and inexpensive database in the cloud from Amazon Web Services. But in order to use SimpleDB, you really have to change your mindset. This isn't a traditional relational database; in fact it's not relational at all. For developers who have experience working with relational databases, this may lead to misconceptions as to how SimpleDB works.This practical book aims to address your preconceptions on how SimpleDB will work for you. You will be quickly led through the differences between relational databases and SimpleDB, and the implications of using SimpleDB. Throughout this book, there is an emphasis on demonstrating key concepts with practical examples for Java, PHP, and Python developers.You will be introduced to this massively scalable schema-less key-value data store: what it is, how it works, and why it is such a game-changer. You will then explore the basic functionality offered by SimpleDB including querying, code samples, and a lot more. This book will help you deploy services outside the Amazon cloud and access them from any web host.You will see how SimpleDB gives you the freedom to focus on application development. As you work through this book you will be able to optimize the performance of your applications using parallel operations, caching with memcache, asynchronous operations, and more.
Table of Contents (16 chapters)
Amazon SimpleDB Developer Guide
Credits
Foreword
About the Authors
About the Reviewers
Preface

How does SimpleDB work?


The best way to wrap your head around the way SimpleDB works is to picture a spreadsheet that contains your structured data. For instance, a contact database that stores information on your customers can be represented in SimpleDB as follows:

As SimpleDB is a different database metaphor, new terms have been introduced. The use of this new terminology by Amazon stresses that the traditional assumptions may not be valid.

Domain

The entire customers table will be represented as the domain Customers. Domains group similar data for your application, and you can have up to 100 domains per AWS account. If required, you can increase this limit further by filling out a form on the SimpleDB website. The data stored in these domains is retrieved by making queries against the specific domain. There is no concept of joins as in the relational database world; therefore, queries run within a specific domain and not across domains.

Item

Each customer is represented by a unique Customer ID. Items are similar to rows in a database table. Each item identifies a single object and contains data for that individual item as a number of key-value attributes. Each item is identified by a unique key or identifier, or in traditional terminology, the primary key. SimpleDB does not support the concept of auto-incrementing keys, and most people use a generated key such as the unix timestamp combined with the user identifier or something similar as the unique identifier for an item. You can have up to one billion items in each domain.

Attributes

Each customer item will have distinguishing characteristics that are represented by an attribute. A customer will have a name, a phone number, an address, and other such attributes, which are similar to the columns in a table in a database. SimpleDB even enables you to have different attributes for each item in a domain. This kind of schema independence lets you mix and match items within a domain to satisfy the needs of your application easily, while at the same time enables you to take advantage of the benefits of the automatic indexing provided by SimpleDB. If your company suddenly decides to start marketing using Twitter, you can simply add a new attribute to your customer domain for the customers who have a Twitter ID! In traditional database terminology, there is no need to add a new column to the table.

Values

Each customer attribute will be associated with a value, which is the same as a cell in a spreadsheet or the value of a column in a database. A relational database or a spreadsheet supports only a single value for each cell or column, while SimpleDB allows you to have multiple values for a single attribute. This lets you do things such as store multiple e-mail addresses for a customer while taking advantage of automatic indexing, without the need for you to manually create new and separate columns for each e-mail address, and then index each new column. In a relational DB, a separate table with a join would be used to store the multiple values. Unlike a delimited list in a character field, the multiple values are indexed, enabling quick searching.

It is a simple way of modeling your data, but at the same time, it is different from a relational database model that is familiar to most users. The following table compares SimpleDB components with a spreadsheet and a relational database:

Relational Database

Spreadsheet

SimpleDB

Table

Worksheet

Domain

Row

Row

Item

Column

Cell

Attribute

Value

Value

Value(s)