The following table makes the data model easier to understand. In an RDBMS, a table is organized into rows and columns, but in DynamoDB we will not use these two words (except in this paragraph). If they do slip in by mistake, please remember that rows are called items and columns are called attributes in DynamoDB, as shown in the following table:
Having said that, let's look at how a table is realized in DynamoDB. Throughout this book, we are going to use a common illustration, a library catalogue, and discuss examples related to it. Let's take a look at the library catalogue table:
If you wish to know how to create a table with the attributes mentioned in Table 1.1, read the DynamoDB data types section first. During the creation of a DynamoDB table, it is only possible to specify secondary index attributes, and hash and range key attributes. It is not possible to specify other attributes (previously mentioned as optional attributes) during the creation of the table. In fact, except for hash and range key attributes, all other attributes are part of the items (rows); that is the reason why we don't specify these optional attributes while creating the table.
Let's call the table Tbl_Book. The table has seven attributes. The first two attributes act as a compound primary key. We set the first attribute, BookTitle, as the hash key and the second attribute, Author, as the range key. Except for the primary key attributes, all other attributes are optional, and we need not specify non-primary-key attributes while creating the table.
Therefore, during the creation of the Tbl_Book table in DynamoDB, we will specify only the BookTitle and Author attributes. All other attributes can be specified while inserting an item into this table.
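To make this concrete, here is a minimal sketch in Python of the key-related part of a table creation request. The field names (TableName, AttributeDefinitions, KeySchema) follow the DynamoDB CreateTable API, but the helper build_create_table_request is hypothetical, and typing BookTitle and Author as String ("S") is our assumption. A real request also needs throughput or billing settings, which are outside the scope of this section.

```python
def build_create_table_request(table_name, hash_key, range_key=None):
    """Build the key-related part of a CreateTable request (hypothetical helper).

    hash_key and range_key are (attribute_name, type_code) pairs, where the
    type code is "S" (String), "N" (Number), or "B" (Binary).
    """
    keys = [(hash_key, "HASH")]
    if range_key is not None:
        keys.append((range_key, "RANGE"))

    return {
        "TableName": table_name,
        # Only key attributes are declared; optional attributes never appear here.
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": type_code}
            for (name, type_code), _ in keys
        ],
        "KeySchema": [
            {"AttributeName": name, "KeyType": key_type}
            for (name, _), key_type in keys
        ],
    }


# Assumption: both key attributes are strings.
request = build_create_table_request(
    "Tbl_Book", hash_key=("BookTitle", "S"), range_key=("Author", "S")
)
```

Notice that none of the other five attributes of Tbl_Book appear anywhere in the request.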
One quick question: while inserting the first item into the table, do we need to specify the PubDate attribute as null? The answer is no; every item can have its own attributes, along with the mandatory primary key attributes specified during table creation. In fact, if we want to insert a fifth item with a new attribute named CoverPhoto, we can do it without affecting the previous four items.
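The following sketch models this behavior with a plain Python dictionary standing in for the table. It is not DynamoDB client code; put_item here is a hypothetical helper, and the attribute values are made up for illustration. It only checks the one rule stated above: every item must carry the primary key attributes, and everything else is optional.

```python
KEY_ATTRIBUTES = ("BookTitle", "Author")

# In-memory stand-in for Tbl_Book, keyed by (hash key, range key).
tbl_book = {}


def put_item(item):
    """Store an item, insisting only on the primary key attributes."""
    missing = [name for name in KEY_ATTRIBUTES if name not in item]
    if missing:
        raise ValueError(f"missing primary key attribute(s): {missing}")
    tbl_book[(item["BookTitle"], item["Author"])] = item


# No PubDate here, and no need to set it to null.
put_item({"BookTitle": "SCJP", "Author": "Kathy"})

# A later item may introduce a brand new attribute (values are illustrative).
put_item({"BookTitle": "SCJP", "Author": "Khalid", "CoverPhoto": "scjp.png"})
```

The first item is left untouched by the second; it simply has no CoverPhoto attribute.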
Unlike RDBMS tables, the attributes (that is, what we call columns in RDBMS) of DynamoDB tables are stored in the item itself as key-value pairs. The attribute name becomes the key and the attribute value becomes the value, so every item carries its own attributes. There is a tradeoff here: fetching an item fetches not only the attribute values but also the attribute names. So if you choose very long attribute names, every item grows and fetches become less efficient.
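A rough way to see this tradeoff is to add up the bytes of the names and the values of an item. The sketch below does that for string-only items; approx_item_size is a hypothetical helper, the attribute names are invented for the comparison, and the UTF-8 byte count is a simplification rather than an exact accounting of how DynamoDB measures items.

```python
def approx_item_size(item):
    """Approximate size in bytes of an item whose attributes are all strings.

    Both the attribute name and the attribute value contribute to the total.
    """
    return sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in item.items()
    )


# Same value, different attribute names (both names are illustrative).
short_name_item = {"Lang": "English"}
long_name_item = {"LanguageOfThisEdition": "English"}

short_size = approx_item_size(short_name_item)
long_size = approx_item_size(long_name_item)
```

The value is identical in both items, so the whole difference between short_size and long_size comes from the attribute name, and it is paid again for every item that uses it.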
Let's take a look at a few valid table schemas that are supported by DynamoDB:
The schema for Table 1.7 is invalid because it doesn't have the hash key attribute that is mandatory to create the table. Table 1.8 is invalid for the same reason. The schemas for Table 1.9 and Table 1.10 are invalid because the hash and range keys must be either String, Number, or Binary. They cannot be of the Set type. We will discuss the Set data type at the end of this chapter.
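These rules can be written down as a small check. The sketch below is a hypothetical validator, not part of any DynamoDB library; it assumes the permitted key types are String, Number, and Binary, and it returns a list of problems instead of raising, so that several schemas can be compared side by side.

```python
ALLOWED_KEY_TYPES = {"String", "Number", "Binary"}


def key_schema_errors(hash_key=None, range_key=None):
    """Return the reasons a key schema is invalid (empty list if it is valid).

    Each key is an (attribute_name, type_name) pair, or None if absent.
    """
    errors = []
    if hash_key is None:
        errors.append("a hash key is mandatory")
    for role, key in (("hash", hash_key), ("range", range_key)):
        if key is not None and key[1] not in ALLOWED_KEY_TYPES:
            errors.append(
                f"{role} key {key[0]!r} has unsupported type {key[1]!r}"
            )
    return errors


# Valid: hash key only, or hash key plus range key.
hash_only = key_schema_errors(hash_key=("BookTitle", "String"))
hash_and_range = key_schema_errors(
    hash_key=("BookTitle", "String"), range_key=("Author", "String")
)

# Invalid: no hash key, or a set-typed key.
range_only = key_schema_errors(range_key=("Author", "String"))
set_typed = key_schema_errors(
    hash_key=("BookTitle", "String"), range_key=("Author", "StringSet")
)
```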
What is the difference between the hash key and the range key?
What is the difference between the String data type and Set? Is there any other data type that I should know about?
During table creation, what mandatory information should I provide?
Let us discuss the answers to these questions, which will help us understand the DynamoDB data model better. Here comes the answer to the first question. The hash and range keys are two attributes that together act as a (compound) primary key. A range key must always be accompanied by a hash key, whereas a hash key may stand alone. The hash key is an attribute that every table must have. Items are not ordered by hash key: items with the same hash key value go to the same partition, but there is no ordering across hash key values. Within each hash key value, however, items are always ordered by their range key values. Applying these rules to the table we created earlier, its order will look as follows:
So there is no guarantee that the table data will be sorted by the hash key (that is, BookTitle), but it will be hashed, or grouped, based on the hash key attribute value. That is the reason why items that share a hash key value, Item4 among them, are placed close together. On the other hand, the records are ordered on the range key (that is, Author). That is the reason why the book SCJP authored by Kathy is first, followed by the book authored by Khalid. This answers the first question.
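The grouping and ordering behavior can be imitated in a few lines of Python. This is only a sketch of the idea, not of DynamoDB's internals: the arrange helper is hypothetical, a dictionary stands in for the partitions, and the second book title is a placeholder. The SCJP entries mirror the example discussed above.

```python
from collections import defaultdict


def arrange(items, hash_attr, range_attr):
    """Group items by hash key value and sort each group by range key value."""
    groups = defaultdict(list)
    for item in items:
        groups[item[hash_attr]].append(item)
    # No ordering is promised across hash key values; only the order
    # inside each group (by range key) is meaningful.
    return {
        hash_value: sorted(group, key=lambda item: item[range_attr])
        for hash_value, group in groups.items()
    }


items = [
    {"BookTitle": "SCJP", "Author": "Khalid"},
    {"BookTitle": "Placeholder Title", "Author": "Placeholder Author"},
    {"BookTitle": "SCJP", "Author": "Kathy"},
]

arranged = arrange(items, hash_attr="BookTitle", range_attr="Author")
scjp_authors = [item["Author"] for item in arranged["SCJP"]]
```

Even though Khalid's book was inserted first, the SCJP group comes back ordered by Author, with Kathy ahead of Khalid.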
An attribute of the type String can hold only a simple string. For example, in the previous table we have two attributes (one of them is Language2) to store the edition language of the book. If a book has 10 different language editions, then we would be left with too many attributes in an item (which will reduce fetch efficiency, as discussed earlier). So a better solution is to change the Language attribute from a simple String type to StringSet, as shown in the following table:
The same cannot be done for the Author attribute. Can you guess why? If not, you can go back and take a look at Table 1.9 and Table 1.10. Can you guess now? It's because neither the hash key nor the range key can be of the Set type.
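To see the difference between String and StringSet at the item level, the sketch below converts a Python dictionary into DynamoDB's typed attribute notation, where "S" marks a string and "SS" marks a string set. The two helper functions are hypothetical and handle only these two types, and the language values are made up for illustration.

```python
def to_attribute_value(value):
    """Convert a Python str or non-empty set of str into typed notation."""
    if isinstance(value, str):
        return {"S": value}
    if (
        isinstance(value, (set, frozenset))
        and value
        and all(isinstance(member, str) for member in value)
    ):
        # Sorted only to make the output deterministic; a set has no order.
        return {"SS": sorted(value)}
    raise TypeError(f"unsupported attribute value: {value!r}")


def to_dynamodb_item(item):
    """Convert every attribute of an item into typed notation."""
    return {name: to_attribute_value(value) for name, value in item.items()}


# One Language attribute holding every edition language (values illustrative),
# instead of one attribute per language.
book = {
    "BookTitle": "SCJP",
    "Author": "Kathy",
    "Language": {"English", "German"},
}

typed_book = to_dynamodb_item(book)
```

The key attributes BookTitle and Author stay plain strings, while Language becomes a single set-typed attribute no matter how many editions the book has.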
During table creation, there are two scenarios that decide the mandatory parameters needed to create a DynamoDB table.
Hash primary key: In this scenario, we provide two parameters. The first parameter is the table name and the second parameter is the name and type of the hash key.
Hash and range primary key: In this scenario, we must (and we can only) provide three parameters. The first parameter is the table name, the second parameter is the name and type of the hash key, and the third parameter is the name and type of the range key.
There are different interfaces available to interact with DynamoDB. Take a look at Chapter 2, DynamoDB Interfaces, to know more about the interfaces. We are now done with the basics of this chapter.