Book Image

Redis Applied Design Patterns

By : Arun Chinnachamy
Book Image

Redis Applied Design Patterns

By: Arun Chinnachamy

Overview of this book

<p>With new data stores making their way onto the market, it's necessary for businesses to understand the features and techniques to use data stores most effectively. Redis Applied Design Patterns is designed to guide you into the world of Redis and will help you understand how business problems can be solved using Redis in your application stack.</p> <p>This book introduces you to the NoSQL way of thinking and how it is different from SQL. It helps you understand various functionality of Redis through a series of use cases designed for you to learn the more complex and less well-known features of Redis in an incremental way. By the time you've read this book, you will be capable of designing a system using Redis. This book will guide you through various use cases along with code samples, easy-to-understand diagrams, and tips to help you take advantage of Redis. The book explains the benefits of adding Redis to the application stack and discusses various practical use cases for Redis such as the caching system, commenting system, and social networking.</p>
Table of Contents (18 chapters)

Data modeling for relational databases


A new application almost always starts with data modeling. In traditional data modeling for relational databases, this process is known as data normalization, which focuses on removing data redundancy and minimizing data dependency. The modeling formulates a series of tables, which are defined by its schema (rows and columns of the table).

Normalization

Why normalization? Data normalization helps to reduce dependency and redundancy of data, which makes it easier to update the data, while maintaining its consistency.

Normalization ensures that the update is done only once for an entity. Let's take a typical example to showcase how data normalization works:

ID

Name

Subject

1

John

Biology

2

Smith

Math

3

John

Math

4

Scott

Biology

5

Smith

Physics

In this example, the data in the Name field is repeating as one student can take multiple subjects. In normalization, it is required to have data that is tabular with each row-column intersection having one value. The redundancy of data can create inconsistencies in the data during updates, where there is a real possibility of the same data having different values. To make the preceding table normalized, we can consider adding another column for extra subjects, which looks something like this:

ID

Name

Subject_1

Subject_2

1

John

Math

Biology

2

Smith

Physics

Math

3

Scott

Biology

Null

The preceding structure creates its own problems. For instance, if you want to know all the students who are taking a Math class, you have to check two columns. The problem gets more complex if there is a student who is taking more than two subjects, which requires you to add a column to the table. Updating and deleting a record becomes more problematic.

In this scenario, to maintain consistency and to avoid redundancy, we have to create two tables. The first table will contain all the student information as follows:

Student_ID

Name

1

John

2

Smith

3

Scott

The second table will contain the information related to the subjects offered as follows:

Subject_ID

Subject_name

1

Biology

2

Math

3

Physics

Now, we have to create a map of uniquely identifiable IDs in the form of a table that shows which student is taking what subjects:

Student_ID

Subject_ID

1

1

1

2

2

2

2

3

3

1

In the preceding data model, we do not have to worry about data inconsistencies. This allows us to manipulate the data easily, and allows insertions and deletions of data without having to worry about artifacts. However, a problem arises when reading the data. If you want to know the name of the student who is taking Math, the SQL query will be as follows:

SELECT a.Name from STUDENTS a JOIN STU_SUB_MAP b ON a.student_id=b.student_id JOIN SUBJECTS c ON c.Subject_id = b.Subject_id where c.subject_name='Math'

The result of the query will look like the following table:

Name

John

Smith

Even though we have got the correct result from the database, the problem lies in the amount of work the database needs to do to get this result. It has to join three different tables, read the data, and display the results. The JOIN operation is a relatively expensive operation, which can be mitigated to some extent by using indexes. However, as the table size increases, the amount of time the database takes to compute also increases.

Though normalization gives us highly structured tables with the utmost consistency, sometimes we need to keep the data in a denormalized format. This is usually necessary in scenarios where performance takes precedence over structure. A denormalized structure is more beneficial if your application has much more reads than writes.