Decoding database structures – relational and non-relational

Before we delve into the details of relational and non-relational databases, let us first understand the meaning of the term database and why it is important to know about databases.

What is a database?

Most of us have heard of a database, right? To put it simply, it is a collection of information that is stored in an organized and logical manner. This helps people keep track of things and find information quickly. For example, imagine you are walking into a superstore and looking for a specific item, such as a phone charger. To find it, you would use logical categorization. First, you would go to the electronics section, but this section would have all sorts of electronics the superstore had to offer. So, you would then look for a section called phones and accessories and search for the specific phone charger that was compatible with your phone.

By using logical reasoning, you can determine the location of the object and purchase the charger successfully. If we consider the process from the perspective of the superstore, we can see that they have divided the entire area into sections such as electronics, merchandise, and groceries, and further subdivided it into rows and columns known as aisles. They store each object according to its category in an organized manner, which can be accessed through the store’s database.

The business definition of a database is that it is a collection of information stored on a server that is accessed regularly for analysis and decision-making. The information is organized into tables, which are similar to spreadsheets, with rows and columns. A database can contain multiple tables, and a server can have multiple databases for different categories or clients. For example, a university database may contain information on students, teachers, and subjects, while a superstore database may contain data on products, orders, store locations, and customers. Each row in the database represents a specific occurrence or transaction. The database stores information and its relationships.

Types of databases

Database Management Systems (DBMSs) are used to store and manage data in a database. The most commonly used language to extract information from a database is SQL. The history of databases dates back several decades, specifically to the 1970s. Since then, databases have evolved into two broad categories, known as relational and non-relational.

Relational databases

A relational database, or relational DBMS, stores data in the form of tables or entities that we want to track, such as customers and orders. The data about these entities is stored in relations, which are 2D tables of rows and columns, similar to a spreadsheet. Each row contains data, and each column contains different attributes about that entity.

Figure 1.1 – Relational database

For instance, in a table/entity that contains information about customers, the attributes or columns could include Name, Phone Number, Address, and Gender. The rows would then represent specific information for each customer in a separate row.

For example, we could have a customers table as follows:

Customer_ID	Name	Address	Phone	Gender	Email
1	Joey	Texas	834-2345	M	[email protected]
2	Ron	Tennessee	987-6543	M	[email protected]
3	Fred	New York	876-5678	M	[email protected]
4	Tom	LA	765-7654	M	[email protected]
5	Mary	Georgia	124-0987	F	[email protected]

Figure 1.2 – Customers table

Every row in a relational database should have a unique key, which we call the primary key (discussed later in the chapter). This key can be used as a foreign key in a different table to build logical referential relations between the two tables. The relations between the fields and tables are known as schemas. To extract data from databases, we use SQL queries.

These are some of the advantages of relational databases:

Highly efficient
High readability as data is sorted and unique
High data integrity
Normalized data

Non-relational databases

A non-relational database stores data in a non-tabular format, meaning it does not have a structure of tables and relations. Instead, this type of database stores information in various ways, such as key-value and document-based databases. In a key-value database, data is stored in two parts: a key and its corresponding value. Each key is unique and can only connect to one value in the collection. In contrast, a document-oriented database pairs a key with a document that contains a complex combination of several key-value pairs. Non-relational databases, also known as NoSQL databases, are more flexible than traditional relational databases. Some commonly used non-relational databases include MongoDB, Cassandra, Amazon DynamoDB, and Apache HBase.

Figure 1.3 – NoSQL databases

Key

Document

2022

{  Customer ID: 1234,  Customer Name: Joey,  Customer address: XYZTX,  Order details:    {     Order 1: {product 1, product description}     Order 2: {Product 1, product description}    }}

2023

{  Customer ID:5667,  Customer Name: Ron,  Customer address: LKJHNTN,  Order details:    {     Order 1: {product 1, product description}     Order 2: {Product 1, product description}    }}

Figure 1.4 – Non-relational database example

These are some of the advantages of non-relational databases:

Simple management – no sorting needed, so data can be directly dumped into the database without any preprocessing
Higher readability of a particular document, especially when the dataset contains big data, avoiding the need to parse through multiple tables and write complex queries
Can be scaled to a huge level by splitting the servers into multiple clusters and managing the CPU utilization on each of these clusters

Let’s understand that last point in detail. Multiple clusters refers to a distributed computing architecture that consists of multiple servers or nodes, each serving a different purpose, but working together to achieve a common goal. In this context, a cluster typically consists of a group of interconnected computers that works together to provide a more powerful and scalable computing environment. Each cluster may have its own dedicated resources, such as CPU, memory, and storage, and can be managed independently. By splitting servers into multiple clusters, the workload can be distributed more efficiently, allowing for greater scalability and flexibility. For example, suppose you have a large-scale application that requires a lot of processing power and storage. In that case, you might split your servers into multiple clusters and distribute the workload across those clusters. This way, you can achieve better performance, as well as reduce the risk of a single point of failure. Overall, multiple clusters offer several advantages, including increased scalability, improved performance, better fault tolerance, and the ability to handle large workloads more efficiently.

Tech Concepts

Programming languages

Tech Tools

Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.

Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.

50+ new titles added per month and exclusive early access to books as they are being written.

Data Wrangling with SQL

By : Raghav Kandarpa, Shivangi Saxena

Data Wrangling with SQL

By: Raghav Kandarpa, Shivangi Saxena

Overview of this book

Decoding database structures – relational and non-relational

What is a database?

Types of databases

Relational databases

Non-relational databases

Data Wrangling with SQL

By : Raghav Kandarpa, Shivangi Saxena

Data Wrangling with SQL

By: Raghav Kandarpa, Shivangi Saxena

Overview of this book

Decoding database structures – relational and non-relational

What is a database?

Types of databases

Relational databases

Non-relational databases

Confirmation

Buy this book with your credits?

Submit Your Feedback

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access