Microsoft Certified Azure Data Fundamentals (Exam DP-900) Certification Guide

By: Marcelo Leite

Overview of this book

Passing the DP-900 Microsoft Azure Data Fundamentals exam opens the door to a myriad of opportunities for working with data services in the cloud. But it is not an easy exam and you'll need a guide to set you up for success and prepare you for a career in Microsoft Azure. Absolutely everything you need to pass the DP-900 exam is covered in this concise handbook. After an introductory chapter covering the core terms and concepts, you'll go through the various roles related to working with data in the cloud and learn the similarities and differences between relational and non-relational databases. This foundational knowledge is crucial, as you'll learn how to provision and deploy Azure's relational and non-relational services in detail later in the book. You'll also gain an understanding of how to glean insights with data analytics at both small and large scales, and how to visualize your insights with Power BI. Once you reach the end of the book, you'll be able to test your knowledge with practice tests with detailed explanations of the correct answers. By the end of this book, you will be armed with the knowledge and confidence to not only pass the DP-900 exam but also have a solid foundation from which to embark on a career in Azure data services.

Understanding the core data concepts

To start, let’s define the terminology used in the data world so that the concepts that follow are easy to interpret and apply to specific technologies.

What is data?

Data is a record, also called a fact, which can be a number, text, or description used to make decisions. Data only generates intelligence once it is processed; processed data is called information or insights.

Data is classified into three basic formats: structured, semi-structured, and unstructured data. We will learn about them all in the following sections.

Structured data

Structured data is formatted and typically stored in a table represented by columns and rows. This data is found in relational databases, which organize their table structures in a way that creates relationships between these tables.

The following figure shows an example of a simple table with structured data:

Figure 1.1 – Example of structured data in a database

In this example, the table called CUSTOMER has seven columns and six records (rows) with different values.

The CUSTOMER table could be part of a customer relationship management (CRM) database, a financial system, or an enterprise resource planning (ERP) system, among other types of business applications.
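To make the idea of a fixed table structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names and the second row are assumptions made for illustration; they are not the seven columns shown in the figure.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical columns: the figure's exact column names are not reproduced here.
conn.execute(
    """
    CREATE TABLE CUSTOMER (
        CUSTOMER_ID INTEGER PRIMARY KEY,
        FIRST_NAME  TEXT NOT NULL,
        LAST_NAME   TEXT NOT NULL,
        CITY        TEXT,
        COUNTRY     TEXT
    )
    """
)

# Every row must fit the same columns; that fixed shape is what makes the data structured.
rows = [
    (10302, "Leo", "Boucher", "Nantes", "France"),
    (10303, "Ana", "Silva", "Lisbon", "Portugal"),  # illustrative record only
]
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?, ?)", rows)

# Filter by column value, as you would in any relational database.
french_customers = conn.execute(
    "SELECT FIRST_NAME, LAST_NAME FROM CUSTOMER WHERE COUNTRY = ?", ("France",)
).fetchall()
print(french_customers)
```

Because every record shares the same columns, a query can filter on COUNTRY without first checking whether that attribute exists.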

Semi-structured data

Semi-structured data has records with attributes, similar to columns, but the records are not organized in a tabular way like structured data. One of the most widely used formats for semi-structured data is JavaScript Object Notation (JSON). The following example shows the structure of a JSON file containing the registration of one customer:

## JSON FILE - Document 1 ##
{
  "CUSTOMER_ID": "10302",
  "NAME": {
    "FIRST_NAME": "Leo",
    "LAST_NAME": "Boucher"
  },
  "ADDRESS": {
    "STREET": "54, rue Royale",
    "CITY": "Nantes",
    "ZIP_CODE": "44000",
    "COUNTRY": "France"
  }
}

In this example, the JSON file contains a single record, like one row of the structured data table. Other JSON layouts, and similar file formats, can hold multiple records in the same file.
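As a rough illustration of how an application might read the customer document above, the following sketch uses Python's standard json module. The PHONE attribute is hypothetical and is included only to show that an attribute may be missing from a semi-structured record.

```python
import json

# The customer document from the example, as a JSON string.
document = """
{
  "CUSTOMER_ID": "10302",
  "NAME": {"FIRST_NAME": "Leo", "LAST_NAME": "Boucher"},
  "ADDRESS": {
    "STREET": "54, rue Royale",
    "CITY": "Nantes",
    "ZIP_CODE": "44000",
    "COUNTRY": "France"
  }
}
"""

# json.loads turns the text into nested Python dictionaries.
customer = json.loads(document)

# Nested attributes are reached by following the keys.
full_name = f'{customer["NAME"]["FIRST_NAME"]} {customer["NAME"]["LAST_NAME"]}'
city = customer["ADDRESS"]["CITY"]

# No schema guarantees that an attribute exists, so read optional ones defensively.
# "PHONE" is a hypothetical attribute that this document does not contain.
phone = customer.get("PHONE")

print(full_name, city, phone)
```

Unlike a table row, the document nests its attributes (NAME and ADDRESS are objects of their own), and the reader has to handle attributes that are absent.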

In addition to JSON documents, data stored in key-value and graph databases is also considered semi-structured.

A key-value database stores data as an associative array, in which each record is identified by a unique key. The value written to a record can take a variety of formats, including numbers, text, and even full JSON documents.

The following is an example of a key-value database:

Figure 1.2 – Example of a key-value database

As you can see in the preceding figure, each record can contain different attributes. They are stored in a single collection, with no predefined schema, tables, or columns, and no relationships between the entities; this differentiates the key-value database from the relational database.
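The key-value model can be sketched with a plain Python dictionary. This is only an in-memory illustration, not how a production key-value service is implemented; the key naming scheme and the sample values are assumptions.

```python
# A single collection: unique key -> value, with no predefined schema.
store = {}


def put(key, value):
    """Write (or overwrite) the value stored under a unique key."""
    store[key] = value


def get(key, default=None):
    """Read the value for a key, or return a default if the key is absent."""
    return store.get(key, default)


# Values can be documents with different attributes per record...
put("customer:10302", {"NAME": "Leo Boucher", "CITY": "Nantes"})
put("customer:10303", {"NAME": "Ana Silva", "LOYALTY_POINTS": 120})  # illustrative
# ...or simple numbers and text.
put("counter:visits", 42)

print(get("customer:10302")["CITY"])
print(get("counter:visits"))
print(get("customer:99999"))  # unknown key falls back to the default
```

Lookups go through the key only; there are no columns to filter on and no relationships between records, which is the contrast with the relational model described above.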

A graph database is used to store data with complex relationships. It contains nodes (information about objects) and edges (information about the relationships between objects). The database therefore defines what the objects are and how they relate to one another, while individual records can still have different formats. The following is a representation of nodes and edges in a graph database of sales and deliveries:

Figure 1.3 – Example of a graph database

The diagram shows how the relationships around the ORDER entity are represented in a graph database, connecting the CUSTOMER, LOCATION, SUPPLIER, and PRODUCT nodes. This can speed up query processing, because the graph already stores the relationships and can return them quickly.
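The node-and-edge idea can be sketched in a few lines of Python. The node identifiers and relationship labels (PLACED, CONTAINS, and so on) are hypothetical and are not taken from the figure; a real graph database would also index its edges rather than scan a list.

```python
# Nodes hold object information; attributes may differ from node to node.
nodes = {
    "order:1": {"type": "ORDER"},
    "customer:10302": {"type": "CUSTOMER", "name": "Leo Boucher"},
    "product:7": {"type": "PRODUCT"},
    "supplier:3": {"type": "SUPPLIER"},
    "location:nantes": {"type": "LOCATION", "city": "Nantes"},
}

# Edges hold relationship information as (source, label, target) triples.
# The labels are assumptions made for this sketch.
edges = [
    ("customer:10302", "PLACED", "order:1"),
    ("order:1", "CONTAINS", "product:7"),
    ("supplier:3", "SUPPLIES", "product:7"),
    ("order:1", "DELIVERED_TO", "location:nantes"),
]


def related(node_id):
    """Return (label, other_node) pairs for every edge touching node_id."""
    result = []
    for source, label, target in edges:
        if source == node_id:
            result.append((label, target))
        elif target == node_id:
            result.append((label, source))
    return result


# Everything directly connected to the order, found by following edges.
order_links = related("order:1")
for label, other in order_links:
    print(label, other, nodes[other]["type"])
```

The relationships are stored as data in their own right, so finding what surrounds an order is a matter of following edges rather than matching values across tables.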

Unstructured data

In addition to structured and semi-structured data, there is also unstructured data, such as audio, videos, images, or binary records without a defined organization.

This data can also be processed to generate information, but the type of storage and processing for this is different from that of structured and semi-structured data. It is common, for example, for unstructured data such as audio to be transcribed using artificial intelligence, generating a mass of semi-structured data for processing.

Now that you understand the basic data formats, let’s look at how that data is stored in a cloud environment.

How is data stored in a modern cloud environment?

Depending on the data format (structured, semi-structured, or unstructured), cloud platforms offer different storage solutions. In Azure, structured data can be stored in Azure SQL Database, Azure Database for PostgreSQL, Azure Database for MySQL, and database servers installed on virtual machines, such as SQL Server on a virtual machine in Azure. These are called relational databases.

Semi-structured data can be stored in Azure Cosmos DB, and unstructured data (such as videos and images) can be stored in Azure Blob Storage or in Azure Data Lake Storage, a platform built on Blob Storage and optimized for queries and processing.

These services are delivered by Azure in the following service models:

Infrastructure as a service (IaaS) – Databases deployed on virtual machines
Platform as a service (PaaS) – Managed database services, where the responsibility for managing the virtual machine and the operating system lies with Azure

For these database services to be used, they must be provisioned and configured to receive the data properly.

One of the most important steps after provisioning a service is configuring access control. Azure lets you create custom roles for access control, but in general, we maintain at least three profiles:

Read-only – Users can read existing data on that service, but they cannot add new records or edit or delete them
Read/Write – Users can read, create, delete, and edit records
Owner – Highest access privilege, including the ability to manage other users’ permissions to use this data

Once these profiles are configured, you can add users to them to grant access to the data stores and databases.

Let’s look at an example. You are an administrator of a CUSTOMER database, and you have the Owner profile. You grant Read/Write access to the head of the sales department and Read-only access to the salespeople.
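The three profiles and the CUSTOMER database example can be modeled as a small permission check. This is a conceptual sketch only: it does not call any Azure API, and the action names and user names are assumptions chosen to mirror the text.

```python
from enum import Enum


class Profile(Enum):
    READ_ONLY = "Read-only"
    READ_WRITE = "Read/Write"
    OWNER = "Owner"


# Actions each profile may perform (hypothetical action names).
PERMISSIONS = {
    Profile.READ_ONLY: {"read"},
    Profile.READ_WRITE: {"read", "create", "edit", "delete"},
    Profile.OWNER: {"read", "create", "edit", "delete", "manage_permissions"},
}

# Users of the CUSTOMER database and the profile assigned to each.
assignments = {
    "administrator": Profile.OWNER,
    "sales_lead": Profile.READ_WRITE,
    "salesperson": Profile.READ_ONLY,
}


def is_allowed(user, action):
    """Return True if the user's profile grants the requested action."""
    profile = assignments.get(user)
    # Users without an assigned profile get no access at all.
    return profile is not None and action in PERMISSIONS[profile]


print(is_allowed("salesperson", "read"))
print(is_allowed("salesperson", "delete"))
print(is_allowed("sales_lead", "edit"))
print(is_allowed("administrator", "manage_permissions"))
```

Attaching permissions to profiles rather than to individual users means that onboarding a new salesperson is a single assignment, and a user with no profile has no access by default.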

In addition to the permissions configuration, it is important to review all network configurations, data retention, and backup patterns, among other administrative activities. These management tasks will be covered in Chapter 7, Provisioning and Configuring Relational Database Services in Azure.

Every database scenario has different access requirements, so it is important (as in the example) to define precisely the level of access each profile needs.