Book Image

Business Intelligence with Databricks SQL

By : Vihag Gupta
Book Image

Business Intelligence with Databricks SQL

By: Vihag Gupta

Overview of this book

In this new era of data platform system design, data lakes and data warehouses are giving way to the lakehouse – a new type of data platform system that aims to unify all data analytics into a single platform. Databricks, with its Databricks SQL product suite, is the hottest lakehouse platform out there, harnessing the power of Apache Spark™, Delta Lake, and other innovations to enable data warehousing capabilities on the lakehouse with data lake economics. This book is a comprehensive hands-on guide that helps you explore all the advanced features, use cases, and technology components of Databricks SQL. You’ll start with the lakehouse architecture fundamentals and understand how Databricks SQL fits into it. The book then shows you how to use the platform, from exploring data, executing queries, building reports, and using dashboards through to learning the administrative aspects of the lakehouse – data security, governance, and management of the computational power of the lakehouse. You’ll also delve into the core technology enablers of Databricks SQL – Delta Lake and Photon. Finally, you’ll get hands-on with advanced SQL commands for ingesting data and maintaining the lakehouse. By the end of this book, you’ll have mastered Databricks SQL and be able to deploy and deliver fast, scalable business intelligence on the lakehouse.
Table of Contents (21 chapters)
1
Part 1: Databricks SQL on the Lakehouse
9
Part 2: Internals of Databricks SQL
13
Part 3: Databricks SQL Commands
16
Part 4: TPC-DS, Experiments, and Frequently Asked Questions

Introduction to Databricks

Databricks is one of the most recognizable names in the big data industry. They are the providers of the lakehouse platform for data analytics and artificial intelligence (AI). This book is about Databricks SQL, a product within the Databricks Lakehouse platform that powers data analytics and business intelligence.

Databricks SQL is a rapidly evolving product. It is not a traditional data warehouse, yet its users are the traditional data warehouse and business intelligence users. It claims to provide all the functionality of data warehouses on what is essentially a data lake. This concept can be a bit jarring. It can create resistance in adoption as you might be wondering if your skills are transferrable, or if your work might be disrupted as a result of a new learning curve.

Hence, I am writing this book.

The primary intent of this book is to help you learn the fundamental concepts of Databricks SQL in a fun, follow-along interactive manner. My aim is that by the time you complete this book, you will be confident in your adoption of Databricks SQL as the enabler of your business intelligence.

This book does not intend to be a definitive guide or a complete reference, nor does it intend to be a replacement for the official documentation. It is too early for either of those. This book is your initiation into business intelligence on the data lakehouse, the Databricks SQL way.

Let’s begin!

In this chapter, we’ll cover the following topics:

  • An overview of Databricks, the company
  • An overview of the Lakehouse architecture
  • An overview of the Databricks Lakehouse platform