Business Intelligence with Databricks SQL

By: Vihag Gupta

Overview of this book

In this new era of data platform system design, data lakes and data warehouses are giving way to the lakehouse – a new type of data platform system that aims to unify all data analytics into a single platform. Databricks, with its Databricks SQL product suite, is the hottest lakehouse platform out there, harnessing the power of Apache Spark™, Delta Lake, and other innovations to enable data warehousing capabilities on the lakehouse with data lake economics. This book is a comprehensive hands-on guide that helps you explore all the advanced features, use cases, and technology components of Databricks SQL. You’ll start with the lakehouse architecture fundamentals and understand how Databricks SQL fits into it. The book then shows you how to use the platform: exploring data, executing queries, building reports, and using dashboards. It goes on to cover the administrative aspects of the lakehouse, namely data security, governance, and management of its computational power. You’ll also delve into the core technology enablers of Databricks SQL – Delta Lake and Photon. Finally, you’ll get hands-on with advanced SQL commands for ingesting data and maintaining the lakehouse. By the end of this book, you’ll have mastered Databricks SQL and be able to deploy and deliver fast, scalable business intelligence on the lakehouse.
Table of Contents (21 chapters)
1
Part 1: Databricks SQL on the Lakehouse
9
Part 2: Internals of Databricks SQL
13
Part 3: Databricks SQL Commands
16
Part 4: TPC-DS, Experiments, and Frequently Asked Questions

Index

As this ebook edition doesn't have fixed pagination, the page numbers below are hyperlinked for reference only, based on the printed edition of this book.

Symbols

10-Query Rule 159

A

access
    denying 83
    revoking 82

access control
    with Apache Hive™ Metastore 65, 66
    with Unity Catalog 66

access control lists (ACLs) 65

access control, SQL Warehouse
    about 170
    chargeback 173-175
    operations 171
    privileges 171
    programming access control 172, 173

Active Directory application
    creating 93

ADLS Gen2
    storage location, creating 94, 95

aggregate operator 234

Alerts page 24, 25

Amazon Web Services (AWS)
    about 12, 97
    cloud storage access 97

ANALYZE command
    about 286
    FOR COLUMN clause 286
    NOSCAN clause 286
    PARTITION clause 286

Apache Hive™ Metastore
    about 34
    access control 65, 66

Apache Spark 153

Apache Spark™ execution model
    about 223, 227...