Data Modeling with Snowflake

By: Serge Gershkovich

Overview of this book

The Snowflake Data Cloud is one of the fastest-growing platforms for data warehousing and application workloads. Snowflake's scalable, cloud-native architecture and expansive set of features and objects enable you to deliver data solutions more quickly than ever before. Yet, we must ensure that these solutions are developed using recommended design patterns and accompanied by documentation that’s easily accessible to everyone in the organization. This book will help you get familiar with simple and practical data modeling frameworks that accelerate agile design and evolve with the project from concept to code. These universal principles have helped guide database design for decades, and this book pairs them with unique Snowflake-native objects and examples, giving you a two-for-one crash course in theory as well as direct application. By the end of this Snowflake book, you’ll have learned how to leverage Snowflake’s innovative features, such as time travel, zero-copy cloning, and change-data-capture, to create cost-effective, efficient designs through time-tested modeling principles that are easily digestible when coupled with real-world examples.

Operational and analytical modeling scenarios

The relational database as we know it today emerged in the 1970s—allowing organizations to store their data in a centralized repository instead of on individual tapes. Later that decade, Online Transaction Processing (OLTP) emerged, enabling faster access to data and unlocking new uses for databases such as booking and bank teller systems. This was a paradigm shift for databases, which evolved from data archives to operational systems.

Due to limited resources, data analysis could not be performed on the same database that ran operational processes. The need to analyze operational data gave rise, in the 1980s, to Management Information Systems (MIS), or Decision Support Systems (DSS) as they later became known. Data would be extracted from the operational database to the DSS, where it could be analyzed according to business needs. OLTP architecture is not well suited to this kind of analysis, so Online Analytical Processing (OLAP) emerged to enable users to analyze multidimensional data from multiple perspectives using complex queries. This is the same paradigm used today by modern data platforms such as Snowflake.

The approach to storing and managing data in OLAP systems fundamentally differs from the operational or transactional database. Data in OLAP systems is generally stored in data warehouses (also known as DWs or DWHs)—centralized repositories that store structured data from various sources for the purpose of analysis and decision-making. While the transactional system keeps the up-to-date version of the truth and is generally concerned with individual records, the data warehouse snapshots many historical versions and aggregates volumes of data to satisfy various analytical needs.
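The contrast between a current-state transactional table and a historized warehouse table can be sketched in a few lines. This is a minimal illustration, not the book's own example: it uses Python's built-in sqlite3 module as a stand-in for a real database, and the `account` and `account_snapshot` tables, their columns, and the `take_snapshot` helper are all hypothetical names chosen for the sketch.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Transactional side: one row per account, always the current balance.
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        balance    NUMERIC NOT NULL
    );
    -- Warehouse side: one row per account per snapshot date.
    CREATE TABLE account_snapshot (
        snapshot_date TEXT    NOT NULL,
        account_id    INTEGER NOT NULL,
        balance       NUMERIC NOT NULL,
        PRIMARY KEY (snapshot_date, account_id)
    );
    INSERT INTO account VALUES (1, 100);
""")


def take_snapshot(snapshot_date):
    """Copy the current operational state into the warehouse history."""
    conn.execute(
        "INSERT INTO account_snapshot "
        "SELECT ?, account_id, balance FROM account",
        (snapshot_date,),
    )


take_snapshot("2023-01-01")
# A withdrawal overwrites the operational row in place...
conn.execute("UPDATE account SET balance = balance - 30 WHERE account_id = 1")
# ...while the warehouse keeps both the old and the new version.
take_snapshot("2023-01-02")

current = conn.execute("SELECT balance FROM account").fetchall()
history = conn.execute(
    "SELECT snapshot_date, balance FROM account_snapshot ORDER BY snapshot_date"
).fetchall()
print(current)
print(history)
```

After the update, the transactional table holds only the latest balance, while the snapshot table retains one row per date, which is what makes historical analysis possible.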

Data originates in the transactional database when daily business operations (for example, bookings, sales, withdrawals) are recorded. In contrast, the warehouse does not create but rather loads extracted information from one or various source systems. The functional differences between transactional databases and warehouses present different modeling challenges.

A transactional system must be modeled to fit the nature of the data it is expected to process. This means knowing the format, relationships, and attributes required for a transaction.

The main concern of a transactional database model is the structure and relationships between its tables.
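As a rough sketch of what "structure and relationships" means in practice, the following models a booking transaction with a primary key, a foreign key, and mandatory columns. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine, and the `customer` and `booking` tables and the `try_insert` helper are hypothetical. Constraint enforcement varies by platform, so the rejections shown here reflect SQLite's behavior rather than any particular warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE
    );
    -- Every booking must belong to an existing customer.
    CREATE TABLE booking (
        booking_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        booked_on   TEXT    NOT NULL,
        amount      NUMERIC NOT NULL
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'ana@example.com')")


def try_insert(sql):
    """Run an INSERT and report whether the model accepted the row."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


valid = try_insert("INSERT INTO booking VALUES (10, 1, '2023-01-01', 250)")
orphan = try_insert("INSERT INTO booking VALUES (11, 99, '2023-01-01', 80)")
missing = try_insert("INSERT INTO booking VALUES (12, 1, '2023-01-02', NULL)")
print(valid, orphan, missing)
```

The model itself, rather than application code, decides what a valid transaction looks like: a booking without a known customer or without an amount never reaches the table.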

By contrast, the data warehouse loads existing data from the source system. A data warehouse isn’t concerned with defining a single transaction but with analyzing multitudes of transactions across various dimensions to answer business questions. To do this, a data warehouse must transform the source data to satisfy multiple business analyses, which often means creating copies with varying granularity and detail.

Modeling in a data warehouse builds upon the relational models of its source systems by conforming common elements and transforming the data using logic.
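To make the idea of transforming source data into copies of varying granularity concrete, here is a minimal sketch under stated assumptions: SQLite (through Python's sqlite3 module) stands in for the warehouse, and the `src_sales` source table and the two derived tables are hypothetical names with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Rows as they would arrive from a (hypothetical) source system.
    CREATE TABLE src_sales (
        sale_id INTEGER PRIMARY KEY,
        sold_on TEXT    NOT NULL,
        region  TEXT    NOT NULL,
        amount  NUMERIC NOT NULL
    );
    INSERT INTO src_sales VALUES
        (1, '2023-01-01', 'EMEA', 100),
        (2, '2023-01-01', 'EMEA', 50),
        (3, '2023-01-01', 'APAC', 70),
        (4, '2023-01-02', 'EMEA', 30);

    -- First transformation: one row per day and region.
    CREATE TABLE sales_by_day_region AS
    SELECT sold_on, region,
           COUNT(*)    AS sale_count,
           SUM(amount) AS total_amount
    FROM src_sales
    GROUP BY sold_on, region;

    -- Second transformation: a coarser copy, one row per region.
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(total_amount) AS total_amount
    FROM sales_by_day_region
    GROUP BY region;
""")

by_day_region = conn.execute(
    "SELECT sold_on, region, sale_count, total_amount "
    "FROM sales_by_day_region ORDER BY sold_on, region"
).fetchall()
by_region = conn.execute(
    "SELECT region, total_amount FROM sales_by_region ORDER BY region"
).fetchall()
print(by_day_region)
print(by_region)
```

Each derived table answers a different business question from the same source rows, and the SQL that builds it is as much a part of the warehouse model as the table definition.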

Wait—if transformational logic is a core concept in data warehouse modeling, why is it so consistently absent from modeling discussions? Because in order to do transformational modeling justice, one must forgo the universality of general modeling principles and venture into the realm of platform specifics (that is, syntax, storage, and memory utilization). This book, in contrast, will embrace Snowflake specifics and go beyond physical modeling by diving into the transformation logic behind the physical tables. This approach provides a fuller understanding of the underlying modeling concepts and equips the reader with the required SQL recipes to not only build models but to load and automate them in the most efficient way possible. As we’ll see in later chapters, this is where Snowflake truly shines and confers performance and cost-saving benefits.

Is Snowflake limited to OLAP?

Snowflake’s primary use case is that of a data warehouse, with all the OLAP properties needed for multidimensional analysis at scale over massive datasets. However, at the 2022 Snowflake Summit, the company announced Unistore, a workload built on a new table type called hybrid tables, which combines OLTP-style storage and OLAP-style analysis under one semantic object. This announcement means Snowflake users can now design transactional OLTP database schemas while leveraging the analytical performance that Snowflake is known for. Hybrid tables are discussed in more detail in later chapters.

Although OLAP and OLTP systems are optimized for different kinds of database operations, they are still databases at heart and operate on the same set of objects (such as tables, constraints, and views) using SQL. However, each use case requires very different approaches to modeling the data within. The following section demonstrates what modeling will typically look like in each scenario.
