Book Image

Data Modeling with Snowflake

By : Serge Gershkovich
5 (2)
Book Image

Data Modeling with Snowflake

5 (2)
By: Serge Gershkovich

Overview of this book

The Snowflake Data Cloud is one of the fastest-growing platforms for data warehousing and application workloads. Snowflake's scalable, cloud-native architecture and expansive set of features and objects enables you to deliver data solutions quicker than ever before. Yet, we must ensure that these solutions are developed using recommended design patterns and accompanied by documentation that’s easily accessible to everyone in the organization. This book will help you get familiar with simple and practical data modeling frameworks that accelerate agile design and evolve with the project from concept to code. These universal principles have helped guide database design for decades, and this book pairs them with unique Snowflake-native objects and examples like never before – giving you a two-for-one crash course in theory as well as direct application. By the end of this Snowflake book, you’ll have learned how to leverage Snowflake’s innovative features, such as time travel, zero-copy cloning, and change-data-capture, to create cost-effective, efficient designs through time-tested modeling principles that are easily digestible when coupled with real-world examples.
Table of Contents (24 chapters)
1
Part 1: Core Concepts in Data Modeling and Snowflake Architecture
8
Part 2: Applied Modeling from Idea to Deployment
14
Part 3: Solving Real-World Problems with Transformational Modeling

Leveraging the modeling toolkit

Before we continue, we need to formally delineate three distinct concepts often used together in the service of modeling to make it simpler to refer to a specific tool in the modeling toolkit in later sections. By understanding where each piece fits in the broader domain of database design and management, diving into deeper technical concepts later in the book will become more meaningful and easier to digest.

The three components are listed here:

  • Natural language semantics—words
  • Technical semantics—SQL
  • Visual semantics—diagrams

Let’s discuss each of these in detail, as follows:

  • Natural language semantics: Terminology employed in communicating details of a model between people. These are agreed-upon words that employ pre-defined conventions to encapsulate more complex concepts in simpler terms. For example, when both parties involved in a verbal exchange understand the concept of a surrogate key, it saves them from having to explain that it is a unique identifier for a table record that holds no intrinsic business meaning, such as an integer sequence or a hash value.

To ensure effective technical conversations, it helps to be fluent in the semantics of modeling. Not only does it save time by succinctly communicating a complex concept, but it also saves even more time by not miscommunicating it. A waiter would return different foods when ordering chips in London rather than in Los Angeles. A properly modeled database would never return different records for the same surrogate key.

  • Technical semantics: SQL is a domain-specific language used to manage data in a Relational Database Management System (RDBMS). Unlike a general-purpose language (for example, YAML or Python), domain-specific languages have a much smaller application but offer much richer nuance and precision. While it can’t format a website or send an email, SQL allows us to create the structure of our database and manipulate its contents.

SQL bridges modeling concepts (expressed in words or images) and what is physically defined in the database. Snowflake uses an American National Standards Institute (ANSI)-compliant SQL syntax, meaning its basic commands (such as SELECT, UPDATE, DELETE, INSERT, and WHERE) are compatible with other database vendors who use this standard. Snowflake also offers many extra functions, clauses, and conventions that go beyond ANSI-standard SQL and give users added flexibility to manage the database.

Unfortunately, due to its domain-specific nature, SQL presents a significant limitation: it can only express what the database explicitly understands. While SQL can define table structure and precisely manipulate data, it is too detailed to easily articulate the underlying business requirements.

  • Visual semantics: Through their simplicity, images can convey a density of information that other forms of language simply cannot. In modeling, diagrams combine the domain-specific precision of SQL with the nuance of natural language. This gives diagrams a lot to work with to capture a data model’s business meaning and technical specifics.

To start, diagrams vary in the level of detail they present—giving the observer exactly what they’re looking for without overwhelming (or underwhelming) them with information. Most importantly, the semantic conventions used in diagrams are universal and can be understood by people besides data analysts and engineers. Yes—modeling diagrams are considered technical drawings; they represent strict technical concepts through agreed-upon visual conventions. However, in their simplest form, models can be understood almost intuitively with no prior knowledge. Even at the more advanced levels, such as logical and physical, learning to read a model is much simpler than learning SQL.

When all these semantics come together and are understood by the entire organization, they form a ubiquitous language, a concept first described by Eric Evans in Domain-Driven Design. Modeling then forms a part of the vocabulary that is understood universally throughout the organization to describe its business and store the data assets that support it. But that is just one of the many benefits that modeling provides.