The Data Warehouse Toolkit - Third Edition

By: Ralph Kimball, Margy Ross

Overview of this book

The volume of data continues to grow as warehouses are populated with increasingly atomic data and updated with greater frequency. Dimensional modeling has become the most widely accepted approach for presenting information in data warehouse and business intelligence (DW/BI) systems. The goal of this book is to provide a one-stop shop for dimensional modeling techniques. The book is authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence. The book begins with a primer on data warehousing, business intelligence, and dimensional modeling, and goes on to cover more than 75 dimensional modeling techniques and patterns. You’ll then study dimension tables in depth in the context of retailing before moving on to inventory. From there, you’ll see how the techniques apply to procurement, order management, accounting, customer relationship management, and many other business areas. By the end of this book, you’ll have the essential knowledge, practices, and patterns for designing dimensional models.
Table of Contents (31 chapters)
1. Cover
2. Title Page
3. Copyright
4. About the Authors
5. Credits
6. Acknowledgements
29. Index
30. Advertisement
31. End User License Agreement

Dimension Triage to Avoid Too Few Dimensions

Based on the previous business requirements, the grain and dimensionality of the initial model begin to emerge. You can start with a fact table that records the primary balances of every account at the end of each month. Clearly, the grain of the fact table is one row for each account each month. Based on that grain declaration, you can initially envision a design with only two dimensions: month and account. These two foreign keys form the fact table primary key, as shown in Figure 10-2. A data-centric designer might argue that all the other descriptive information, such as household, branch, and product characteristics, should be embedded as descriptive attributes of the account dimension because each account has only one household, branch, and product associated with it.

Figure 10-2: Balance snapshot with too few dimensions.
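
The two-dimension design in Figure 10-2 can be sketched roughly as follows, using Python's built-in sqlite3 module. This is only an illustration of the grain declaration, not the book's schema: the table names, column names, and sample values are all hypothetical, since the text specifies only the grain (one row per account per month) and the two dimensions.

```python
import sqlite3

# All table names, column names, and sample values are hypothetical.
# The source states only the grain and the two dimensions (month, account).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE month_dim (
    month_key  INTEGER PRIMARY KEY,
    month_name TEXT
);
CREATE TABLE account_dim (
    account_key    INTEGER PRIMARY KEY,
    account_number TEXT,
    -- household, branch, and product embedded as account attributes
    household_name TEXT,
    branch_name    TEXT,
    product_name   TEXT
);
CREATE TABLE monthly_balance_fact (
    month_key       INTEGER NOT NULL REFERENCES month_dim(month_key),
    account_key     INTEGER NOT NULL REFERENCES account_dim(account_key),
    primary_balance REAL,
    -- the two foreign keys together form the fact table primary key
    PRIMARY KEY (month_key, account_key)
);
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO month_dim VALUES (202401, 'January 2024')")
conn.execute(
    "INSERT INTO account_dim VALUES "
    "(1, 'ACCT-001', 'Smith household', 'Downtown', 'Checking')"
)
conn.execute("INSERT INTO monthly_balance_fact VALUES (202401, 1, 1250.00)")

# A second row for the same account and month violates the declared grain.
try:
    conn.execute("INSERT INTO monthly_balance_fact VALUES (202401, 1, 999.00)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM monthly_balance_fact"
).fetchone()[0]
print(duplicate_rejected, row_count)
```

The composite primary key is what enforces the grain here. Every household, branch, and product attribute sits in the account dimension, which is the arrangement the figure labels as having too few dimensions.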

Although this schema accurately represents the many-to-one and many-to-many relationships in the snapshot data...