The Data Warehouse Toolkit - Third Edition

By: Ralph Kimball, Margy Ross

Overview of this book

The volume of data continues to grow as warehouses are populated with increasingly atomic data and updated with greater frequency. Dimensional modeling has become the most widely accepted approach for presenting information in data warehouse and business intelligence (DW/BI) systems. The goal of this book is to provide a one-stop shop for dimensional modeling techniques. The book is authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence. The book begins with a primer on data warehousing, business intelligence, and dimensional modeling, and you’ll explore more than 75 dimensional modeling techniques and patterns. Next, you’ll study dimension tables in depth in the context of retailing before moving on to inventory. From there, you’ll learn how to apply the techniques to procurement, order management, accounting, customer relationship management, and many other business areas. By the end of this book, you’ll have the essential knowledge, practices, and patterns for designing dimensional models.

Draft Design Exercise Discussion

Now that you’ve reviewed the common dimensional modeling mistakes frequently encountered during design reviews, refer to the draft design in Figure 11-2. Several opportunities for improvement should immediately jump out at you.

The first thing to focus on is the grain of the fact table. The team stated the grain is one row for each bill each month. However, based on your understanding from the source system documentation and data profiling effort, the lowest level of billing data would be one row per service line on a bill. When you point this out, the team initially directs you to the bill dimension, which includes the service line number. However, when reminded that each service line has its own set of billing metrics, the team agrees the more appropriate grain declaration would be one row per service line per bill. The service line key is moved into the fact table as a foreign key to the service line dimension.
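The revised grain can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (billing_fact, service_line_dim, bill_key, billed_amount) and the sample rows are hypothetical stand-ins, since the actual design in Figure 11-2 is not reproduced here; the point is only that declaring the grain as one row per service line per bill makes the combination of bill and service line the fact table's unique key.

```python
import sqlite3

# Hypothetical schema: names and sample values are illustrative assumptions,
# not the design shown in Figure 11-2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE service_line_dim (
    service_line_key    INTEGER PRIMARY KEY,
    service_line_number TEXT NOT NULL
);
CREATE TABLE billing_fact (
    bill_key         INTEGER NOT NULL,
    service_line_key INTEGER NOT NULL
        REFERENCES service_line_dim (service_line_key),
    billed_amount    REAL NOT NULL,
    -- Declared grain: one row per service line per bill.
    PRIMARY KEY (bill_key, service_line_key)
);
""")

conn.executemany(
    "INSERT INTO service_line_dim VALUES (?, ?)",
    [(1, "line-A"), (2, "line-B")],
)
conn.executemany(
    "INSERT INTO billing_fact VALUES (?, ?, ?)",
    [(1001, 1, 42.50), (1001, 2, 17.25), (1002, 1, 40.00)],
)

# Bill-level totals are still available by rolling up the finer grain.
rows_per_bill = conn.execute(
    "SELECT bill_key, COUNT(*), SUM(billed_amount) "
    "FROM billing_fact GROUP BY bill_key ORDER BY bill_key"
).fetchall()

# A second row for the same bill and service line violates the grain.
try:
    conn.execute("INSERT INTO billing_fact VALUES (1001, 1, 5.00)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows_per_bill)
print(duplicate_rejected)
```

In this sketch, bill 1001 has two service lines and therefore two fact rows, each carrying its own metric. A grain of one row per bill would have to either drop those per-line metrics or pre-aggregate them, whereas the finer grain still supports bill-level totals through a simple GROUP BY.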

While discussing the granularity...