The Data Warehouse Toolkit - Third Edition

By: Ralph Kimball, Margy Ross

Overview of this book

The volume of data continues to grow as warehouses are populated with increasingly atomic data and updated with greater frequency. Dimensional modeling has become the most widely accepted approach for presenting information in data warehouse and business intelligence (DW/BI) systems. The goal of this book is to provide a one-stop shop for dimensional modeling techniques. The book is authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence. The book begins with a primer on data warehousing, business intelligence, and dimensional modeling, and you’ll explore more than 75 dimensional modeling techniques and patterns. Then you’ll examine dimension tables in depth as you work through retailing, and move on to inventory. Moving ahead, you’ll learn how to apply these techniques to procurement, order management, accounting, customer relationship management, and many other business areas. By the end of this book, you’ll have the essential knowledge, practices, and patterns for designing dimensional models.

Develop the ETL Plan

ETL development starts with a high-level plan that is independent of any specific technology or approach. However, it’s a good idea to decide on an ETL tool before doing any detailed planning; this can avoid redesign and rework later in the process.

Step 1: Draw the High-Level Plan

We start the design process with a very simple schematic of the known pieces of the plan: sources and targets, as shown in Figure 20-1. This schematic is for a fictitious utility company’s data warehouse, which is primarily sourced from a 30-year-old COBOL system. If most or all of the data comes from a modern relational transaction processing system, the boxes often represent a logical grouping of tables in the transaction system model.


Figure 20-1: Example high-level data staging plan schematic.
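The same source-to-target view can also be kept as a small, tool-independent data structure alongside the drawing. The following is only a minimal sketch in Python, not the book's method: the `Flow` class, the helper functions, and every source, target, and question name are hypothetical and are not taken from Figure 20-1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Flow:
    """One arrow in a high-level plan: a source feeding a target."""
    source: str
    target: str
    # Unresolved design questions attached to this arrow (hypothetical).
    open_questions: List[str] = field(default_factory=list)


def sources_by_target(flows: List[Flow]) -> Dict[str, List[str]]:
    """Group source names under the target they feed, in plan order."""
    plan: Dict[str, List[str]] = {}
    for flow in flows:
        plan.setdefault(flow.target, []).append(flow.source)
    return plan


def unresolved(flows: List[Flow]) -> List[Tuple[str, str, str]]:
    """List every open question together with the arrow it belongs to."""
    return [(f.source, f.target, q) for f in flows for q in f.open_questions]


# Hypothetical plan for illustration only; all names are invented.
PLAN = [
    Flow("customer_master_file", "customer_dim"),
    Flow("meter_readings_file", "usage_fact",
         ["How are late-arriving readings handled?"]),
    Flow("customer_master_file", "usage_fact"),
]

if __name__ == "__main__":
    for target, sources in sources_by_target(PLAN).items():
        print(f"{target} <- {', '.join(sources)}")
    for source, target, question in unresolved(PLAN):
        print(f"OPEN [{source} -> {target}]: {question}")
```

Keeping the open questions next to the arrows they belong to makes it easy to list what still needs an answer before detailed planning begins.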

As you develop the detailed ETL system specification, the high-level view requires additional details. Figure 20-1 deliberately highlights contemporary questions and unresolved...