SQL Server Analysis Services 2012 Cube Development Cookbook

Overview of this book

Microsoft SQL Server is a relational database management system. As a database, it is a software product whose primary function is to store and retrieve data as requested by other software applications. SQL Server Analysis Services adds OLAP and data mining capabilities for SQL Server databases. OLAP (online analytical processing) is a technique for analyzing business data for effective business intelligence. This practical guide teaches you how to build business intelligence solutions using Microsoft’s core product, SQL Server Analysis Services. The book covers the traditional multidimensional model, which has been around for over a decade, as well as the tabular model introduced with SQL Server 2012. It starts by comparing the multidimensional and tabular models and discussing the strengths and limitations of each, and then covers the essential techniques for building dimensions and cubes. Following on from this, you will be introduced to more advanced topics, such as designing partitions and aggregations, implementing security, and synchronizing databases for solutions serving many users. The book also covers administrative material, such as database backups, server configuration options, and monitoring and tuning performance. We also provide a primer on the Multidimensional Expressions (MDX) and Data Analysis Expressions (DAX) languages. In short, this book covers both cube development techniques and the ongoing monitoring and tuning of Analysis Services.

A sample scenario for choosing the Snowflake schema


Here's an example of a design decision process that would lead you to a Snowflake dimension. Start by assuming that all the dimensions in the Data Mart (as opposed to the Data Warehouse, where different design rules may apply) will be modeled as Stars.

We start in our first design with a single dimension, Geography, containing the following columns:

  • skGeography (surrogate key)

  • PostalCode (business key)

  • CityID

  • CityName

  • StateID

  • StateName

  • CountryID

  • CountryName

We have one fact source table containing, say, population data with the following columns:

  • CensusDate

  • PostalCode

  • PopulationCount

In ETL, we would join this source table to the dimension table on the business key PostalCode to retrieve the surrogate key and use this to load the data mart fact table:

  • CensusDate

  • skGeography

  • PopulationCount
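The surrogate key lookup described above can be sketched as follows. This is a minimal illustration only, using Python's built-in sqlite3 module in place of SQL Server and an ETL tool. The table names dimGeography, srcPopulation, and factPopulation, as well as every row of sample data, are assumptions made up for the example; only the column names come from the lists above.

```python
import sqlite3

# In-memory database standing in for the data mart (assumption: the real
# target would be SQL Server, loaded by an ETL tool).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimGeography (
    skGeography INTEGER PRIMARY KEY,   -- surrogate key
    PostalCode  TEXT UNIQUE,           -- business key
    CityID INTEGER,    CityName TEXT,
    StateID INTEGER,   StateName TEXT,
    CountryID INTEGER, CountryName TEXT
);
CREATE TABLE srcPopulation  (CensusDate TEXT, PostalCode TEXT, PopulationCount INTEGER);
CREATE TABLE factPopulation (CensusDate TEXT, skGeography INTEGER, PopulationCount INTEGER);
""")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO dimGeography VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "98052", 10, "Redmond", 53, "Washington", 1, "United States"),
        (2, "98101", 11, "Seattle", 53, "Washington", 1, "United States"),
    ],
)
conn.executemany(
    "INSERT INTO srcPopulation VALUES (?, ?, ?)",
    [("2010-04-01", "98052", 21000), ("2010-04-01", "98101", 30000)],
)

# Join the source to the dimension on the business key and keep only the
# surrogate key in the fact table.
conn.execute("""
INSERT INTO factPopulation (CensusDate, skGeography, PopulationCount)
SELECT s.CensusDate, g.skGeography, s.PopulationCount
FROM srcPopulation AS s
JOIN dimGeography  AS g ON g.PostalCode = s.PostalCode
""")

fact_rows = conn.execute(
    "SELECT CensusDate, skGeography, PopulationCount "
    "FROM factPopulation ORDER BY skGeography"
).fetchall()
print(fact_rows)
```

Because PostalCode is unique in the dimension, each source row matches exactly one dimension row, so the fact table receives one row per source row.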

Now, let's introduce a second fact source table containing projected population data, but with a different grain. Let's assume this data arrives at the State grain rather than the Postal Code grain. We'd have a source table with columns such as the following:

  • ProjectionDate

  • StateID

  • ProjectedGrowth

We can't join this new source table to our existing Geography dimension on StateID because, if we do so, we will get back many surrogate keys for each source row, one for every postal code within the specified state. So, we need to Snowflake (partially normalize) the Geography dimension so that it will support the grain of each of our fact source tables, giving us two dimension tables similar to the following two bullet lists:

dimGeography:

  • skGeography

  • PostalCode

  • CityID

  • CityName

  • skGeographyState

and dimGeographyState:

  • skGeographyState

  • StateID

  • StateName

  • CountryID

  • CountryName

Notice that we did not fully normalize the dimension (postal code and city both exist in the first table, state and country in the second). We just normalized the dimension enough to give us a single relationship between each of our two facts and this dimension.
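To make the grain problem concrete, here is a small sketch, again using Python's sqlite3 module rather than SQL Server. It first joins a state-grain source to a postal-code-grain dimension, then joins the same source to a state-level dimension table. The table names starGeography and srcProjection, the trimmed-down column lists, and all sample rows are assumptions for illustration; dimGeographyState follows the column list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original star-style dimension, trimmed to the columns needed here.
CREATE TABLE starGeography (
    skGeography INTEGER PRIMARY KEY,
    PostalCode  TEXT UNIQUE,
    StateID     INTEGER,
    StateName   TEXT
);
-- State-level table produced by snowflaking the dimension.
CREATE TABLE dimGeographyState (
    skGeographyState INTEGER PRIMARY KEY,
    StateID     INTEGER UNIQUE,
    StateName   TEXT,
    CountryID   INTEGER,
    CountryName TEXT
);
CREATE TABLE srcProjection (ProjectionDate TEXT, StateID INTEGER, ProjectedGrowth REAL);
""")

# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO starGeography VALUES (?, ?, ?, ?)", [
    (1, "98052", 53, "Washington"),
    (2, "98101", 53, "Washington"),
    (3, "97201", 41, "Oregon"),
])
conn.executemany("INSERT INTO dimGeographyState VALUES (?, ?, ?, ?, ?)", [
    (1, 53, "Washington", 1, "United States"),
    (2, 41, "Oregon", 1, "United States"),
])
conn.executemany("INSERT INTO srcProjection VALUES (?, ?, ?)", [
    ("2015-01-01", 53, 0.05),
    ("2015-01-01", 41, 0.03),
])

# Star design: joining on StateID returns one surrogate key per postal code
# in the state, so a single source row can match several dimension rows.
star_matches = conn.execute("""
SELECT s.StateID, COUNT(*) AS MatchingKeys
FROM srcProjection AS s
JOIN starGeography AS g ON g.StateID = s.StateID
GROUP BY s.StateID
ORDER BY s.StateID
""").fetchall()

# Snowflake design: StateID is unique in dimGeographyState, so every source
# row resolves to exactly one surrogate key.
snowflake_rows = conn.execute("""
SELECT s.ProjectionDate, d.skGeographyState, s.ProjectedGrowth
FROM srcProjection AS s
JOIN dimGeographyState AS d ON d.StateID = s.StateID
ORDER BY d.skGeographyState
""").fetchall()

print(star_matches)
print(snowflake_rows)
```

In this sample, the Washington projection row matches two surrogate keys in the star-style table but exactly one in dimGeographyState, which is why the state-grain fact can be loaded against the snowflaked table without duplicating rows.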