MDX with SSAS 2012 Cookbook

MDX with SSAS 2012 Cookbook - Second Edition

Overview of this book

MDX is the BI industry standard for multidimensional calculations and queries. Proficiency with this language is essential for the realization of your Analysis Services' full potential. MDX is an elegant and powerful language, and also has a steep learning curve.

SQL Server 2012 Analysis Services has introduced a new BISM tabular model and a new formula language, Data Analysis Expressions (DAX). However, for the multi-dimensional model, MDX is still the only query and expression language. For many product developers and report developers, MDX is the preferred language for both the tabular model and multi-dimensional model. MDX with SSAS 2012 Cookbook is a must-have book for anyone who wants to be proficient in the MDX language and to enhance their business intelligence solutions.

MDX with SSAS 2012 Cookbook is packed with immediately usable, practical solutions. It starts with elementary techniques that lay the foundation for designing advanced MDX calculations and queries. The discussions after each solution will provide you with a solid foundation and best practices. It covers a broad range of real-world topics and solutions and provides you with learning materials to become proficient in the language.

This book will guide you through the hands-on and practical MDX solutions, best practices, and many intricacies that hide within the MDX calculations and queries. We will start by working with sets, creating time-aware, context-aware calculations, and business analytics solutions, through to the techniques of enhancing the cube design when MDX is not enough. We will then move on to capturing MDX generated by SSAS front-ends and using SSAS stored procedures, and we will explore the whole range of MDX solutions for real-world BI projects.

Optimizing MDX queries using the NonEmpty() function

The NonEmpty() function is a very powerful MDX function. It is primarily used to improve query performance by reducing sets before the result is returned.

Both Customer and Date dimensions are relatively large in the Adventure Works DW 2012 database. Putting the cross product of these two dimensions on the query axis can take a long time. In this recipe, we'll show how the NonEmpty() function can be used on the Customer and Date dimensions to improve the query performance.

Getting ready

Start a new query in SSMS and make sure that you're working on the Adventure Works DW 2012 database. Then write the following query and execute it:

SELECT 
    { [Measures].[Internet Sales Amount] } ON 0,
    NON EMPTY
    Filter(
            { [Customer].[Customer].[Customer].MEMBERS } *
            { [Date].[Date].[Date].MEMBERS },
            [Measures].[Internet Sales Amount] > 1000
           ) ON 1
FROM
   [Adventure Works]

The query shows the sales per customer and dates of their purchases, and isolates only those combinations where the purchase was over 1000 USD.

On a typical server, the query takes more than a minute to return its results.

Now let's see how to improve the execution time by using the NonEmpty() function.

How to do it…

Follow these steps to improve the query performance by adding the NonEmpty() function:

Wrap NonEmpty() around the cross join of customers and dates so that it becomes the first argument of that function.
Use the measure on columns as the second argument of that function.

This is what the MDX query should look like:

SELECT 
    { [Measures].[Internet Sales Amount] } ON 0,
    NON EMPTY
    Filter(
            NonEmpty(
                      { [Customer].[Customer].[Customer].MEMBERS } *
                      { [Date].[Date].[Date].MEMBERS },
                      { [Measures].[Internet Sales Amount] }
                    ),
            [Measures].[Internet Sales Amount] > 1000
           ) ON 1
FROM
   [Adventure Works]

Execute that query and observe the results as well as the time required for execution. The query returned the same results, only much faster, right?

How it works…

Both the Customer and Date dimensions are medium-sized dimensions. The cross product of these two dimensions contains several million combinations. We know that typically, the cube space is sparse; therefore, many of these combinations are indeed empty. The Filter() operation is not optimized to work in block mode, which means a lot of calculations will have to be performed by the engine to evaluate the set on rows, whether the combinations are empty or not.

Fortunately, the NonEmpty() function exists. This function can be used to reduce any set, especially multidimensional sets that are the result of a cross join operation. It removes the empty combinations of the two sets before the engine starts to evaluate the sets on rows. A reduced set has fewer cells to be calculated, and therefore the query runs much faster.
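
To get a feel for how sparse this space is, the following sketch compares the size of the full cross product with the number of combinations that survive NonEmpty(). The two calculated measure names are hypothetical and introduced only for this illustration; the query assumes the same Adventure Works cube used throughout the recipe.

```mdx
WITH
-- Size of the full cross product, computed from the two member counts
-- so that the cross join itself does not have to be materialized
MEMBER [Measures].[All Combinations] AS
    Count( [Customer].[Customer].[Customer].MEMBERS ) *
    Count( [Date].[Date].[Date].MEMBERS )
-- Number of customer/date combinations that have Internet sales
MEMBER [Measures].[Non-empty Combinations] AS
    Count(
        NonEmpty(
            { [Customer].[Customer].[Customer].MEMBERS } *
            { [Date].[Date].[Date].MEMBERS },
            { [Measures].[Internet Sales Amount] }
        )
    )
SELECT
    { [Measures].[All Combinations],
      [Measures].[Non-empty Combinations] } ON 0
FROM
    [Adventure Works]
```

The second number should be a small fraction of the first, and that fraction is all that Filter() has to iterate over once NonEmpty() has been applied.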

There's more…

Regardless of the benefits that were shown in this recipe, NonEmpty() should be used with caution. Here are some good practices regarding the NonEmpty() function:

Use it with sets, such as named sets and axes.
Use it in the functions which are not optimized to work in block mode, such as with the Filter() function.
Avoid using it in aggregate functions such as Sum().
Avoid using it in other MDX set functions that are optimized to work in block mode. The use of NonEmpty() inside optimized functions will prevent them from evaluating the set in block mode. This is because the set will not be compact once it passes the NonEmpty() function. The function will break it into many small non-empty chunks, and each of these chunks will have to be evaluated separately. This will inevitably increase the duration of the query. In such cases, it is better to leave the original set intact, no matter its size. The engine will know how to run over it in optimized mode.
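
As a rough illustration of the advice about aggregate functions, the sketch below defines the same total in two ways. The measure names are hypothetical. Both should return the same values, because empty cells contribute nothing to a sum; the difference lies only in how the engine is able to evaluate them.

```mdx
WITH
-- Preferred: Sum() runs over the intact set, so the engine
-- is free to evaluate it in block mode
MEMBER [Measures].[Sales Intact Set] AS
    Sum(
        [Customer].[Customer].[Customer].MEMBERS,
        [Measures].[Internet Sales Amount]
    )
-- Discouraged: NonEmpty() breaks the set into non-empty chunks
-- before Sum() sees it, which may prevent block-mode evaluation
MEMBER [Measures].[Sales Reduced Set] AS
    Sum(
        NonEmpty(
            [Customer].[Customer].[Customer].MEMBERS,
            { [Measures].[Internet Sales Amount] }
        ),
        [Measures].[Internet Sales Amount]
    )
SELECT
    { [Measures].[Sales Intact Set],
      [Measures].[Sales Reduced Set] } ON 0,
    { [Date].[Calendar Year].[Calendar Year].MEMBERS } ON 1
FROM
    [Adventure Works]
```

Comparing the execution times of the two measures separately, on a cleared cache, is one way to check whether the advice holds on your own server.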

NonEmpty() versus NON EMPTY

Both the NonEmpty() function and the NON EMPTY keyword can reduce sets, but they do it in different ways.

The NON EMPTY keyword removes empty rows, columns, or both, depending on the axis on which it is used in the query. To do so, the NON EMPTY operator tries to push the evaluation of cells to an early stage whenever possible. That way, the set on the axis is already reduced and the final result is returned faster.

Take a look at the initial query in this recipe, remove the Filter() function, run the query, and notice how quickly the results come, although the multidimensional set again counts millions of tuples. The trick is that the NON EMPTY operator uses the set on the opposite axis, the columns, to reduce the set on rows. Therefore, it can be said that NON EMPTY is highly dependent on members on axes and their values in columns and rows.
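
For reference, this is what the initial query looks like once the Filter() function is removed and only the NON EMPTY keyword is left on rows:

```mdx
SELECT
    { [Measures].[Internet Sales Amount] } ON 0,
    -- NON EMPTY uses the measure on columns to drop empty rows
    NON EMPTY
    { [Customer].[Customer].[Customer].MEMBERS } *
    { [Date].[Date].[Date].MEMBERS } ON 1
FROM
    [Adventure Works]
```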

Contrary to the NON EMPTY operator found only on axes, the NonEmpty() function can be used anywhere in the query.

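The NonEmpty() function removes from its first set every member (or tuple) that is empty for all members of the second set, which typically contains one or more measures. If the second set is omitted, the function is evaluated in the context of the current coordinate, that is, the current measure and the current members of the attribute hierarchies.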

In other words, the NonEmpty() function is highly dependent on members in the second set, the slicer, or the current coordinate, in general.
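
The following sketch tries to make that difference visible. The set on rows is reduced by one measure in the second argument of NonEmpty(), while a different measure is placed on columns. It assumes the [Product] dimension and the [Reseller Sales Amount] measure of the Adventure Works cube.

```mdx
SELECT
    -- The measure on columns plays no part in reducing the rows here
    { [Measures].[Reseller Sales Amount] } ON 0,
    -- Rows are reduced only by the measure in the second set:
    -- products that have Internet sales
    NonEmpty(
        { [Product].[Product].[Product].MEMBERS },
        { [Measures].[Internet Sales Amount] }
    ) ON 1
FROM
    [Adventure Works]
```

Any product that was sold over the Internet stays on rows even if its Reseller Sales Amount cell is empty. Adding NON EMPTY in front of the set on rows would additionally remove those rows, because NON EMPTY looks at the cells produced by the opposite axis.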

Common mistakes and useful tips

If the second set in the NonEmpty() function is not provided, the expression is evaluated in the context of the current measure and the current members of the attribute hierarchies, both taken at the moment of evaluation. In other words, if you're defining a calculated measure and you forget to include a measure in the second set, the expression is evaluated for that same calculated measure, which leads to null, the default initial value of every measure. If you're simply evaluating the set on an axis, it will be evaluated in the context of the current measure: the default measure in the cube or the one provided in the slicer. Again, this is perhaps not what you expected. To prevent these problems, always include a measure in the second set.
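
A minimal sketch of that advice, using a hypothetical calculated measure named [Buying Customers], could look like this. The measure in the second set is stated explicitly, so the result does not depend on whichever measure happens to be current when the expression is evaluated.

```mdx
WITH
MEMBER [Measures].[Buying Customers] AS
    Count(
        NonEmpty(
            [Customer].[Customer].[Customer].MEMBERS,
            -- Explicit measure: without this second set, NonEmpty() would be
            -- evaluated against the current measure, that is, this
            -- calculated measure itself
            { [Measures].[Internet Sales Amount] }
        )
    )
SELECT
    { [Measures].[Internet Sales Amount],
      [Measures].[Buying Customers] } ON 0,
    { [Date].[Calendar Year].[Calendar Year].MEMBERS } ON 1
FROM
    [Adventure Works]
```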

NonEmpty() reduces sets, just as a few other constructs do, namely the Filter() function and the EXISTING keyword. What is special about NonEmpty() is that it reduces sets extremely efficiently and quickly. Because of that, there are some rules about where to position NonEmpty() in calculations composed of nested MDX functions (one function wrapping another). If we're trying to detect multi-select, that is, multiple members in the slicer, NonEmpty() should go inside, with the EXISTING keyword outside. The reason is that although both shrink sets efficiently, NonEmpty() works best when the set is intact, whereas EXISTING is not affected by the order of members or the compactness of the set. Therefore, NonEmpty() should be applied first.
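
The composition order described above can be sketched as follows. The calculated measure name is hypothetical, and the two countries in the slicer are only an example of a multi-select; the member keys assume the standard Adventure Works [Customer].[Country] attribute.

```mdx
WITH
MEMBER [Measures].[Customers In Selection] AS
    Count(
        -- EXISTING goes outside and restricts the set to the slicer context
        EXISTING
        -- NonEmpty() goes inside and works on the intact set of customers
        NonEmpty(
            [Customer].[Customer].[Customer].MEMBERS,
            { [Measures].[Internet Sales Amount] }
        )
    )
SELECT
    { [Measures].[Customers In Selection] } ON 0
FROM
    [Adventure Works]
WHERE
    ( { [Customer].[Country].&[Australia],
        [Customer].[Country].&[Canada] } )
```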

You may get System.OutOfMemory errors if you use the CrossJoin() operation on many large hierarchies because the cross join generates a Cartesian product of those hierarchies. In that case, consider using NonEmpty() to reduce the space to a smaller subcube. Also, don't forget to group the hierarchies by their dimension inside the cross join.
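
As an illustration of both tips, the sketch below cross joins four attribute hierarchies, keeps the hierarchies of the same dimension next to each other, and wraps the whole cross join in NonEmpty(). The choice of hierarchies is arbitrary and assumes the Adventure Works cube.

```mdx
SELECT
    { [Measures].[Internet Sales Amount] } ON 0,
    NonEmpty(
        -- Hierarchies of the Customer dimension are kept together...
        [Customer].[Country].[Country].MEMBERS *
        [Customer].[Gender].[Gender].MEMBERS *
        -- ...followed by the hierarchies of the Product dimension
        [Product].[Category].[Category].MEMBERS *
        [Product].[Color].[Color].MEMBERS,
        { [Measures].[Internet Sales Amount] }
    ) ON 1
FROM
    [Adventure Works]
```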
