SQL Server Analysis Services 2012 Cube Development Cookbook

Overview of this book

Microsoft SQL Server is a relational database management system: a software product whose primary function is to store and retrieve data as requested by other applications. SQL Server Analysis Services adds OLAP (online analytical processing) and data mining capabilities to SQL Server; OLAP is a technique for analyzing business data for effective business intelligence. This practical guide teaches you how to build business intelligence solutions using Microsoft's core product, SQL Server Analysis Services. The book covers the traditional multidimensional model, which has been around for over a decade, as well as the tabular model introduced with SQL Server 2012. Starting with a comparison of the multidimensional and tabular models, discussing the value and limitations of each, you will then cover the essential techniques for building dimensions and cubes. Following on from this, you will be introduced to more advanced topics, such as designing partitions and aggregations, implementing security, and synchronizing databases for solutions serving many users. The book also covers administrative material, such as database backups, server configuration options, and monitoring and tuning performance. We also provide a primer on the Multidimensional Expressions (MDX) and Data Analysis Expressions (DAX) languages. This book provides you with data cube development techniques, along with the ongoing monitoring and tuning of Analysis Services.

Working with non-SQL Server data sources


As discussed throughout this book, you normally build cubes on top of a SQL Server relational database. SQL Server DBAs and cube developers don't always share responsibilities, although some responsibilities may overlap. Generally, cube developers have database owner permissions on the relational source, which allows them to define the data structures needed for dimensions and partitions. In the real world, however, these assumptions do not always hold. An Analysis Services solution can be developed on top of any relational data source you can connect to using a .NET or OLE DB provider. The data warehouse might be owned by a different team than the one responsible for developing Analysis Services objects, and the database administrators might be willing to grant only read access to the tables, with no permission to create additional objects.

In the easiest scenario, you can use SQL Server Integration Services (SSIS) to import data from a non-SQL Server relational source into a SQL Server database and use that database as the staging data repository. Once the data is in SQL Server, you can use either stored procedures or SSIS to load it into fact and dimension tables as needed.
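For example, once SSIS has landed the source data in a staging schema, a stored procedure can perform the surrogate-key lookups and load the fact table. The following T-SQL is a minimal sketch of that pattern; all table, column, and parameter names here are hypothetical, not part of any particular solution:

```sql
-- Hypothetical staging-to-fact load; names are illustrative only.
INSERT INTO dbo.FactSales (OrderDateKey, ProductKey, SalesAmount)
SELECT d.DateKey,
       p.ProductKey,
       s.SalesAmount
FROM   staging.Sales AS s
JOIN   dbo.DimDate    AS d ON d.FullDate    = s.OrderDate
JOIN   dbo.DimProduct AS p ON p.ProductCode = s.ProductCode
WHERE  s.OrderDate >= @LoadStartDate;  -- load only the new slice
```

Keeping the load incremental (the `@LoadStartDate` filter) also keeps the subsequent partition processing fast, since only new partitions need to be processed.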

In a more complicated scenario, the company might not allow you to use SQL Server even for a relational data warehouse or a staging area. The reasons for this could vary but aren't really relevant to this discussion; the bottom line is that you must create an SSAS solution using Oracle, DB2, Sybase, Teradata, or another relational platform. This complicates development because you must learn at least the basics of the SQL flavor used by the relational data source in order to define named queries and calculations. Additionally, you must become familiar with the errors that the relational data source might raise during processing (and during queries, if you use ROLAP storage). The transaction isolation levels supported by each RDBMS can also differ, and comparable classes of objects can go by different names. For example, SQL Server implements ROLAP aggregations as indexed views, but DB2 does not support indexed views; the comparable objects in DB2 are called Materialized Query Tables (MQTs). Lastly, not every database vendor lets you look under the hood the way Microsoft does with Profiler, and the majority of vendors offer only textual (not graphical) query execution plans.
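To illustrate the nomenclature difference, a DB2 MQT playing roughly the role of an indexed view is created along these lines; the table and query are hypothetical, and the exact refresh options depend on your DB2 version:

```sql
-- A deferred-refresh MQT summarizing a hypothetical fact table (DB2 syntax).
CREATE TABLE SalesByProduct AS
  (SELECT ProductKey, SUM(SalesAmount) AS TotalSales
   FROM   FactSales
   GROUP  BY ProductKey)
DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate (and later re-populate) the MQT on demand.
REFRESH TABLE SalesByProduct;
```

Unlike a SQL Server indexed view, a deferred-refresh MQT is only as current as its last REFRESH TABLE, which matters if you rely on it for ROLAP queries.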

In this section, I will cover a few "gotchas" you may encounter when developing SSAS solutions against non-SQL Server data sources.

SSAS data structures

Most SSAS solutions are built using fact and dimension tables. If you must develop against a database in third normal form, you may have to define the necessary objects in your data source view as named queries and/or named calculations. This is relatively straightforward as long as you have the necessary permissions on the underlying objects (tables or views) and are comfortable with the SQL syntax of the RDBMS in question. Additionally, you'll need the parameters for connecting to the relational data source using a .NET or OLE DB provider supplied either by Microsoft or by another vendor. Keep in mind that Microsoft does not support third-party providers: if you run into issues during processing or ROLAP queries, you'll have to work directly with the vendor who supplied the driver.
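As a sketch of what such a named query might look like, the following flattens a hypothetical set of normalized customer tables into a dimension-shaped row set for the data source view (the table and column names are invented for illustration):

```sql
-- Hypothetical named query: flatten 3NF customer tables into one
-- row set that SSAS can treat as a customer dimension table.
SELECT c.CustomerID,
       c.CustomerName,
       ci.CityName,
       co.CountryName
FROM   Customer AS c
JOIN   City     AS ci ON ci.CityID    = c.CityID
JOIN   Country  AS co ON co.CountryID = ci.CountryID;
```

Remember that this SQL must be valid in the source RDBMS's dialect, since SSAS sends it to the relational source as-is (wrapped in an outer query).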

Transaction isolation levels

By default, Analysis Services uses the read-committed transaction isolation level. With this level enforced, you should never experience data consistency issues. You could improve processing performance by using the read-uncommitted isolation level, because the relational database would avoid locking overhead while reading the fact and dimension table data. To ensure that every processing command reads uncommitted data, you can edit the cartridge corresponding to the relational data source; for example, the Knowledge Base article at http://support.microsoft.com/kb/959026 provides instructions for reading uncommitted data from DB2 data sources. Please note that if you alter the cartridge, every statement will read uncommitted data; you won't be able to control this for individual statements. Furthermore, you cannot add query hints to the SQL statements used in named queries or partition definitions. This is because Analysis Services uses such SQL statements as subqueries, wrapping them in an outer query before sending the statement to the relational source, and query hints are not permitted in subqueries.
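To see why a hint in a partition query fails, consider a hypothetical partition definition and a simplified version of the outer query Analysis Services builds around it (the actual wrapper SSAS generates may differ):

```sql
-- Hypothetical partition query as you define it:
SELECT OrderDateKey, ProductKey, SalesAmount
FROM   FactSales
WHERE  OrderDateKey BETWEEN 20120101 AND 20121231;

-- Roughly what reaches the relational source: your statement becomes a
-- subquery, so a statement-level clause such as DB2's WITH UR appended
-- to your text would land in an illegal position.
SELECT OrderDateKey, ProductKey, SalesAmount
FROM  (SELECT OrderDateKey, ProductKey, SalesAmount
       FROM   FactSales
       WHERE  OrderDateKey BETWEEN 20120101 AND 20121231) AS PartitionQuery;
```

This is why the cartridge, which controls how the outer query is generated, is the only place to influence the isolation level for processing statements.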

Processing performance issues

Analysis Services' processing performance largely depends on how fast you can execute queries against the relational data source. If you're limited to read-only access, you need to talk to the DBAs and see whether they can review the query execution plans and create the necessary indexes. Bear in mind that SSAS does not offer much flexibility in how the processing queries are written; you need to capture the Progress Report Begin and Progress Report End events in SQL Server Profiler during processing to obtain the SQL statements. As discussed in the previous chapters, you can set the dimension's ProcessingGroup property to ByTable instead of the default ByAttribute to run only one query while processing the dimension; by default, SSAS runs one query per attribute. You can also adjust the queries defining each partition as necessary. To maximize processing performance, you should try to process as many partitions in parallel as possible. From the performance perspective, it is always preferable to build a true star schema model rather than a normalized database model.
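The ByAttribute versus ByTable difference is easiest to see in the shape of the SQL that reaches the relational source. The statements below are simplified sketches against a hypothetical DimProduct table, not verbatim SSAS output:

```sql
-- ByAttribute (default): one DISTINCT query per attribute, for example:
SELECT DISTINCT Color FROM DimProduct;
SELECT DISTINCT Size  FROM DimProduct;
SELECT DISTINCT ProductKey, ProductName FROM DimProduct;

-- ByTable: a single scan that feeds all attributes at once.
SELECT ProductKey, ProductName, Color, Size FROM DimProduct;
```

ByTable trades many small queries for one large scan, which can help on sources that handle repeated DISTINCT queries poorly, at the cost of more memory on the Analysis Services side during processing.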

You may also run into processing performance issues if your fact table references a very large number of dimensions and measures. In some star schema models, each new attribute is implemented as a separate dimension. This approach is simple to implement but results in very wide fact tables; in fact, with such a data model, each row read during processing might require more than one buffer. If this is the case, you should try splitting the measures into different measure groups. Better yet, try to combine some of the dimensions, build the necessary attribute relationships, and use role-playing dimensions whenever appropriate.
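One common way to combine several single-attribute dimensions is a so-called junk dimension that cross-joins the distinct values of a few low-cardinality flags into one table, replacing several foreign keys on the fact table with a single key. The following T-SQL is a hypothetical sketch; the staging table and flag columns are invented for illustration:

```sql
-- Hypothetical junk dimension combining three one-attribute dimensions;
-- the fact table then carries a single OrderFlagsKey instead of three keys.
SELECT ROW_NUMBER() OVER (ORDER BY s.StatusCode, c.ChannelCode, p.PromoFlag)
           AS OrderFlagsKey,
       s.StatusCode,
       c.ChannelCode,
       p.PromoFlag
FROM  (SELECT DISTINCT StatusCode  FROM staging.Orders) AS s
CROSS JOIN (SELECT DISTINCT ChannelCode FROM staging.Orders) AS c
CROSS JOIN (SELECT DISTINCT PromoFlag   FROM staging.Orders) AS p;
```

Narrowing the fact table this way reduces the chance that a single row spills across processing buffers, which is exactly the symptom described above.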