In addition to table names, column names, and data formats, a semantic model defines how tables within a model relate to each other. This relationship is important because it defines the output of calculations (which are defined in the model). This recipe shows how to create relationships and the effect that these relationships have on the model.
This recipe assumes that the model in the recipe Managing the appearance of tables and fields has been created.
The reader should recognize that the model is designed to show sales information by product, date, and customer. This type of modeling scenario is commonly referred to as a star schema and is shown in the following diagram. The Sales
table is referred to as a fact table (since it stores the data facts that we wish to analyze—sales amount, tax amount, and so on) and the other tables are referred to as dimension (subject) tables because they hold descriptive information.

Extending the model further, the Products
table is linked to the Subcategory
table, and the Subcategory
table is linked to the Category
table. This is shown in the following diagram and is sometimes called a snowflake schema, since the dimension tables are not directly connected to the fact table:

An important point to note, is that each dimension table has a unique identifying field, for example, a product can be uniquely identified in the Products
table through the product_id
field. This is commonly referred to as the primary key for the table.
In contrast, the referring column (product_id
in the Sales
table) can have many occurrences of the product_id
field and is commonly referred to as the foreign key.
Start with the workbook that was developed in the prior recipe.
Drag the
Product Name
field onto the rows of the pivot table (under theRow Labels
column) and the Sales columntotal_price
onto values. Your screen should look like the following screenshot:Return to the PowerPivot window and select the
product_id
field and then click on the Create Relationship button (in the Design tab). A new window will open asking you to define the related (lookup) table and column.Select the Products option from the Related Lookup Table drop-down list and
Product ID
from the Related Lookup Column drop-down list. The Create button is now enabled. Click on Create.Return to the pivot table and refresh the model. The values for the Sum of total_price field have updated to reflect the total for each product.
Now, create relationships between the following tables and columns:
Source table
Source column
Related table
Related column
Sales
customer_id
Customers
Customer ID
Sales
order_date
Dates
Day
Tip
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
In the PowerPivot window, click on the Manage Relationships button. A new window will open showing all the relationships that have been built in the model.
Click on the Diagram View button in the Home menu to switch to the diagram modeling view. Your screen will now show tables and columns (rather than data) and look like the following screenshot. Note that previously defined relationships appear as lines that connect tables.
Tip
You can also switch between the data and diagram views by toggling the two buttons at the bottom-right side of the PowerPivot application status bar.
In the pivot table, replace the
Product Name
field with theCategory
field from thecategory
table. The total value ($1,662,869.42
) is repeated for all categories indicating that there is no relationship between theSales
table andCategory
table.From the
Products
table, select theProduct ID
field, and drag it to theproduct_id
field of theSubcateogry
table. A new relationship will be created betweenProducts
andSubcategory
. Use this method to create a relationship between theProducts
table and theCategory
table. Refresh the pivot table to ensure that the total value is not duplicated.In the pivot table, drag the
Product ID
field from theProducts
table into the Values pane. The pivot table will now show the price and number of products for each category and will look like the following screenshot:

The model has been extended to show two things. Firstly, by defining relationships between tables within the model, we have defined the filtering path for the data. This path is used to restrict rows between tables that have a relationship. Secondly, by adding a calculation (Sum of total_price and Count of Product ID), we have created measures that apply an aggregation function to the model fields. These are special types of measures within PowerPivot and are referred to as implicit measures (because the model implicitly defines a calculation for the field).
Relationships define how one table relates to another. In order to define a relationship, the join must occur on a field that has unique values in one of the tables (this is commonly called a primary key). The table that has the field with unique values is commonly called the related table. This can be seen in the diagram view, as shown in the following screenshot with the direction of the arrows on the relationships. Consider the Products
table (which has a unique field product_id
) that is related to the Sales
table (through the product_id
field in that table), but only the Products
table needs to have a unique product_id
. It is also said that the product_id
field relates to many records in the Sales
table. This can be seen by the direction of the arrow between Sales
and Products
, the related table has the arrow pointing towards it.
Relationships are important because they define how data is filtered and calculated when it is presented to the user.

Relationships are the primary mechanisms with the model that are used to filter data and perform calculations. That is, the relationship defines how data is filtered when values are shown to the user. Although this is a new concept, the concept of relationships is important because they have important implications with the way that the model determines what data to show to the user. Consider the pivot table shown in the following screenshot—Subcategory
on rows and Sum of total_price
, Count of Product ID
, and Count of category_id
as measures:

Now, consider the relationship defined in the model. This is summarized in the following screenshot:

The rows in the pivot show the subcategory which defines a filter for each row (that is a filter for each subcategory). This filter can then be applied to the Products
table, which in turn is applied to the Sales
table. It might be better to say that the rows of the Sales
table are filtered by the Products
table and then those rows are filtered by the Subcategory
table. This is why the calculations Sum of total_price
and Count of Product ID
show the correct values. The filter on rows of the Sales
table and rows of the Products
table can be applied in the direction of the arrows of the relationships.
However, this is not the case when Subcategory
is shown with data from the Category
table—a filter will only be applied in the direction that a relationship is created. This is why the calculation Count of category_id
shows the same number for each subcategory. With the subcategory on rows, a filter is created which can filter the Products
table but this filter cannot then applied in an upstream manner to the Category
table.
The application of filters may seem unintuitive at first, especially with a relationship design such as the one among Products
, Category
, and Subcategory
, but in reality the model should be designed so that the filters can be applied in a single direction. There is also, the question of unmatched values between fields used in the relationship and how they are treated by the model. For example, what would happen if we had a product_id
field in the Sales
table that did not have a match in the Products
table? Would this even be allowed in the model? The tabular model handles this situation very elegantly. The model allows this situation (without error), and unmatched values are assigned to a blank placeholder. For example, if there was a product in the Sales
table and no product in the Products
table, it would be shown as blank when Products
, Category
, or Subcategory
is used in the pivot.
We have also indicated that the model automatically created implicit measures. The term measure is common in business intelligence tools to specify that a calculated value is returned. Often, this can be a simple calculation, for example, the count of rows or the sum of a field. The important thing to remember is that measure is a single value that is returned from the model (when the model is filtered). Usually, measures are defined by a model designer, but they need not be. This is the case with an implicit measure. An implicit measure is defined automatically, depending on the data type of the column that is being used. Numeric columns are automatically summed, whereas text columns are automatically counted.
The aggregation function of an implicit measure is initially set by the underlying data type. However, the user can change this within the pivot table by editing the measure in the pivot table. This can be done in the following manner:
Alter the pivot table so that it shows subcategory on rows and
total_profit
as values. By default, the measure will showSum of total_profit
. Right-click on the measure and select Edit Measure… from the pop-up window.A new window will open, displaying the aggregation function with the measure. Select Average from the function list and change the measure name to
Average Profit
. The Measure Settings window should look like the following screenshot:Also, notice that a formula is used to define the measure, for
Average Profit
, the formula is=AVERAGE('Sales'[total_profit])
. Click on the OK button and note that the pivot table now contains the new measureAverage Profit
.
Implicit measures that have been created in the model can be seen by exposing the measures in the Advanced tab of the PowerPivot window (the Advanced tab must be activated). This is shown in the following screenshot:
