-
Book Overview & Buying
-
Table Of Contents
Microsoft Power BI Quick Start Guide - Second Edition
By :
Now that you should be more comfortable working within the Power Query Editor, let's take the next step and discuss more advanced options. Often, you will find the need to go beyond these basic transforms when dealing with data that requires more care. In this section, you will learn about some common advanced transforms that you may have a need for, which include Conditional Columns, Fill Down, Unpivot, Merge Queries, and Append Queries.
Using the Power Query Editor Conditional Columns functionality is a great way to add new columns to your query that follow logical if/then/else statements. This concept of if/then/else is common across many programming languages, including Excel formulas. Let's review a real-world scenario where you would be required to do some data cleansing on a file before it could be used. In this example, you will be provided with a file of all the counties in the United States, and you must create a new column that extracts the state name from the county column and places it in its own column:
FIPS_CountyName.txt file that is found in the book files using the Text/CSV connector.Column1 to Text. When you do this, you will be prompted to replace an existing type conversion. You can accept this by clicking Replace current.Column2, filter out the value UNITED STATES from the column by clicking the arrow next to the column header and unchecking UNITED STATES. Then, click OK.Column2 by right-clicking on the column header and selecting Split Column | By Delimiter. Choose -- Custom -- for the delimiter type, and type , before then clicking OK, as shown in Figure 2.7:
Figure 2.7: Splitting a column based on a delimiter
Column1, Column 2.1, and Column 2.2, to County Code, County Name, and State Abbreviation, respectively.State Name and implement the logic If State Abbreviation equals null Then return County Name Else return null as shown in Figure 2.8. To return the value from another column, you must select the icon in the Output property, then choose Select a column. Once this is complete, click OK:
Figure 2.8: Adding a conditional column
This results in a new column called State Name, which has the fully spelled-out state name only appearing on rows where the State Abbreviation is null:

Figure 2.9: End result of following these steps
This is only setting the stage to fully scrub this dataset. To complete the data cleansing process for this file, read on to the next section about Fill Down. However, for the purposes of this example, you have now learned how to leverage the capabilities of the Conditional Column transform in the Power Query Editor.
Fill Down is a rather unique transform in how it operates. By selecting Fill Down on a particular column, a value will replace all null values below it until another non-null appears. When another non-null value is present, that value will then fill down to all subsequent null values. To examine this transform, you will pick up from where you left off in the Conditional Column example in the previous section:
Alabama switches to Alaska when appropriate.null values that appear in the State Abbreviation column. The final result should look like Figure 2.10, as follows:
Figure 2.10: End result of following these steps
In this example, you learned how you can use Fill Down to replace all of the null values below a non-null value. You can also use Fill Up to do the opposite, which would replace all the null values above a non-null value. One important thing to note is that the data must be sorted properly for Fill Down or Fill Up to be successful. In the next section, you will learn about another advanced transform, known as Unpivot.
The Unpivot transform is an incredibly powerful transform that allows you to reorganize your dataset into a more structured format best suited for BI. Let's discuss this by visualizing a practical example to help understand the purpose of Unpivot. Imagine you are provided with a file that contains the populations of US states over the last three years, and looks as in Figure 2.11:

Figure 2.11: Example data that will cause problems in Power BI
The problem with data stored like this is you cannot very easily answer simple questions. For example, how would you answer questions like, What was the total population for all states in the US in 2018? or What was the average state population in 2016? With the data stored in this format, simple reports are made rather difficult to design. This is where the Unpivot transform can be a lifesaver. Using Unpivot, you can change this dataset into something more acceptable for an analytics project, as shown in Figure 2.12:

Figure 2.12: Results of unpivoted data
Data stored in this format can now easily answer the questions posed earlier by simply dragging a few columns into your visuals. To accomplish this in other programming languages would often require fairly complex logic, while the Power Query Editor does it in just a few clicks.
There are three different methods for selecting the Unpivot transform that you should be aware of, and they include the following options:
Let's walk through two examples of using the Unpivot transform to show you the first two of these methods, and provide an understanding of how this complex problem can be solved with little effort in Power BI. The third method mentioned for doing Unpivot will not be shown since it's so similar to the first option:
Income Per Person.xlsx found in the book source files. Once you select this workbook, choose the spreadsheet called Data in the Navigator window, and then select Transform Data to launch the Power Query Editor. Figure 2.13 shows what our data looks like before the Unpivot operation:
Figure 2.13: Example before Unpivot is performed
Attribute and Value to Year and Income, respectively.
Figure 2.14: Results of unpivoted data
This first method walked you through what can often be the fastest method for performing an Unpivot transform, which is by using the Unpivot Other Columns option. In this next example, you will learn how to use the Unpivot Columns method as well:
Total Population.xlsx workbook from the book source files. Once you select this workbook, choose the spreadsheet called Data in the Navigator window, and then select OK. Figure 2.15 shows the dataset before Unpivot has been added:
Figure 2.15: Example before Unpivot is performed

Figure 2.16: Using the Unpivot Other Columns transform

Figure 2.17: Shows the final result of these steps
In this section, you learned about two different methods for performing an Unpivot. To complete the data cleansing process on these two datasets, it's recommended that you continue through the next section on merging queries.
Another common requirement when building BI solutions is the need to join two tables together to form a new outcome that includes some columns from both tables in the result. Fortunately, Power BI makes this task very simple with the Merge Queries feature. Using this feature requires that you select two tables and then determine which column or columns will be the basis of how the two queries are merged. After determining the appropriate columns for your join, you will select a join type. The join types are listed here with the description that is provided within the product:
Many of you may already be very familiar with these different join terms from SQL programming you have learned in the past. However, if these terms are new to you, I recommend reviewing Visualizing Merge Join Types in Power BI, courtesy of Jason Thomas in the Power BI Data Story Gallery: https://community.powerbi.com/t5/Data-Stories-Gallery/Visualizing-Merge-Join-Types-in-Power-BI/m-p/219906. This visual aid is a favorite of many users that are new to these concepts.
To examine the Merge Queries option, you will pick up from where you left off with the Unpivot examples in the previous section:

Figure 2.18: Configuring a merge between two queries
Merge1 that combines the results of the two queries. Go ahead and rename this query Country Stats.Income that has a value of Table for each row. This column is actually representative of the entire Income query that you joined to. To choose which columns you want from this query, click the Expand button on the column header. After clicking the Expand button, uncheck Country, Year, and Use original column name as prefix, then click OK.Income.1 to Income. Figure 2.19 shows this step completed:
Figure 2.19: Configuring a merge between two queries

Figure 2.20: Uncheck to disable the loading of this query into the data model
To begin using this dataset in a report, you would click Close & Apply. You will learn more about building reports in Chapter 5, Visualizing Data.
By default, merging queries together relies on exact matching values between your join column(s). However, you may work with data that does not always provide perfect matching values. For example, a user enters data and misspells their country as "Unite States" instead of United States. In those cases, you may consider the more advanced feature called Fuzzy Matching. With Fuzzy Matching, Power BI can perform an approximate match and still join on these two values based on the similarity of the values. In this section, you learned how the Merge Queries option is ideal for joining two queries together. In the next section, you will learn how you could solve the problem of performing a union of two or more queries.
Occasionally, you will work with multiple datasets that need to be appended to each other. Here's a scenario: you work for a customer service department for a company that provides credit or loans to customers. You are regularly provided with .csv and .xlsx files that give summaries of customer complaints regarding credit cards and student loans. You would like to analyze both of these data extracts at the same time but, unfortunately, the credit card and student loan complaints are provided in two separate files. In this example, you will learn how to solve this problem by performing an append operation on these two different files:
Student Loan Complaints.xlsx found in the book source files. Once you select this workbook, choose the spreadsheet called Student Loan Complaints in the Navigator window, and then select Transform Data to launch the Power Query Editor.Credit Card Complaints.csv found in the book source files. Click OK to bring this data into the Power Query Editor.Credit Card Complaints query selected, find and select Append Queries | Append Queries as New on the Home ribbon.
Figure 2.21: Configuring an append between two queries

Figure 2.22: Configuring an append between two queries
Now that you have learned about the various methods for combining data, the next section will discuss a more advanced method of working with data using the R programming language.
Change the font size
Change margin width
Change background colour