Extending Power BI with Python and R

By: Luca Zavarella

Overview of this book

Python and R allow you to extend Power BI capabilities to simplify ingestion and transformation activities, enhance dashboards, and highlight insights. With this book, you'll be able to make your artifacts far more interesting and rich in insights using analytical languages. You'll start by learning how to configure your Power BI environment to use your Python and R scripts. The book then explores data ingestion and data transformation extensions, and advances to focus on data augmentation and data visualization. You'll understand how to import data from external sources and transform it using complex algorithms. The book helps you implement personal data de-identification methods such as pseudonymization, anonymization, and masking in Power BI. You'll be able to call external APIs to enrich your data much more quickly using Python and R. Later, you'll learn advanced Python and R techniques to perform in-depth analysis and extract valuable information using statistics and machine learning. You'll also understand the main statistical features of datasets by plotting multiple visual graphs in the process of creating a machine learning model. By the end of this book, you’ll be able to enrich your Power BI data models and visualizations using complex algorithms in Python and R.
Table of Contents (22 chapters)
Section 1: Best Practices for Using R and Python in Power BI
Section 2: Data Ingestion and Transformation with R and Python in Power BI
Section 3: Data Enrichment with R and Python in Power BI
Section 4: Data Visualization with R in Power BI

Anonymizing data in Power BI

Here is a scenario you may well encounter during your career as a Power BI report developer. Imagine you are given an Excel dataset to import into Power BI in order to create a report for another department of your company. The Excel dataset contains sensitive personal data, such as the names and email addresses of people who have made multiple attempts to pay for an order with a credit card. The following is an example of the contents of the Excel file:

Figure 6.4 – Excel data to be anonymized

You are asked to create the report while anonymizing the sensitive data.

The first thing that jumps out at you is that anonymizing the Name and Email columns is not enough: some names or email addresses may also appear in the text of some Notes. While locating email addresses is fairly easy using regular expressions, locating person names in free text is much harder. For this...
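To illustrate only the regular-expression part mentioned above, here is a minimal Python sketch for locating and masking email addresses in free text. The pattern is a deliberately simplified assumption rather than a complete email grammar, and the names `EMAIL_PATTERN`, `find_emails`, `mask_emails`, the `<EMAIL>` placeholder, and the sample note are hypothetical, not taken from the book. It does not address person names, which need a different approach.

```python
import re

# Simplified email pattern (an assumption for illustration; it does not
# cover every address that the email standards allow).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return every substring of text that matches the simplified pattern."""
    return EMAIL_PATTERN.findall(text)


def mask_emails(text, placeholder="<EMAIL>"):
    """Replace every matched email address in text with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


# Hypothetical free-text note, similar in spirit to a Notes cell.
note = "Customer wrote from john.doe@example.com asking for a refund."
found = find_emails(note)
masked = mask_emails(note)
```

Here `found` holds the addresses detected in the note and `masked` holds the note with each address replaced by the placeholder. A fixed placeholder hides the address completely; a scheme that maps each address to a consistent substitute would instead preserve the ability to group rows by person.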