Tableau Prep Cookbook

By: Hendrik Kleine

Overview of this book

Tableau Prep is a tool in the Tableau software suite, created specifically to develop data pipelines. This book will describe, in detail, a variety of scenarios that you can apply in your environment for developing, publishing, and maintaining complex Extract, Transform and Load (ETL) data pipelines. The book starts by showing you how to set up Tableau Prep Builder. You’ll learn how to obtain data from various data sources, including files, databases, and Tableau Extracts. Next, the book demonstrates how to perform data cleaning and data aggregation in Tableau Prep Builder. You’ll also gain an understanding of Tableau Prep Builder and how you can leverage it to create data pipelines that prepare your data for downstream analytics processes, including reporting and dashboard creation in Tableau. As part of a Tableau Prep flow, you’ll also explore how to use R and Python to implement data science components inside a data pipeline. In the final chapter, you’ll apply the knowledge you’ve gained to build two use cases from scratch, including a data flow for a retail store to prepare a robust dataset using multiple disparate sources and a data flow for a call center to perform ad hoc data analysis. By the end of this book, you’ll be able to create, run, and publish Tableau Prep flows and implement solutions to common problems in data pipelines.
Table of Contents (11 chapters)

Connecting to SAS, SPSS, and R files

In this recipe, we'll connect to a statistical file. Tableau Prep can read the file formats of three popular statistics applications: SAS (.sas7bdat), SPSS (.sav), and R (.rdata, .rda).

I advocate the use of open file formats such as CSV or commonly used standards such as Excel. However, if you are unable to obtain your data in such a format from your data science partner, this connector may offer a solution.
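When a partner hands over a mixed folder of files, it can help to check up front which of them fall under the statistical file connector. The following is a minimal Python sketch, not part of Tableau Prep itself, that maps the extensions listed above to their source application. The function name `statistical_format` and the file names `survey.sav` and `orders.csv` are hypothetical, and the case-insensitive matching is an assumption made so that a name such as `December 2016 Sales.Rdata` is recognized.

```python
from pathlib import Path

# Extensions named in this recipe, mapped to the application that writes them.
STATISTICAL_FORMATS = {
    ".sas7bdat": "SAS",
    ".sav": "SPSS",
    ".rdata": "R",
    ".rda": "R",
}


def statistical_format(filename):
    """Return "SAS", "SPSS", or "R" for a statistical file, else None."""
    # Assumption for this sketch: compare extensions case-insensitively,
    # so "Sales.Rdata" and "Sales.rdata" are treated alike.
    suffix = Path(filename).suffix.lower()
    return STATISTICAL_FORMATS.get(suffix)


if __name__ == "__main__":
    # Only the first name comes from the recipe; the others are made up.
    for name in ["December 2016 Sales.Rdata", "survey.sav", "orders.csv"]:
        print(name, "->", statistical_format(name))
```

Files for which the function returns `None`, such as a CSV, would go through one of the other connectors instead.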

Getting ready

In this recipe, we'll connect to an R file using the statistical file connector. In order to follow along, download the Sample Files 2.3 folder from the book's GitHub repository.

How to do it…

To get started, ensure you have the sample RData file available on your computer. From the Tableau Prep home screen, follow these steps:

  1. Click the Connect to Data button and select Statistical file.
  2. From the browse file window, locate and open our statistical file named December 2016 Sales.Rdata.

    And with just these few steps, Tableau Prep Builder has added the statistical file source to a new flow:

Figure 2.18 – Flow with a statistical file connection

Most options in the bottom pane are identical to those available when processing Excel files. However, one small but important feature is absent: you cannot alter the data type of the fields in the statistical file connection step. To change a data type, you have to use a cleaning step, which we'll discuss in Chapter 3, Cleaning Transformations:

Figure 2.19 – Data types cannot be altered directly in the statistical file connection step

In this recipe, you have learned how to add Tableau Prep to a data science workflow by connecting to data produced by popular statistics applications.

How it works…

Tableau Prep unpacks statistical files when you connect to them and, from that moment on, allows you to leverage them like any other connection.

There's more…

There are some limitations when it comes to connecting to statistical files. If you run into any connection issues, I recommend referring to the following section of the online Tableau documentation: https://help.tableau.com/current/pro/desktop/en-us/examples_statfile.htm.