Mastering Tableau

By : David Baldwin
Overview of this book

Tableau has emerged as one of the most popular Business Intelligence solutions in recent times, thanks to its powerful and interactive data visualization capabilities. This book will help you master Tableau by exploiting the many new features introduced in Tableau 10.0. You will begin by learning how to use advanced calculations, including row-level and aggregate-level calculations, to solve complex problems. You will discover how almost any data visualization challenge can be met in Tableau by understanding the tool's inner workings and creatively exploring possibilities. You'll be armed with an arsenal of advanced chart types and techniques that let you present information clearly and engagingly to a variety of audiences through dashboards. Explanations and examples of efficient and inefficient visualization techniques, well-designed and poorly designed dashboards, and compromise options for when Tableau consumers will not embrace data visualization will build on your understanding of Tableau and how to use it efficiently. By the end of the book, you will be equipped with all the information you need to create effective dashboards and data visualization solutions using Tableau.

Connecting Tableau to your data


At the time of writing, Tableau's Data Connection menu includes 50 different connection types, and even that number is something of an understatement, since some of those types contain multiple options. For example, the Other Files selection includes 21 options. We won't cover the details of every connection type, but we will cover the basics.

Upon opening a new instance of Tableau Desktop, you will note a link in the upper left-hand corner of the workspace. Clicking on that link will enable you to connect to data. Alternatively, you can simply click on the New Data Source icon on the toolbar:

Although subsequent chapters will consider connecting to other data sources, here we will limit the discussion to considerations when connecting to Excel and text files.

Excel and text files

Upon choosing to connect to an Excel or text file, the Tableau author is presented with two choices. Note that those choices are somewhat hidden. As shown in the following screenshot, you will need to click on the arrow next to the Open button to access them:

The Open option uses a native Tableau driver. The Open with Legacy Connection option uses the Microsoft Jet (MS Jet) driver. Let's compare the two drivers.

Comparing and contrasting Native Tableau Driver and MS Jet Driver

| Native Tableau Driver | MS Jet Driver |
| --- | --- |
| More set capabilities such as in/out and combined sets | Limited set capabilities |
| Count Distinct is allowed | Count Distinct is disallowed |
| Allows more than 255 columns | Columns are capped at 255 |
| Special characters, such as brackets and quotation marks, are allowed in file and field names | Special characters are disallowed in file and field names |
| When connecting to Excel, the data type is determined by 95% of the first 10,000 rows | When connecting to Excel, the data type is determined by the first eight rows |
| Cannot connect to .xlsb files | Can connect to .xlsb files |
| File names can be any length | File names are limited to 64 characters |
| Custom SQL is not allowed | Custom SQL is allowed |
| Left and inner joins are allowed | Left, inner, and right joins are allowed |
| Pivot data from rows to columns | No pivoting feature |
| Improved header auto-detection | |

Note that the preceding table is not complete. There are many other differences between the functionality of Native Tableau Driver and MS Jet Driver. Most of those, however, are less consequential.
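The data type rows in the comparison can be illustrated with a small sketch. The following Python is not Tableau's or Jet's actual implementation; it only mimics the two sampling rules described in the table. The function names, the two coarse type labels, and the fallback to string when no type reaches the threshold are all assumptions made for illustration.

```python
from collections import Counter


def classify(value):
    """Return a coarse type label ("number" or "string") for one cell."""
    try:
        float(value)
        return "number"
    except (TypeError, ValueError):
        return "string"


def infer_from_first_rows(values, sample_size=8):
    """Mimic the 'first eight rows' rule: only the leading rows are inspected.

    Assumption: a single non-numeric cell in the sample makes the column a string.
    """
    sample = values[:sample_size]
    return "number" if all(classify(v) == "number" for v in sample) else "string"


def infer_by_majority(values, sample_size=10_000, threshold=0.95):
    """Mimic the '95% of the first 10,000 rows' rule.

    Assumption: if no type reaches the threshold, fall back to string.
    """
    sample = values[:sample_size]
    if not sample:
        return "string"
    label, count = Counter(classify(v) for v in sample).most_common(1)[0]
    return label if count / len(sample) >= threshold else "string"


# Hypothetical column: eight numeric-looking cells followed by 192 text cells.
column = ["1", "2", "3", "4", "5", "6", "7", "8"] + ["N/A"] * 192

print(infer_from_first_rows(column))  # decided by the first eight cells only
print(infer_by_majority(column))      # decided by the dominant type in the sample
```

In this made-up column the two rules disagree: looking only at the first eight rows suggests a numeric field, while the wider sample is dominated by text. That is the kind of mismatch a larger sample is meant to avoid.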

So, when should you use the native Tableau driver rather than the MS Jet driver? In short, use the native Tableau driver. In almost every case it will provide better performance and more functionality. One exception is when custom SQL is required. Tableau Software does not recommend using custom SQL in most cases because Tableau-generated SQL will run more efficiently; however, in some cases it may be necessary. Another exception, as the comparison above notes, is connecting to .xlsb files, which only the MS Jet driver supports.

Connecting to Tableau Server

Connecting to Tableau Server is perhaps the single most important server connection type to consider, since it is frequently used to provide better performance than may otherwise be possible. Additionally, connecting to Tableau Server gives the author not only data but also information about how that data should be interpreted, such as whether a given field should be treated as a measure or a dimension. Let's explore this further via two exercises.

Exercise - observing metadata differences

As a precursor to connecting to Tableau Server, let's compare and contrast the instance of the Superstore data source represented in the workbook associated with this chapter (that is, the Chapter 1 workbook) with a new connection to the same data.

Exercise steps
  1. In a new instance of Tableau, navigate to Data | New Data Source | Excel to connect to the Sample - Superstore dataset that installs with Tableau Desktop (it should be located on your hard drive under My Tableau Repository | Datasources).

  2. Double-click on the Orders sheet.

  3. Click on the Sheet 1 tab.

  4. Place Discount on the Text shelf.

  5. Double-click on Profit and Sales.

  6. Compare the results in the new worksheet with those in the worksheet entitled Observing Metadata Differences in the Chapter 1 workbook:

  • A: The data source name has been altered in the Chapter 1 workbook

  • B: In the Chapter 1 workbook, the default aggregation of Discount is AVG; in the unaltered instance, the default is SUM

  • C: Product Hierarchy exists only in the Chapter 1 workbook

  • D: The format of Discount, Profit, and Sales differs between the two instances

  • E: Profit Ratio exists only in the Chapter 1 workbook
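To see why the default aggregation in item B matters, consider a minimal Python sketch. The discount values are invented for illustration and do not come from the Superstore data.

```python
from statistics import mean

# Hypothetical per-row discount rates for four order lines (not real Superstore rows).
discounts = [0.0, 0.2, 0.2, 0.4]

total = sum(discounts)     # what a SUM default aggregation would display
average = mean(discounts)  # what an AVG default aggregation would display

print(f"SUM(Discount) = {total:.0%}")
print(f"AVG(Discount) = {average:.0%}")
```

Summing rates across rows produces a figure that no single order ever received, which is presumably why the Chapter 1 workbook changes the default for Discount to AVG.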

Exercise - connecting to Tableau Server

In order to complete this exercise, access to an instance of Tableau Server is necessary. If you do not have access to Tableau Server, consider installing a trial version on your local computer:

  1. In the workbook associated with this chapter, navigate to the Connecting to Tableau Server worksheet.

  2. Right-click on the Superstore data source and select Publish to Server.

  3. Log in to Tableau Server and follow the prompts to complete the publication of the data source.

  4. After the data source has been published, open a new instance of Tableau Desktop and navigate to Data | New Data Source | Tableau Server to connect to the data source published in the previous step.

  5. Click on Sheet 1 in the new workbook and observe that the changes made in the Chapter 1 workbook have been preserved.

  6. Within the Data pane, right-click on Profit Ratio and note that it is not directly editable.

Having completed the previous two exercises, let's discuss the most germane point: metadata. Metadata is often defined as data about the data. In the preceding case, the data source name, default aggregation, default number formatting, and hierarchy are all examples of metadata changes that Tableau remembers. This is important because publishing a data connection allows for consistency across multiple Tableau authors. For example, if your company has a policy regarding the use of decimal points when displaying currency, that policy is easy to follow if all Tableau authors start building workbooks by pointing to data sources where all formatting has been predefined.

In the last exercise, the Profit Ratio calculated field was not directly editable when accessed through a data source published on Tableau Server, and this has important implications. Imagine the problems that would ensue if different Tableau authors defined Profit Ratio differently. End users would have no way of understanding what Profit Ratio truly means. However, by creating a workbook based on a published data source, the issue is alleviated. One version of Profit Ratio is defined, and it can only be altered by changing the data source. This functionality can greatly improve consistency across the enterprise.
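The risk of competing definitions is easy to demonstrate. The sketch below uses two made-up rows and two plausible ways an author might define a profit ratio. The function names are hypothetical, and the assumption that the published field uses the ratio-of-sums form is ours rather than something stated in this section.

```python
# Two hypothetical order rows; the figures are invented for illustration.
rows = [
    {"sales": 100.0, "profit": 50.0},
    {"sales": 1000.0, "profit": 100.0},
]


def ratio_of_sums(rows):
    """Aggregate first, then divide: total profit over total sales."""
    return sum(r["profit"] for r in rows) / sum(r["sales"] for r in rows)


def mean_of_ratios(rows):
    """Divide first, then aggregate: the average of each row's own ratio."""
    return sum(r["profit"] / r["sales"] for r in rows) / len(rows)


print(f"Ratio of sums:  {ratio_of_sums(rows):.1%}")
print(f"Mean of ratios: {mean_of_ratios(rows):.1%}")
```

Both calculations could reasonably be labeled Profit Ratio, yet they return very different numbers for the same data. Locking one definition into the published data source removes that ambiguity.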

Connecting to saved data sources

Connecting to a saved data source on a local machine is much like connecting to a data source published on Tableau Server. Metadata definitions associated with the local data source are preserved, just as they are on Tableau Server. Of course, since the data source is local instead of remote, the publication process is different. Let's explore this via an exercise.

Exercise - creating a local data connection

  1. In the workbook associated with this chapter, navigate to the tab entitled Local Data Connection.

  2. In the Data pane, right-click on the Superstore data source and select Add to Saved Data Sources.

  3. Using the resulting dialog box, save the data source as Superstore in My Tableau Repository | Datasources, located on your hard drive.

  4. Click on the Go to Start icon located in the top-left corner of your screen and observe the new saved data source:

Tip

Note that you can save a local data source that points to a published data source on Tableau Server. First, connect to a published data source on Tableau Server. Next, right-click on the data source in your workspace and choose Add to Saved Data Sources. Now you can connect to Tableau Server directly from your Start page!
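A saved data source is stored as a file containing XML, so the metadata discussed in this section can be inspected outside Tableau. The following Python sketch parses a small, hand-written fragment that only imitates such a file. The element and attribute names in SAMPLE_TDS are simplified assumptions and may not match what your version of Tableau writes, so treat this as an illustration of the idea rather than a description of the file format.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a saved data source file.
# The structure and attribute names are assumptions for illustration only.
SAMPLE_TDS = """
<datasource formatted-name='Superstore'>
  <column caption='Discount' name='[Discount]' role='measure'
          datatype='real' aggregation='Avg' default-format='p0%' />
  <column caption='Profit Ratio' name='[Calculation_1]' role='measure'
          datatype='real'>
    <calculation class='tableau' formula='SUM([Profit])/SUM([Sales])' />
  </column>
</datasource>
"""


def describe_columns(tds_xml):
    """Collect per-field metadata from the XML text of a data source."""
    root = ET.fromstring(tds_xml.strip())
    summary = {}
    for column in root.iter("column"):
        calculation = column.find("calculation")
        summary[column.get("caption", column.get("name"))] = {
            "role": column.get("role"),
            "aggregation": column.get("aggregation"),
            "format": column.get("default-format"),
            # Only calculated fields carry a formula in this simplified sample.
            "formula": calculation.get("formula") if calculation is not None else None,
        }
    return summary


metadata = describe_columns(SAMPLE_TDS)
for field, details in metadata.items():
    print(field, details)
```

To try this against a real saved data source, read the file's text and pass it to describe_columns, then adjust the element and attribute names to whatever the file actually contains.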