Mastering Tableau 2023 - Fourth Edition

By: Marleen Meier

Overview of this book

This edition of the bestselling Tableau guide will teach you how to leverage Tableau's newest features and offerings in various paradigms of the BI domain. Updated with fresh topics, including the newest features in Tableau Server, Prep, and Desktop, as well as up-to-date examples, this book will take you from mastering essential Tableau concepts to advanced functionality. A chapter on data governance has also been added. Throughout this book, you'll learn how to use Tableau Hyper files and Prep Builder to easily perform data preparation and handling, as well as complex joins, spatial joins, unions, and data blending tasks using practical examples. You'll also get to grips with executing data densification and explore other expert-level examples to help you with calculations, mapping, and visual design using Tableau extensions. Later chapters will teach you all about improving dashboard performance, connecting to Tableau Server, and understanding data visualization with examples. Finally, you'll cover advanced use cases, such as self-service analysis, time series analysis, geo-spatial analysis, and how to connect Tableau to Python and R to implement programming functionalities within Tableau. By the end of this book, you'll have mastered Tableau 2023 and be able to tackle common and advanced challenges in the BI domain.


Understanding Hyper

In this section, we will explore Tableau’s data-handling engine and how it enables structured yet organic data mining processes in enterprises. Since the release of Tableau 10.5, we have been able to use Hyper, a high-performing database engine that lets us query source data faster than before. Hyper is often not well understood, even by advanced developers, because it’s not an overt part of day-to-day activities; however, if you want to truly grasp how to prepare data for Tableau, this understanding is crucial.

Hyper originally started as a research project at the Technical University of Munich (TUM) in 2008. In 2016, it was acquired by Tableau, and its team became Tableau’s dedicated data engine group, keeping its base and employees in Munich. Initially, in Tableau 10.5, Hyper replaced the earlier data-handling engine only for extracts. Live connections are still not touched by Hyper, but Tableau Prep Builder now runs on the Hyper engine too, with more use cases to follow. As stated on tableau.com, “Hyper can slice and dice massive volumes of data in seconds, you will see up to 5X faster query speed and up to 3X faster extract creation speed.” And if you still can’t get enough, there is always the option to use Hyper through API calls in your preferred programming language: https://help.tableau.com/current/api/hyper_api/en-us/docs/hyper_api_reference.html.

But what makes Hyper so fast? Let’s have a look under the hood!

The Tableau data-handling engine

The vision shared by the founders of Hyper was to create a high-performing, next-generation database—one system, one state, no trade-offs, and no delays. And it worked—today, Hyper can serve general database purposes, data ingestion, and analytics at the same time.

Two hardware trends shaped this design. Memory prices have decreased exponentially, and CPU transistor counts have kept increasing in line with Moore’s Law, while other CPU characteristics stagnated. In other words, memory is cheap, but processing still needs to be improved.

Moore’s Law is the observation made by Intel co-founder Gordon Moore that the number of transistors on a chip doubles every two years while the costs are halved. Information on Moore’s Law can be found on Investopedia at https://www.investopedia.com/terms/m/mooreslaw.asp.
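As a quick piece of arithmetic, the doubling rule can be written as a one-line formula. The following is a minimal sketch; the function name `projected_transistors` and the starting value of 1.0 are assumptions chosen for illustration, not measured data.

```python
def projected_transistors(initial_count: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Project a transistor count, assuming it doubles every `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)


# Ten years at one doubling every two years is five doublings: a factor of 2**5.
growth_factor = projected_transistors(1.0, 10)
print(growth_factor)  # 32.0
```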

While experimenting with Hyper, the founders measured that handwritten C code was faster than any existing database engine, so they came up with the idea of transforming Tableau queries into C code and optimizing that code at the same time, all behind the scenes, so that the Tableau user won’t notice it. This translation and optimization come at a cost: a traditional database engine can start executing a query immediately, whereas Tableau first needs to translate the query into code, optimize that code, and compile it into machine code before it can be executed. The big question is, is it still faster? As proven by many tests on Tableau Public and other workbooks, the answer is yes!
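To make the trade-off more tangible, here is a minimal sketch in Python rather than C. It is an illustration only and not Hyper’s actual code generator: the function names `interpret_sum` and `compile_sum`, the row layout, and the query shape (a filtered sum) are all assumptions made for this example. The interpreted path can start right away with a generic loop, while the compiled path first pays for generating and compiling source code that is specialized to one query.

```python
def interpret_sum(rows, measure, filter_field, threshold):
    """Interpreted path: a generic loop that receives the query as parameters."""
    total = 0
    for row in rows:
        if row[filter_field] > threshold:
            total += row[measure]
    return total


def compile_sum(measure, filter_field, threshold):
    """Compiled path: generate source specialized to one query, then compile it once."""
    source = (
        "def query(rows):\n"
        "    total = 0\n"
        "    for row in rows:\n"
        f"        if row[{filter_field!r}] > {threshold!r}:\n"
        f"            total += row[{measure!r}]\n"
        "    return total\n"
    )
    namespace = {}
    # Up-front cost: translate to source, compile to bytecode, load the function.
    exec(compile(source, "<generated query>", "exec"), namespace)
    return namespace["query"]


rows = [
    {"sales": 120, "quantity": 3},
    {"sales": 80, "quantity": 1},
    {"sales": 200, "quantity": 5},
]

interpreted_result = interpret_sum(rows, "sales", "quantity", 2)
compiled_query = compile_sum("sales", "quantity", 2)  # pay the compile cost once
compiled_result = compiled_query(rows)                # then run the specialized code

print(interpreted_result, compiled_result)  # 320 320
```

Both paths return the same answer. The compiled path only pays off when the time saved during execution outweighs the time spent on translation and compilation, which is the question raised above.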

Furthermore, if a query is estimated to run faster without being compiled to machine code, Tableau has its own virtual machine (VM) on which the query will be executed right away. In addition, Hyper can utilize 99% of available CPU computing power, whereas other parallel processes can only utilize 29% of available CPU compute. This is due to the unique and innovative technique of morsel-driven parallelization.

For those of you who want to know more about morsel-driven parallelization, the paper that later served as a baseline for the Hyper engine can be found at https://15721.courses.cs.cmu.edu/spring2016/papers/p743-leis.pdf.
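The basic idea of morsel-driven parallelization is to cut the input into small chunks (morsels) and let each worker thread fetch the next morsel as soon as it is free, so no thread sits idle while work remains. The following is a minimal sketch of that scheduling idea only. The function name `morsel_driven_sum` and its parameters are assumptions for this example, and because of Python’s global interpreter lock the threads here will not show the CPU utilization figures quoted for Hyper.

```python
import queue
import threading


def morsel_driven_sum(values, morsel_size=1000, worker_count=4):
    """Sum `values` by letting worker threads pull small chunks from a shared queue."""
    # Cut the input into fixed-size morsels and queue them all up front.
    morsels = queue.Queue()
    for start in range(0, len(values), morsel_size):
        morsels.put(values[start:start + morsel_size])

    partial_sums = []
    lock = threading.Lock()

    def worker():
        local_total = 0
        while True:
            try:
                morsel = morsels.get_nowait()  # grab the next morsel if one is left
            except queue.Empty:
                break                          # no work left for this worker
            local_total += sum(morsel)
        with lock:
            partial_sums.append(local_total)   # publish this worker's partial result

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Combine the per-worker partial results into the final answer.
    return sum(partial_sums)


total = morsel_driven_sum(list(range(10_000)))
print(total == sum(range(10_000)))  # True
```

Because workers pull work instead of being handed one fixed slice each, a worker that finishes early simply takes another morsel, which keeps the load balanced.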

If you want to know more about the Hyper engine, I highly recommend the following video at https://youtu.be/h2av4CX0k6s.

Hyper parallelizes three steps of traditional data warehousing operations:

Transactions and Continuous Data Ingestion (Online Transaction Processing, or OLTP)
Analytics (Online Analytical Processing, or OLAP)
Beyond Relational (Online Beyond Relational Processing, or OBRP)

Executing those steps simultaneously makes Hyper more efficient and more performant, as opposed to traditional systems where those three steps are separated and executed one after the other.

To sum up, Hyper is a highly specialized database engine that allows us as users to get the best out of our queries. If you recall, in Chapter 1, Reviewing the Basics, we already saw that every change on a sheet or dashboard, including dragging and dropping pills, applying filters, and creating calculated fields, among others, is translated into a query. Those queries are pretty much SQL lookalikes; however, in Tableau we call the querying engine VizQL.

VizQL, another hidden gem in Tableau Desktop, is responsible for visualizing data in a chart format and is fully executed in memory. The advantage is that no additional space on the database side is required. VizQL is generated when a user places a field on a shelf. It is then translated into SQL, MDX, or Tableau Query Language (TQL) and passed to the backend data source through a driver.
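To get a feeling for what such a translation might look like, consider a region dimension and a sales measure placed on the shelves. The following is a rough sketch and not real VizQL or Tableau’s actual translation logic: the helper `shelf_to_sql`, the `orders` table, and its columns are hypothetical, and SQLite stands in for the backend data source.

```python
import sqlite3


def shelf_to_sql(table, dimensions, measure, aggregation="SUM"):
    """Build a SQL query for one aggregated measure grouped by the given dimensions.

    Hypothetical helper for illustration; it expects at least one dimension.
    """
    dims = ", ".join(f'"{name}"' for name in dimensions)
    alias = f"{aggregation.lower()}_{measure}"
    return (
        f'SELECT {dims}, {aggregation}("{measure}") AS "{alias}" '
        f'FROM "{table}" GROUP BY {dims} ORDER BY {dims}'
    )


# An in-memory database plays the role of the backend data source.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (region TEXT, sales INTEGER)")
connection.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 100), ("West", 50), ("East", 25)],
)

# A dimension and a measure on the shelves become one grouped, aggregated query.
sql = shelf_to_sql("orders", ["region"], "sales")
result = connection.execute(sql).fetchall()
connection.close()

print(sql)
print(result)  # [('East', 125), ('West', 50)]
```

The dimension ends up in the GROUP BY clause and the measure is wrapped in an aggregation, which mirrors how a dimension and a measure behave on a sheet.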

Hyper takeaways

This overview of the Tableau data-handling engine demonstrates a flexible approach to interfacing with data. Knowledge of the data-handling engine can reduce data preparation and data modeling efforts, thus helping us streamline the overall data mining life cycle. Don’t worry too much about data types and data that can be calculated based on the fields you have in your database. Tableau can do all the work for you in this respect. In the next section, we will discuss what you should consider from a data source perspective.
