Mastering Tableau 2019.1 - Second Edition

By: Marleen Meier, David Baldwin

Overview of this book

Tableau is one of the leading business intelligence (BI) tools used to solve BI and analytics challenges. With this book, you will master Tableau's features and offerings in various paradigms of the BI domain. This book is the second edition of the popular Mastering Tableau series, with new features, examples, and updated code. It covers essential Tableau concepts and advanced functionality. Using Tableau Hyper and Tableau Prep, you’ll be able to handle and prepare data easily. You’ll gear up to perform complex joins, spatial joins, unions, and data blending tasks using practical examples. Following this, you’ll learn how to perform data densification to make displaying granular data easier. Next, you’ll explore expert-level examples to help you with advanced calculations, mapping, and visual design using various Tableau extensions. With the help of examples, you’ll also learn about improving dashboard performance, connecting to Tableau Server, and understanding data visualizations. In the final chapters, you’ll cover advanced use cases such as Self-Service Analytics, Time Series Analytics, and Geo-Spatial Analytics, and learn to connect Tableau to R, Python, and MATLAB. By the end of this book, you’ll have mastered the advanced offerings of Tableau and be able to tackle common and not-so-common challenges faced in the BI domain.
Table of Contents (20 chapters)
1
Section 1: Tableau Concepts, Basics
9
Section 2: Advanced Calculations, Mapping, Visualizations
16
Section 3: Connecting Tableau to R, Python, and Matlab

Massively parallel processing

Big data may be semi-structured or unstructured. The massively parallel processing (MPP) architecture structures big data to enable easy querying for reporting and analytic purposes. MPP systems are sometimes referred to as shared-nothing systems. This means that data is partitioned across many servers (otherwise known as nodes), and each server processes queries locally against its own partition.
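To make the shared-nothing idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular MPP product is implemented: the function names (`partition_rows`, `local_sum`, `run_query`), the sample `SALES` rows, and the choice of a CRC32 hash for partitioning are all assumptions made for this example. Rows are split across three simulated nodes, each node aggregates only its own partition, and the partial results are then combined.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, key, node_count):
    """Assign every row to exactly one node by hashing its partition key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        # CRC32 is used only because it is stable across runs (an assumption
        # for this sketch; real systems choose their own distribution scheme).
        index = zlib.crc32(str(row[key]).encode("utf-8")) % node_count
        nodes[index].append(row)
    return nodes


def local_sum(node_rows, group_key, value_key):
    """Work done on a single node: aggregate only the rows stored locally."""
    totals = Counter()
    for row in node_rows:
        totals[row[group_key]] += row[value_key]
    return totals


def run_query(nodes, group_key, value_key):
    """Send the same aggregation to every node in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(
            lambda rows: local_sum(rows, group_key, value_key), nodes
        )
        combined = Counter()
        for partial in partials:
            combined.update(partial)  # Counter.update adds the partial sums
    return dict(combined)


# Hypothetical sample data for the illustration.
SALES = [
    {"region": "East", "amount": 100},
    {"region": "West", "amount": 250},
    {"region": "East", "amount": 50},
    {"region": "North", "amount": 75},
    {"region": "West", "amount": 25},
]

NODES = partition_rows(SALES, "region", node_count=3)
RESULT = run_query(NODES, "region", "amount")
print(RESULT)
```

Because no node ever reads another node's rows, adding nodes spreads both storage and query work. The merge step at the end is the only place where results from different partitions meet.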

Let's explore MPP in detail using the following diagram as a point of reference:

The diagram can be explained as follows:

  • The process begins with the Client issuing a query, which is then passed to the Master Node.
  • The Master Node contains information, such as the data dictionary and session information, which it uses to generate an execution plan designed to retrieve the needed information from each underlying Node.
  • Parallel Execution represents the implementation...