Tableau 2019.x Cookbook

By: Dmitry Anoshin, Teodora Matic, Slaven Bogdanovic, Tania Lincoln, Dmitrii Shirokov

Overview of this book

Tableau has been one of the most popular business intelligence solutions in recent years, thanks to its powerful and interactive data visualization capabilities. Tableau 2019.x Cookbook is full of useful recipes from industry experts that will help you master Tableau and learn each aspect of its ecosystem. The book covers features such as Tableau extracts, advanced calculations, geospatial analysis, and dashboard building. It guides you through data manipulation, storytelling, advanced filtering, expert visualization, and forecasting techniques using real-world examples. From the basic functionality of Tableau to complex deployment on Linux, you will cover it all. You will also learn advanced features of Tableau using R, Python, and various APIs, and how to prepare data for analysis using the latest Tableau Prep. In the concluding chapters, you will learn how Tableau fits into the modern world of analytics and works with modern data platforms such as Snowflake and Redshift, along with best practices for integrating Tableau with ETL using Matillion ETL. By the end of the book, you will be ready to tackle business intelligence challenges using Tableau's features.
Table of Contents (18 chapters)

Creating sample data

Next, we want to create a Hive external table on top of the logs stored in S3 and use EMR to compute the results. We can do this in any of the following three ways:

  • Using EMR CLI
  • Using EMR console
  • Using web GUI
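
Whichever method is used, the idea is the same: an external table stores only the schema in Hive, while the data stays in S3, so dropping the table does not delete the log files. The following is a minimal sketch of that pattern, not the DDL used in this recipe. The table name, column names, delimiter, and bucket path are all hypothetical placeholders:

```sql
-- Hypothetical example: every name and the S3 path below are placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS sample_logs (
  event_date    DATE,
  event_time    STRING,
  edge_location STRING,
  bytes_sent    BIGINT
)
-- Assumes tab-delimited text files; adjust to the actual log format.
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
-- LOCATION points at a directory (prefix), not at a single file.
LOCATION 's3://your-bucket/path/to/logs/';
```

Because the table is external, Hive reads whatever files sit under the LOCATION prefix at query time, so new log files placed there become visible without reloading the table.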

How to do it...

Which method you choose depends on your preference. In this example, I will use the EMR CLI. We should already be connected to the EMR cluster via SSH. Let's start working with Hive:

  1. In the EMR CLI, type hive to launch the Hive shell.
  2. Next, we can execute SQL commands. Let's create a table on top of the CloudFront logs that are stored in the S3 bucket by running the following DDL:
hive> CREATE EXTERNAL TABLE IF NOT EXISTS cloudfront_logs (
DateObject Date,
Time STRING,
Location STRING,
Bytes...