Apache Superset Quick Start Guide

By: Shashank Shekhar

Overview of this book

Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases such as Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards for your organization that run in modern web browsers. First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe. You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze it in new ways. You will also see how to analyze geographical regions by working with location data. Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.

Preface

Who this book is for

What this book covers

To get the most out of this book

Getting Started with Data Exploration

Installing Superset

Sharing Superset

Configuring Superset

Adding a database

Creating a visualization

Uploading a CSV

Configuring the table schema

Customizing the visualization

Making a dashboard

Configuring Superset and Using SQL Lab

Setting the web server

Creating the metadata database

Migrating data from SQLite to PostgreSQL

Setting up an NGINX reverse proxy

Setting up HTTPS or SSL certification

Flask-AppBuilder permissions

Securing session data

Caching queries

Mapbox access token

Long-running queries

Main configuration file

User Authentication and Permissions

Security features

Setting up OAuth Google sign-in

List Users page

List Base Permissions page

Views/Menus page

List Permissions on Views/Menus pages

Alpha and gamma – building blocks for custom roles

User Statistics page

Visualizing Data in a Column

Distribution – histogram

Comparison – relationship between feature values

Comparison – box plots for groups of feature values

Comparison – side-by-side visualization of two feature values

Summary statistics – headline

Comparing Feature Values

Comparing multiple time series

Comparing two time series

Identifying differences in trends for two feature values

Drawing Connections between Entity Columns

Directed force networks

Sankey's diagram

Mapping Data That Has Location Information

Building Dashboards

Other Books You May Enjoy

Leave a review - let other readers know what you think

Summary

That must have felt productive: we created a dashboard from scratch in Superset.

Before we summarize the chapter, it is worth discussing when Superset might not be the right visualization tool for a data analysis project.

Visualization of data requires data aggregation. A data aggregation function computes a summary statistic from the values of one or more table columns. A group by operation on a particular column creates groups of observations, and each group is then replaced with the summary statistics defined by the aggregation function. Superset provides many data aggregation functions; however, its usefulness is limited when a visualization requires hierarchical data aggregation.
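To make the group by step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from Superset itself; the `sales` table, its columns, and the sample rows are hypothetical and exist only for illustration. Each region's observations are replaced with a count, a sum, and an average.

```python
import sqlite3

# Hypothetical sample data: (region, product, amount) rows.
ROWS = [
    ("north", "widget", 10.0),
    ("north", "gadget", 30.0),
    ("south", "widget", 5.0),
    ("south", "widget", 15.0),
    ("south", "gadget", 40.0),
]


def aggregate_by_region(rows):
    """Group rows by region and replace each group with summary statistics."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table used only for this illustration.
        conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, COUNT(*), SUM(amount), AVG(amount) "
            "FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, count, total, mean in aggregate_by_region(ROWS):
        print(region, count, total, mean)
```

A chart built on this kind of result has one value per group, which is the flat, single-level aggregation that the paragraph above describes.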

Hierarchical data aggregation is the process of taking a large number of rows in a table and displaying summaries of partitions and their sub-partitions. For most visualization types, Superset does not offer this option.
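As a rough illustration of what hierarchical aggregation means, the sketch below computes a total for each partition and for each of its sub-partitions in plain Python. The rows, the `region` and `product` levels, and the `hierarchical_totals` helper are all hypothetical; this is simply one way to produce such a summary outside Superset.

```python
from collections import defaultdict

# Hypothetical sample data: (region, product, amount) rows.
ROWS = [
    ("north", "widget", 10.0),
    ("north", "gadget", 30.0),
    ("south", "widget", 5.0),
    ("south", "widget", 15.0),
    ("south", "gadget", 40.0),
]


def hierarchical_totals(rows):
    """Summarize each region (partition) and each product within it (sub-partition)."""
    summary = defaultdict(lambda: {"total": 0.0, "by_product": defaultdict(float)})
    for region, product, amount in rows:
        summary[region]["total"] += amount                # partition-level summary
        summary[region]["by_product"][product] += amount  # sub-partition summary
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {
        region: {"total": s["total"], "by_product": dict(s["by_product"])}
        for region, s in summary.items()
    }


if __name__ == "__main__":
    for region, s in hierarchical_totals(ROWS).items():
        print(region, s["total"])
        for product, subtotal in s["by_product"].items():
            print("  ", product, subtotal)
```

The nested result carries two levels of summary at once, which is the shape a hierarchical visualization would need as input.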

Also, Superset has limited customization options for the design and formatting of visualizations. It supports changes to color schemes and axis label formatting. Individuals or teams who want to tinker with and optimize the visual representation of their data will find Superset very limited for their needs.

Finally, it's time to summarize our achievements. We have been able to install Superset, add a database, create a dashboard, and share it with users. We are now ready to add more databases and tables, and to create new visualizations and dashboards. Exploring data and telling data stories with Superset dashboards is now part of your skill set!