Apache Superset Quick Start Guide

By: Shashank Shekhar

Overview of this book

Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases like Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real time data visualizations and dashboards on modern web browsers for your organization using Superset. First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe. You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze the dataset in new, different ways. You will also see how to analyze geographical regions by working with location data. Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Getting Started with Data Exploration

Datasets

Installing Superset

Sharing Superset

Configuring Superset

Adding a database

Adding a table

Creating a visualization

Uploading a CSV

Configuring the table schema

Customizing the visualization

Making a dashboard

Summary

Configuring Superset and Using SQL Lab

Setting the web server

Creating the metadata database

Migrating data from SQLite to PostgreSQL

Web server

Setting up an NGINX reverse proxy

Setting up HTTPS or SSL certification

Flask-AppBuilder permissions

Securing session data

Caching queries

Mapbox access token

Long-running queries

Main configuration file

SQL Lab

Summary

User Authentication and Permissions

Security features

Setting up OAuth Google sign-in

List Users page

List Base Permissions page

Views/Menus page

List Permissions on Views/Menus pages

Alpha and gamma – building blocks for custom roles

User Statistics page

Action log

Summary

Visualizing Data in a Column

Dataset

Distribution – histogram

Comparison – relationship between feature values

Comparison – box plots for groups of feature values

Comparison – side-by-side visualization of two feature values

Summary statistics – headline

Summary

Comparing Feature Values

Dataset

Comparing multiple time series

Comparing two time series

Identifying differences in trends for two feature values

Summary

Drawing Connections between Entity Columns

Datasets

Directed force networks

Chord diagrams

Sunburst chart

Sankey's diagram

Partitioning

Summary

Mapping Data That Has Location Information

Data

Scatter point

Scatter grid

Arcs

Path

Summary

Building Dashboards

Charts

Dashboards

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Summary

That must have felt productive: we built a dashboard in Superset from scratch.

Before we summarize what we have just finished in this chapter, it is important that we discuss when Superset might not be the right visualization tool for a data analysis project.

Visualizing data requires data aggregation. A data aggregation function computes a summary statistic from the values in one or more table columns. A group by operation on a particular column creates groups of observations, and each group is then replaced with the summary statistic defined by the aggregation function. Superset provides many data aggregation functions; however, its usability is limited when a visualization requires hierarchical data aggregation.
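To make the group by step concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns, and its rows are hypothetical, invented only for illustration; they are not a dataset from this book. The query is the kind of aggregation a BI tool issues behind a chart: one group per `region`, with each group replaced by its sum and row count.

```python
import sqlite3

# Hypothetical sales records (region, category, amount), for illustration only.
rows = [
    ("North", "Books", 120.0),
    ("North", "Games", 80.0),
    ("South", "Books", 50.0),
    ("South", "Games", 150.0),
    ("South", "Books", 100.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, category TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# GROUP BY forms one group of observations per region; SUM and COUNT
# replace each group with its summary statistics.
region_totals = conn.execute(
    "SELECT region, SUM(amount), COUNT(*) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, total, n_rows in region_totals:
    print(region, total, n_rows)
```

Five input rows collapse into two output rows, one per region, which is the shape most charts expect.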

Hierarchical data aggregation is the process of taking a large number of rows in a table and displaying summaries of partitions and their sub-partitions. This is not an option in Superset for most of the visualizations.
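As a rough illustration of what hierarchical aggregation produces, the following sketch computes a total for each partition and for each sub-partition inside it. The records and the helper name `hierarchical_totals` are hypothetical, and the code runs outside Superset; it only shows the shape of the result, not how any tool implements it.

```python
from collections import defaultdict

# Hypothetical sales records (region, category, amount), for illustration only.
rows = [
    ("North", "Books", 120.0),
    ("North", "Games", 80.0),
    ("South", "Books", 50.0),
    ("South", "Games", 150.0),
    ("South", "Books", 100.0),
]

def hierarchical_totals(records):
    """Return {region: {"total": sum, "children": {category: sum}}}."""
    tree = {}
    for region, category, amount in records:
        node = tree.setdefault(
            region, {"total": 0.0, "children": defaultdict(float)}
        )
        node["total"] += amount               # partition-level summary
        node["children"][category] += amount  # sub-partition summary
    return tree

summary = hierarchical_totals(rows)

# Print each partition followed by its indented sub-partitions.
for region in sorted(summary):
    print(region, summary[region]["total"])
    for category in sorted(summary[region]["children"]):
        print("   ", category, summary[region]["children"][category])
```

A flat group by returns only one of these levels per query; a hierarchical view needs both levels at once, which is the capability described above as missing from most Superset visualizations.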

Also, Superset has limited customization options for the design and formatting of visualizations. It supports changes to color schemes and axis label formatting. Individuals or teams who want to tinker with and optimize the visual representation of their data will find Superset very limited for their needs.

Finally, it's time to summarize our achievements. We have installed Superset, added a database, created a dashboard, and shared it with users. We are now ready to add more databases and tables, and to create new visualizations and dashboards. Exploring data and telling data stories with Superset dashboards is now part of your skill set!
