Apache Superset Quick Start Guide

By: Shashank Shekhar

Overview of this book

Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases such as Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards for your organization that run in modern web browsers. First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe. You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze it in new ways. You will also see how to analyze geographical regions by working with location data. Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.

Installing Superset

Let's get started by setting up a Superset web app server. We will cover security, user roles, and permissions for the web app in the next chapter.

Instead of using a local machine, we can set up Superset in the cloud. This way, we can even share our Superset web app with authenticated users via an internet browser (for example, Firefox or Chrome).

We will be using Google Compute Engine (GCE) for the Superset server. Go to https://console.cloud.google.com to set up your account.

After you have set up your account, go to the URL https://console.cloud.google.com/apis/credentials/serviceaccountkey to download a file, `<project_id>.json`. Save this somewhere safe. This is the Google Cloud authorization JSON key file. We will copy the contents of this file to our GCE instance after we launch it. Superset uses the information in this file to authenticate itself to Google BigQuery.
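Before copying the key file to the instance, it can help to confirm that the download really is a service account key. The following is a minimal sketch, not part of Superset itself: the function name `check_key_file` and the chosen list of required fields are assumptions made for illustration, based on the fields a Google Cloud service account key file normally contains.

```python
import json

# Fields normally present in a Google Cloud service account key file.
# Assumption: this subset is enough for a quick sanity check.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return the project_id if the file looks like a service account key.

    Raises ValueError when a required field is missing or the key type
    is not 'service_account'.
    """
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)

    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))

    if key["type"] != "service_account":
        raise ValueError(
            "expected type 'service_account', got " + repr(key["type"])
        )

    return key["project_id"]
```

Calling `check_key_file` with the path to the downloaded `<project_id>.json` file should return the project ID stored in it. The check only inspects the file's structure; it does not contact Google or verify that the key is still valid.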

GCE instances are very easy to configure and launch. Anyone with a Google account can use them. After logging in to your Google account, use this URL: https://console.cloud.google.com/compute/instances. Here, launch a g1-small (1 vCPU, 1.7 GB memory) instance with default settings. When we have to set up Superset for a large number of concurrent users (more than five), we should choose instances with more compute power.
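The same launch can also be done from a terminal with the `gcloud` command-line tool instead of the web console. The sketch below only builds the command and prints it; it does not run anything. The instance name `superset-server`, the zone `us-central1-a`, the helper name `build_create_command`, and the choice of `n1-standard-2` as the larger machine type are all assumptions for illustration, and running the printed command requires an installed and authenticated `gcloud`.

```python
import shlex


def build_create_command(name, zone, concurrent_users):
    """Build a `gcloud compute instances create` command as an argument list.

    g1-small is used for up to five concurrent users, as suggested above.
    Assumption: n1-standard-2 stands in for "higher compute power"; pick
    whatever machine type suits your actual load.
    """
    machine_type = "g1-small" if concurrent_users <= 5 else "n1-standard-2"
    return [
        "gcloud", "compute", "instances", "create", name,
        "--machine-type=" + machine_type,
        "--zone=" + zone,
    ]


# Hypothetical instance name and zone, used only for this example.
command = build_create_command("superset-server", "us-central1-a", 3)
print(shlex.join(command))
```

Returning a list rather than a single string makes it straightforward to pass the command to `subprocess.run` later without worrying about shell quoting.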

After launching, on the VM instances screen we can see our g1-small GCE instance is up and running:

GCE dashboard on Google Cloud Platform