Redash v5 Quick Start Guide

By: Alexander Leibzon, Yael Leibzon

Overview of this book

Data exploration and visualization are vital to Business Intelligence, the backbone of almost every enterprise or organization. Redash is a querying and visualization tool developed to simplify how marketing and business development departments are exposed to data. If you want to learn to create interactive dashboards with Redash, explore different visualizations, and share the insights with your peers, then this is the ideal book for you. The book starts with essential Business Intelligence concepts that are at the heart of data visualizations. You will learn how to find your way around Redash and its rich array of data visualization options for building interactive dashboards. You will learn how to tell stories with data and share them with peers. You will see how to connect to different data sources to process complex data, and then visualize this data to reveal valuable insights. By the end of this book, you will be confident using the Redash dashboarding tool to provide insight and communicate data stories.
Table of Contents (10 chapters)

Meeting Redash

Back in 2013, a company named EverythingMe was facing all the preceding challenges and yearned for a tool that would have an ideal set of features and fit in with our well-established data-driven culture.

After trying several legacy BI suites, a decision was made to create an easier, more collaborative, and faster tool, with JSFiddle as inspiration.

These conditions stimulated the creation of Redash to target those requirements.

Redash was created during a hackathon by Arik Fraimovich, who then became the founder and lead developer of Redash.

While initially built to allow rapidly querying and visualizing data from Amazon Redshift (hence the name Re:Dash = Redshift + Dashboard), Redash quickly grew to become the company’s main data analysis, visualization, and dashboarding tool, serving all of the departments in the company.

More Data Sources and visualization types were added, people started to contribute to the source code, and eventually Redash was released as an open source tool. It later developed into a separate independent company, with Redash as its main product. Its main goal was to help other companies become more data-driven with little to no effort, just as we did back in the day at EverythingMe.

What exactly is Redash?

In this section, we will go over the key features of Redash to understand its possibilities and how it can fit into various departments within the company:

  • Redash is an open source tool that is used to create, visualize, and share queries and dashboards.
  • It works in the browser, so there's no need to install anything on a user’s computer: just click the link and log in.
  • It's easy to set up and can provide any team member with the immediate power to analyze data.
  • Redash is very easy and intuitive to use.

Even if a team member is not familiar with SQL syntax, they can use query parameters, which are easy to modify, to get the desired results. Alternatively, they can fork (duplicate, exactly as you would on GitHub) an existing query or visualization and modify it according to their needs.

Both query parameters and forking work well as a quick introduction to the Redash world.

  • Redash allows you to share and embed queries/visualizations/dashboards, which is as easy as sending a URL.
  • Redash supports many Data Sources. Whether it's an RDBMS, a Big Data or NoSQL store, or a REST API, you’ve got them all (a full list and further details are available in Chapter 4, Connecting to Data Sources). You can even define a query result as a separate Data Source and use it later in other queries.
  • There are a handful of various visualizations, so everyone in the company will find one that suits their mission best.
  • Visualizations can be exported as PNGs, PDFs, and so on.
  • Data can be exported as CSV/JSON and Excel.
  • Redash includes query scheduling and an auto refresh mechanism.
  • Redash provides you with an alerting mechanism, where you can define alerts (for example, if the new daily user numbers are below a certain threshold), and then get notifications about it via email/chat/a custom defined webhook.
  • Redash provides live auto complete in the query editor and keyboard shortcuts.
  • Redash offers automatic schema discovery for all Data Sources.
  • Results are cached for minimal running times and rapid response. Results from the same query are reused; there is no resource wasting and needless query execution!
  • For user management, there's SSO, access control, and many other great features for enterprise-friendly workflows.
  • Regarding the API, Redash provides a REST API that allows you to access all of its features programmatically, as well as pass dynamic parameters to queries. This can be used to extend functionality and tailor it to your own department's specific needs.

One example of this is data export for external clients, such as sending an automatic daily revenue report. Another great example of API usage is Slack chatbot integration, which allows you to easily bring data into team conversations. This is self-service at its best: any team member can fetch data insights from within a chat window, with no coding required and no need to open tickets with the BI team. Just type your request inside the chat and get the results!
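
As a rough illustration of the API bullet above, the following Python sketch builds a URL for fetching the results of a saved query with dynamic parameters. It assumes the /api/queries/<id>/results.json endpoint, an api_key query-string argument, and the p_ prefix for parameter names; the host, query ID, API key, and parameter name are made-up placeholders, so check the API documentation for your Redash version before relying on it.

```python
from urllib.parse import urlencode


def build_results_url(base_url, query_id, api_key, params=None, fmt="json"):
    """Build a URL that fetches the results of a saved query.

    Assumption: dynamic query parameters are passed in the query string
    with a "p_" prefix (for example, p_country=US).
    """
    query = {"api_key": api_key}
    for name, value in (params or {}).items():
        query["p_" + name] = value
    return "{}/api/queries/{}/results.{}?{}".format(
        base_url.rstrip("/"), query_id, fmt, urlencode(query)
    )


# Hypothetical values, for illustration only.
url = build_results_url(
    "https://redash.example.com", 42, "YOUR_API_KEY", {"country": "US"}
)
print(url)
```

The resulting URL could then be requested by a scheduled script (for a daily report) or by a chatbot handler; the sketch only builds the URL and deliberately performs no network call.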

  • In addition to the API, Redash is open source, which means that you can extend any part you want (this will be covered in Chapter 8, Customizing Redash).

What if you need a different visualization type? You got it! You need a new datasource? This is a piece of cake. You need a new alert or a new API call? Everything is at your fingertips.

  • With over 200 contributors and over 10,000 stars on GitHub, Redash has got a strong and vibrant community, and the project constantly evolves.

In summary, Redash improves a company's decision-making process and makes it more transparent to peers, based on the easy and speedy creation of deep-dive dashboards over the company's Data Sources.

This speed comes from several aspects:

  • There's no entry level barrier, as literally every team member can log in to Redash and start getting insights.
  • There are suitable solutions across all the verticals and departments within the company, which eliminates the need to wait for other departments to get the data ready. This is tailored to your needs.
  • Self-service is king, as you can fork the queries and modify them as required. You are in full control of your data and visualizations.
  • You can query all the possible Data Sources from a single place, and join multiple Data Sources into a single dashboard.
  • You can create crucial business alerts that keep you posted about events as they happen.
  • Results are cached for faster retrieval, which avoids generating needless load on your Data Sources.
  • Instant sharing! Just send the link and you're set.
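
To make the caching point above more concrete, here is a toy Python sketch of reusing a result while it is still fresh. It is not Redash's actual implementation: the ResultCache class, the max_age_seconds setting, and the fake data source are all hypothetical names invented for this illustration.

```python
import time


class ResultCache:
    """Toy in-memory result cache keyed by query text (illustration only)."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock
        self._store = {}  # query text -> (timestamp, result)

    def get_or_run(self, query, run_query):
        now = self.clock()
        hit = self._store.get(query)
        if hit is not None and now - hit[0] <= self.max_age:
            return hit[1]  # fresh enough: reuse it, do not touch the data source
        result = run_query(query)  # missing or stale: execute the query again
        self._store[query] = (now, result)
        return result


calls = []  # records every time the (fake) data source is actually queried


def fake_data_source(query):
    calls.append(query)
    return [("rows for", query)]


cache = ResultCache(max_age_seconds=60)
first = cache.get_or_run("SELECT 1", fake_data_source)
second = cache.get_or_run("SELECT 1", fake_data_source)
print(len(calls))  # the data source was queried only once
```

The second call returns the stored result without running the query, which is the behavior the bullet describes: repeated requests for the same query do not add load on the Data Source.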

Redash comes in two options: a hosted version (with monthly subscription plans) or a self-hosted open source version (free, but you get to maintain it yourself).

The hosted version is suitable for companies who don't want the hassle of hosting and managing the Redash service (and the surrounding components such as Redis/PostgreSQL/Celery) by themselves (usually, this requires at least one or two dedicated employees, and not everyone has them available immediately).

The self-hosted version suits you best when you have the necessary resources (both machine and human) and:

  • You want to extend Redash to your own specific needs
  • You want to contribute to the open source community
  • You want full control over your own data

The hosted and self-hosted versions are identical, and you can always switch back and forth.

It is recommended, however, to try the self-hosted Redash in development mode at the very least, as this way you can gain a better understanding of the internals of the tool that is about to change your company's data culture!

This book covers only the self-hosted version of Redash (v5), but all the covered topics apply equally to the hosted version. Most of the topics are also fully backward compatible with versions older than v5; please refer to the official Redash website to check the differences.

Redash architecture

Redash is a single-page web app, with a JavaScript frontend and a Python backend.

The UI (frontend) is responsible for all the visualizations, dashboards, and the query editor, and it is the part that a regular user interacts with the most. It was originally written in AngularJS and, since v5, has been in transition to React.

The server (backend) is a Flask App, which uses the Celery Distributed Task Queue as its task worker engine (Celery workers are responsible for query execution).

The server handles the actual query execution requests on various Data Sources, such as dashboard refresh requests, both from the frontend and from API calls (for example, Slack bots, advanced users' webhooks, and so on).

The PostgreSQL database is used to store all the necessary application metadata and configurations (users/groups/datasource definitions/queries/dashboards).

Redis, an in-memory datastore, serves as the Celery message broker (Celery requires a message broker service to send and receive messages).
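
The division of labor described above (the server accepts a request, a broker holds the message, and a worker executes the query) can be sketched with the Python standard library alone. This is only an analogy under stated assumptions: queue.Queue stands in for the Redis broker, a thread stands in for a Celery worker, and every function and variable name here is hypothetical rather than taken from the Redash code base.

```python
import queue
import threading

broker = queue.Queue()  # stands in for the Redis message broker
results = {}            # stands in for stored query results


def worker():
    """Stands in for a Celery worker: pull a job, run it, store the result."""
    while True:
        job = broker.get()
        if job is None:  # sentinel value: shut the worker down
            broker.task_done()
            break
        job_id, query = job
        # A real worker would run the query against a Data Source here.
        results[job_id] = "executed: " + query
        broker.task_done()


def enqueue_query(job_id, query):
    """Stands in for the server: accept a request and hand it to the broker."""
    broker.put((job_id, query))


thread = threading.Thread(target=worker)
thread.start()
enqueue_query(1, "SELECT count(*) FROM users")
enqueue_query(2, "SELECT 1")
broker.put(None)  # ask the worker to stop once the jobs are done
thread.join()
print(results)
```

The point of the pattern is that the component receiving requests never runs queries itself, so a slow query occupies a worker rather than blocking the web application.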