Interactive Visualization and Plotting with Julia

By: Diego Javier Zea

Overview of this book

The Julia programming language offers a fresh perspective on the data visualization field. Interactive Visualization and Plotting with Julia begins by introducing the Julia language and the Plots package. The book then gives a quick overview of the Julia plotting ecosystem to help you choose the best library for your task. In particular, you will discover the many ways to create interactive visualizations with its packages. You’ll also leverage Pluto notebooks to gain interactivity and use them intensively throughout this book. You’ll find out how to create animations, a handy skill for communication and teaching. Then, the book shows how to solve data analysis problems using DataFrames and various plotting packages based on the grammar of graphics. Furthermore, you’ll discover how to create the most common statistical plots for data exploration. You’ll also learn to visualize geographically distributed data, graphs and networks, and biological data. Lastly, this book goes deeper into plot customizations with Plots, Makie, and Gadfly (focusing on Plots), teaching you to create plot themes, arrange multiple plots into a single figure, and build new plot types. By the end of this Julia book, you’ll be able to create interactive and publication-quality static plots for data analysis and exploration tasks using Julia.
Table of Contents (19 chapters)
Section 1 – Getting Started
Section 2 – Advanced Plot Types
Section 3 – Mastering Plot Customization

Drawing regression lines

Drawing regression lines is an excellent way to visualize the association between two non-independent variables. Gadfly and AlgebraOfGraphics offer easy ways to create such plots. There are two kinds of regression lines we can make with these packages. The first is a classical linear regression, which visualizes the linear association between two variables. The second is a local regression, usually computed with the locally estimated scatterplot smoothing (LOESS) method. LOESS performs polynomial regressions on subsets of the data points, so it has two important parameters: the bandwidth, or smoothing parameter, and the degree of the polynomials. The bandwidth determines the size of the data subsets: small bandwidth values use few neighboring points for each fit and therefore create regressions that are sensitive to local variations.
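
To make the role of the bandwidth and the polynomial degree more concrete, here is a minimal sketch of a local regression written in plain Julia. It is not the implementation that Gadfly or AlgebraOfGraphics use; the names local_fit, loess_curve, and mean_abs_gap, the tricube weighting, and the synthetic data are all assumptions made for illustration.

```julia
using LinearAlgebra  # standard library: Diagonal and least-squares `\`

# Tricube kernel (an assumed choice): weight 1 at the target point,
# falling to 0 at the edge of the local window.
tricube(u) = abs(u) < 1 ? (1 - abs(u)^3)^3 : 0.0

# Hypothetical helper: fit a weighted polynomial around `x0` and return its
# value at `x0`. `span` is the fraction of points in each subset (the
# bandwidth); `degree` is the degree of the local polynomial.
function local_fit(x, y, x0; span=0.75, degree=1)
    n = length(x)
    k = clamp(ceil(Int, span * n), degree + 2, n)  # subset size
    d = abs.(x .- x0)                              # distances to the target
    h = max(partialsort(d, k), eps())              # window half-width
    w = tricube.(d ./ h)                           # points outside get weight 0
    A = [(xi - x0)^p for xi in x, p in 0:degree]   # local design matrix
    sw = Diagonal(sqrt.(w))
    β = (sw * A) \ (sw * y)                        # weighted least squares
    return β[1]                                    # intercept = fit at x0
end

# Synthetic data: a slow trend plus a faster local wiggle.
x = collect(range(0, 2π; length=100))
y = sin.(x) .+ 0.3 .* sin.(4 .* x)

loess_curve(span) = [local_fit(x, y, x0; span=span) for x0 in x]
wiggly = loess_curve(0.1)  # small bandwidth: follows the local wiggle
flat = loess_curve(0.8)    # large bandwidth: keeps mostly the broad trend

# Average distance between each smoothed curve and the data.
mean_abs_gap(fit) = sum(abs.(fit .- y)) / length(y)
println("small bandwidth gap: ", mean_abs_gap(wiggly))
println("large bandwidth gap: ", mean_abs_gap(flat))
```

The curve computed with the small bandwidth stays closer to the data because each fit only sees a few neighboring points, while the large bandwidth averages the wiggle away.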

We can create these plots in Gadfly using Stat.smooth. This function takes a method keyword argument to select between a linear model, passing the :lm symbol...
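
As a hedged sketch of how this can look in practice, the following example draws both kinds of regression lines over the same scatter plot. It assumes the keyword names given in the Gadfly documentation for Stat.smooth (method, with :lm or :loess, and smoothing for the LOESS bandwidth); the data and the variable names are made up for illustration.

```julia
using Gadfly

# Synthetic data (an assumption for this example): a linear trend plus a wave.
x = collect(range(0, 10; length=60))
y = 0.5 .* x .+ sin.(x)

# Raw observations.
points = layer(x=x, y=y, Geom.point)

# Classical linear regression line.
linear = layer(x=x, y=y, Stat.smooth(method=:lm), Geom.line)

# Local (LOESS) regression; a smaller `smoothing` value follows local variations more closely.
loess = layer(x=x, y=y, Stat.smooth(method=:loess, smoothing=0.5), Geom.line)

p = plot(points, linear, loess)
```

Each layer carries its own statistic, so the points are drawn as they are while the two line layers show the fitted values.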