Mastering Julia - Second Edition

By: Malcolm Sherrington

Overview of this book

Julia is a well-constructed programming language designed for fast execution through just-in-time LLVM compilation, which eliminates the classic problem of performing analysis in one language and translating it for performance in a second. This book is a primer on Julia’s approach to a wide variety of topics such as scientific computing, statistics, machine learning, simulation, graphics, and distributed computing. Starting off with a refresher on installing and running Julia on different platforms, you’ll quickly get to grips with the core concepts and delve into a discussion on how to use Julia with various code editors and interactive development environments (IDEs). As you progress, you’ll work with data through simple statistics and analytics and discover Julia’s speed, its real strength, which makes it particularly useful in highly intensive computing tasks. You’ll also observe how Julia can cooperate with external processes to enhance graphics and data visualization. Finally, you will explore metaprogramming, learn how it adds great power to the language, and set up networking and distributed computing with Julia. By the end of this book, you’ll be confident in using Julia as part of your existing skill set.
Table of Contents (14 chapters)

Distributed data sources

The JuliaData group includes a package called JuliaDB, which was heralded as a package for working with large persistent datasets. However, its GitHub page carries a caveat: the package is now unmaintained and, at the time of writing, has been for over 2 years, which corresponds to version 1.4.x.

The GitHub page suggests using the DTables package instead, which is what we are going to do here.

To move away from some of the datasets we have used previously, I am going to look at some statistics from football (soccer for those in the US).

We will need a few packages, which can be declared in the Project.toml file for this chapter and loaded in the usual way:

julia> import Pkg; Pkg.activate(".")
julia> using Distributions, StatsBase, OnlineStats
julia> using DataFrames, DTables, Query, CSV, Printf
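With these packages loaded, a minimal sketch may help to show roughly how a DTable behaves. It assumes DTables and DataFrames are installed in the active project. The column names (team, goals_for, goals_against) and the values are hypothetical toy data invented for illustration; they are not taken from the chapter's files.

```julia
using DTables, DataFrames

# Hypothetical toy table (made-up values, not the chapter's dataset)
results = (team          = ["A", "B", "C", "D", "E"],
           goals_for     = [3, 1, 0, 2, 4],
           goals_against = [1, 1, 2, 0, 3])

# Split the table into partitions of at most 2 rows each
dt = DTable(results, 2)

# filter and map work row by row and return new DTables
winners = filter(r -> r.goals_for > r.goals_against, dt)
diffs   = map(r -> (team = r.team,
                    goal_diff = r.goals_for - r.goals_against), winners)

# Collect the partitions back into a single in-memory DataFrame
df = fetch(diffs, DataFrame)
```

The chunk size of 2 is chosen only so that this tiny table is split into several partitions; for real data, a much larger value would normally be appropriate.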

In the CSV folder of the DataSources directory are a couple of files referring...