Julia 1.0 Programming Complete Reference Guide

By: Ivo Balbaert, Adrian Salceanu

Overview of this book

Julia offers the high productivity and ease of use of Python and R with the lightning-fast speed of C++. There’s never been a better time to learn this language, thanks to its large-scale adoption across a wide range of domains, including fintech, biotech, and artificial intelligence (AI).

You will begin by learning how to set up a running Julia platform, before exploring its various built-in types. This Learning Path walks you through two important collection types: arrays and matrices. You’ll be taken through how type conversions and promotions work, and in further chapters you’ll study how Julia interacts with operating systems and other languages. You’ll also learn about the use of macros, what makes Julia suitable for numerical and scientific computing, and how to run external programs.

Once you have grasped the basics, this Learning Path goes on to show how to analyze the Iris dataset using DataFrames. While building a web scraper and a web app, you’ll explore the use of functions, methods, and multiple dispatch. In the final chapters, you’ll delve into machine learning, where you’ll build a book recommender system. By the end of this Learning Path, you’ll be well versed in Julia and have the skills you need to leverage its high speed and efficiency for your applications.

This Learning Path includes content from the following Packt products:
• Julia 1.0 Programming - Second Edition by Ivo Balbaert
• Julia Programming Projects by Adrian Salceanu
Table of Contents (18 chapters)

Using simple statistics to better understand our data

Now that it's clear how the data is structured and what is contained in the collection, we can get a better understanding by looking at some basic stats.

To get us started, let's invoke the describe function:

julia> describe(iris)

The output is as follows:

This function summarizes the columns of the iris DataFrame, producing one row of output per column. If a column contains numerical data (such as SepalLength), it will compute the minimum, median, mean, and maximum. The number of missing and unique values is also included. The last column of the output reports the type of data stored in each column.
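To make the summary less of a black box, here is a minimal sketch of the same kind of statistics computed by hand with Julia's Statistics standard library. The vectors sepal_length and species are small, made-up stand-ins for two columns (they are not the real iris values), and the variable names are only illustrative; describe itself may compute or present these figures differently.

```julia
using Statistics  # standard library: mean, median

# Hypothetical values standing in for a numeric column such as SepalLength
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4]

# Hypothetical values standing in for a categorical column
species = ["setosa", "setosa", "versicolor"]

col_min    = minimum(sepal_length)            # smallest value
col_median = median(sepal_length)             # middle of the sorted values
col_mean   = mean(sepal_length)               # arithmetic average
col_max    = maximum(sepal_length)            # largest value
n_missing  = count(ismissing, sepal_length)   # how many entries are `missing`
col_type   = eltype(sepal_length)             # element type of the column
n_unique   = length(unique(species))          # number of distinct values

println((col_min, col_median, col_mean, col_max, n_missing, col_type, n_unique))
```

Computing a statistic directly like this can be handy when only one or two figures are needed rather than the whole summary table.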

A few other stats are available, including the 25th and the 75th percentile, and the first and the last values. We can ask for them by passing an extra stats argument, in the form of an array of symbols:

julia> describe(iris, stats=[:q25, :q75, :first, :last]) 
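As a rough illustration of what these four extra statistics mean, the sketch below computes them for a single vector using the Statistics standard library. The sepal_length values are made up for the example, and it is an assumption that :q25 and :q75 correspond to quantile's default interpolation method.

```julia
using Statistics  # standard library: quantile

# Hypothetical values standing in for a numeric column such as SepalLength
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4]

q25         = quantile(sepal_length, 0.25)  # 25th percentile
q75         = quantile(sepal_length, 0.75)  # 75th percentile
first_value = first(sepal_length)           # first element, in stored order
last_value  = last(sepal_length)            # last element, in stored order

println((q25, q75, first_value, last_value))
```

Note that the percentiles depend on the sorted values, while first and last simply reflect the order in which the rows are stored. Depending on the installed DataFrames version, the statistics may need to be passed as positional symbols, for example describe(iris, :q25, :q75, :first, :last), rather than through the stats keyword.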

The...