Julia 1.0 Programming Cookbook

By: Bogumił Kamiński, Przemysław Szufel

Overview of this book

Julia, with its dynamic nature and high performance, shortens the time needed to develop computational models while keeping the code easy to maintain. This book will be your solution-based guide, taking you through different aspects of programming with Julia. Starting with the new features of Julia 1.0, each recipe addresses a specific problem, providing a solution and explaining how it works. You will work with the powerful Julia tools and data structures along with the most popular Julia packages. You will learn to create vectors, handle variables, and work with functions. You will be introduced to various recipes for numerical computing, distributed computing, and achieving high performance. You will see how to optimize data science programs with parallel computing and memory allocation. We will look into more advanced concepts such as metaprogramming and functional programming. Finally, you will learn how to tackle issues while working with databases and data processing, and will learn about data science problems, data modeling, data analysis, data manipulation, parallel processing, and cloud computing with Julia. By the end of the book, you will have acquired the skills to work more effectively with your data.

Preface

The Julia programming language, with its dynamic nature and high performance, reduces the time needed to develop computational models while keeping the code easy to maintain. Julia 1.0 Programming Cookbook will be your solution-based guide and will take you through different aspects of programming with Julia.

Starting with the new features of Julia 1.0, each recipe addresses a specific problem, along with a discussion that explains the solution and offers insight into how it works. You will work with the powerful Julia tools and data structures, along with the most popular Julia packages. You will learn how to create vectors, handle variables, and work with functions. You will be introduced to various recipes for numerical computing, distributed computing, and achieving high performance. You'll see how to optimize data science programs with parallel computing and memory allocation. Moving forward, we will look into more advanced concepts, such as metaprogramming and functional programming. Finally, you will learn how to tackle issues while working with databases and data processing, and will learn about data science problems, data modeling, data analysis, data manipulation, parallel processing, and cloud computing with Julia.

By the end of the book, you will have the skills you need to work more effectively with your data.

Who this book is for

The target audience of this book is data scientists and programmers who want to improve their skills in working with the Julia programming language.

It is recommended that the reader has some experience with Julia or intermediate-level experience with other programming languages such as Python, R, or MATLAB.

What this book covers

Chapter 1, Installing and Setting Up Julia, introduces the use of the Julia command line and the setup of the entire Julia computational infrastructure, including building Julia, optimizing performance, and configuring Julia for the cloud.

Chapter 2, Data Structures and Algorithms, contains practical examples of how custom algorithms can be implemented, while also taking advantage of the built-in functionality.

Chapter 3, Data Engineering in Julia, explains that working with data requires a good understanding of streams and data sources. In this chapter, the reader will learn how to write data to IO streams with Julia and how to handle web transfers.

Chapter 4, Numerical Computing with Julia, contains recipes showing how computing tasks can be performed in the Julia language. Each recipe implements a relatively simple and standard algorithm to show a specific feature of the language, so the reader can concentrate on the implementation issues.

Chapter 5, Variables, Types, and Functions, presents topics related to variables and their scoping, the Julia type system, working with functions, and exception handling in Julia.

Chapter 6, Metaprogramming and Advanced Typing, presents various advanced programming topics in Julia.

Chapter 7, Handling Analytical Data, presents the DataFrames.jl package, which provides a rich set of functionalities for working with tabular data: manipulating rows and columns, handling categorical and missing data, and various standard transformations of tables (filtering, sorting, joins, wide-long transformation, and tabulation).

Chapter 8, Julia Workflow, explains the recommended workflow and shows how to build it using modules.

Chapter 9, Data Science, explains that Julia provides great support for various numerical and data science tasks. It allows us to define and optimize models in a very flexible solver-agnostic way. Julia also contains a huge toolbox for visualizing data and machine learning. 

Chapter 10, Distributed Computing, shows how to use Julia for parallel and distributed computing tasks. An important feature of Julia is the ability to scale computations across many processes and threads, up to distributed computational clusters.

To get the most out of this book

Some understanding of Julia would be a bonus.

In this book, we use many Julia packages. Here, we provide an installation script for those packages.

This script can also be found in the GitHub repository in the cookbookconf.jl file.

All packages are installed and pinned to a specific version, which ensures that they work correctly with the recipes given in the book. More information about managing packages can be found in the Managing Packages recipe in Chapter 1, Installing and Setting Up Julia.

If you do not pin the packages to the required version, they might still work, but newer versions of some packages may introduce incompatible API changes, in which case the recipe code might need small alterations to run.

We have divided packages into three groups:

  • Packages that do not depend on external software
  • Packages that can optionally depend on an external Anaconda Python installation
  • Packages that require external software to be run

For each package group, we provide an installation script that installs exactly the same version that we have used in the book.

All the packages listed are installed by calling the following function:

using Pkg

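# spec is a named tuple with name and version fields. The package is installed
# in exactly that version and then pinned so that later updates do not change it.
function addandpin(spec)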
    Pkg.add(PackageSpec(; spec...))
    Pkg.pin(spec.name)
end
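
The (; spec...) syntax used in addandpin splats the fields of a named tuple into keyword arguments. A minimal sketch of that mechanism follows. It uses a hypothetical stand-in function, describe, in place of PackageSpec, so that it runs without touching the package manager; describe is an assumption made for illustration and is not part of the book's script.

```julia
# Hypothetical stand-in for PackageSpec: it only echoes its keyword arguments.
describe(; name, version) = "$name@$version"

spec = (name="StatsBase", version="0.26.0")

# Splat the named tuple into keyword arguments, as addandpin does.
println(describe(; spec...))

# Field access, as used in Pkg.pin(spec.name).
println(spec.name)
```

Because each entry in the package lists carries exactly the name and version fields, the same named tuple can serve both calls inside addandpin.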

Packages that do not depend on external software can be installed with the following commands:

pkg1 = [(name="StatsBase", version="0.26.0"),
        (name="TimeZones", version="0.8.1"),
        (name="BSON", version="0.2.1"),
        (name="Revise", version="0.7.12"),
        (name="Distributions", version="0.16.4"),
        (name="Clp", version="0.5.0"),
        (name="HTTP", version="0.7.1"),
        (name="Gumbo", version="0.5.1"),
        (name="StringEncodings", version="0.3.1"),
        (name="ZMQ", version="1.0.0"),
        (name="CodecZlib", version="0.5.0"),
        (name="JSON", version="0.19.0"),
        (name="BenchmarkTools", version="0.4.1"),
        (name="JuliaWebAPI", version="0.5.0"),
        (name="FileIO", version="1.0.2"),
        (name="ProfileView", version="0.4.0"),
        (name="StaticArrays", version="0.8.3"),
        (name="ForwardDiff", version="0.9.0"),
        (name="Optim", version="0.17.1"),
        (name="JuMP", version="0.18.4"),
        (name="JLD2", version="0.1.2"),
        (name="XLSX", version="0.4.2"),
        (name="Cbc", version="0.4.2"),
        (name="DataFrames", version="0.14.1"),
        (name="CSV", version="0.4.3"),
        (name="DataFramesMeta", version="0.4.0"),
        (name="Feather", version="0.5.0"),
        (name="FreqTables", version="0.3.0"),
        (name="OnlineStats", version="0.19.1"),
        (name="MySQL", version="0.7.0"),
        (name="Cascadia", version="0.4.0"),
        (name="UnicodePlots", version="0.3.1"),
        (name="ParallelDataTransfer", version="0.5.0")]

foreach(addandpin, pkg1)

Packages that can optionally depend on an external Python Anaconda installation (refer to the Calling Python from Julia recipe in Chapter 8, Julia Workflow, for details) can be installed with the following commands:

pkg2 = [(name="Conda", version="1.0.2"),
        (name="PyCall", version="1.18.4"),
        (name="PyPlot", version="2.6.3"),
        (name="Plots", version="0.20.5"),
        (name="StatPlots", version="0.8.1")]

foreach(addandpin, pkg2)

Some packages require external software to be installed. This includes the RCall.jl, JDBC.jl, LibPQ.jl, and Gurobi.jl packages. Before installing those packages, make sure that the required software is installed on your system. For the RCall.jl package, check the Calling R from Julia recipe in Chapter 8, Julia Workflow. For JDBC.jl and LibPQ.jl, check the Working with databases in Julia recipe, and for Gurobi.jl, check the Optimization using JuMP recipe; both recipes can be found in Chapter 9, Data Science. Once you have made sure that all software dependencies are installed, you can use the following commands:

pkg3 = [(name="RCall", version="0.12.1"),
        (name="JDBC", version="0.4.0"),
        (name="LibPQ", version="0.5.0"),
        (name="Gurobi", version="0.5.3")]

foreach(addandpin, pkg3)
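
Before running an installation script against the network, it can be useful to sanity-check a spec list locally. The following is a minimal sketch under stated assumptions: checkspecs is a hypothetical helper, not part of the book's script, and it only verifies that names are unique and that every version string parses as a VersionNumber. It does not check that the versions exist in the package registry.

```julia
# Hypothetical helper (not part of the book's script): validate a list of
# package specs without contacting the package registry.
function checkspecs(specs)
    names = [s.name for s in specs]
    allunique(names) || error("duplicate package names in spec list")
    # VersionNumber throws an ArgumentError if the version string is malformed.
    return [s.name => VersionNumber(s.version) for s in specs]
end

pkg3 = [(name="RCall", version="0.12.1"),
        (name="JDBC", version="0.4.0"),
        (name="LibPQ", version="0.5.0"),
        (name="Gurobi", version="0.5.3")]

# Print each package name next to its parsed version.
for (name, version) in checkspecs(pkg3)
    println(rpad(name, 8), version)
end
```

The same check can be applied to the pkg1 and pkg2 lists before calling foreach(addandpin, ...).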

Download the example code files

The example code files are organized in folders representing chapters and recipes. For each recipe, there is a commands.txt file that contains the commands that should be typed in by the reader. Every entry in this file is prepended with the appropriate prompt (for example, $ or julia>) so that the reader knows in which environment the command should be executed (typically the OS shell or the Julia command line). Most recipes also contain additional files, for example, the source code of Julia programs. A full list of files along with their contents is given in the Getting ready section of every recipe.
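
To illustrate the prompt convention, here is a minimal sketch of how a line from a commands.txt file could be split into its environment and the text to type. The classify function is hypothetical and assumes that each prompt is followed by a single space and that any line without a prompt is output; the actual files may not follow these assumptions exactly.

```julia
# Hypothetical helper: split a line from a commands.txt file into the
# environment it belongs to and the text to type.
function classify(line)
    if startswith(line, "julia> ")
        return (:julia, line[8:end])    # drop the 7-character Julia prompt
    elseif startswith(line, "\$ ")
        return (:shell, line[3:end])    # drop the 2-character shell prompt
    else
        return (:output, line)          # anything else is treated as output
    end
end

for line in ["\$ julia --banner=no", "julia> 1+2", "3"]
    println(classify(line))
end
```

A bare prompt with nothing after it (for example, julia> at the end of a session) falls into the output branch in this sketch, since there is no command to extract.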

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Julia-1.0-Programming-Cookbook. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it from https://www.packtpub.com/sites/default/files/downloads/9781788998369_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, path names, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."

A block of code is set as follows:

html, body, #map {
 height: 100%; 
 margin: 0;
 padding: 0
}

For each block of code, we explain how it should be used (for example, maybe it should be pasted into a file or executed on the console).

Any command-line input is written in bold and the command-line output is written in normal font:

julia> collect(1:5)
5-element Array{Int64,1}:
 1
 2
 3
 4
 5

julia> sin(1)
0.8414709848078965

julia>

Code that is executed in an OS shell (for Linux or Windows) is indicated with the $ sign (for Windows, this will be C:\). You should type the commands that follow this sign (without the sign itself). For example, this command gives information about the files in the current working directory:

$ ls

All single commands passed to the Julia command line are prepended with the julia> prompt marker. For example, this is a minimal Julia session:

$ julia --banner=no
julia> 1+2
3

julia> exit()

$

We started Julia from the OS shell using the julia command. Then, we entered 1+2 in the Julia command line and Julia printed 3. Finally, we entered exit() in Julia to terminate the Julia command-line session and go back to the shell (so we have the shell $ prompt in the last line of the output). Please note that for the aforementioned blocks of code, separate instructions might be given, which might also include copying and pasting them to the console.

In several recipes, we also discuss non-standard prompts in the Julia command line (for example, package manager mode or shell mode). They are explained in the relevant recipes.

All examples in this book have been tested on Linux Ubuntu 18.04 LTS and Windows 10. Please note that users of other Linux distributions will need to update their scripts (for example, Linux distributions from the Red Hat family use yum instead of apt). Most Linux-related commands should also work on macOS; however, they have not been tested on this OS.

Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it..., How it works..., There's more..., and See also).

To give clear instructions on how to complete a recipe, use these sections as follows:

Getting ready

This section tells you what to expect in the recipe and describes how to set up any software or any preliminary settings required for the recipe.

How to do it...

This section contains the steps required to follow the recipe.

How it works...

This section usually consists of a detailed explanation of what happened in the previous section.

There's more...

This section consists of additional information about the recipe that deepens your understanding of it.

See also

This section provides helpful links to other useful information for the recipe.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.