Book Image

Julia Cookbook

By : Raj R Jalem, Jalem Raj Rohit
Book Image

Julia Cookbook

By: Raj R Jalem, Jalem Raj Rohit

Overview of this book

Want to handle everything that Julia can throw at you and get the most of it every day? This practical guide to programming with Julia for performing numerical computation will make you more productive and able work with data more efficiently. The book starts with the main features of Julia to help you quickly refresh your knowledge of functions, modules, and arrays. We’ll also show you how to utilize the Julia language to identify, retrieve, and transform data sets so you can perform data analysis and data manipulation. Later on, you’ll see how to optimize data science programs with parallel computing and memory allocation. You’ll get familiar with the concepts of package development and networking to solve numerical problems using the Julia platform. This book includes recipes on identifying and classifying data science problems, data modelling, data analysis, data manipulation, meta-programming, multidimensional arrays, and parallel computing. By the end of the book, you will acquire the skills to work more effectively with your data.
Table of Contents (12 chapters)

Working with databases in Julia


In this section, we will explain ways to handle data stored in databases: MySQL and PostgreSQL.

Getting ready

MySQL is an open source relational database. To be able to interact with your MySQL databases from Julia, the database server (along with the relevant Julia package) needs to be installed. Assuming that the database is already set up and the MySQL session is already up and running, install the MySQL bindings for Julia by directly cloning the repository:

Pkg.clone("https://github.com/JuliaComputing/MySQL.jl")

PostgreSQL is an open source object relational database. Similar to the MySQL setup, the server of the PostgreSQL database should be up and running with a session. Now, install the PostgreSQL bindings for Julia by following the given instructions:

  1. Install the DBI package. The DBI package is a database-independent API that complies with almost all database drivers.

  2. The DBI package from Julia can be installed by directly cloning it from its repository using the following statement:

    Pkg.clone("https://github.com/JuliaDB/DBI.jl")
    
  3. Then, install the PostgreSQL library by directly cloning the library's repository using the following statement:

    Pkg.clone("https://github.com/JuliaDB/PostgreSQL.jl")
    
  4. SQLite is a light, server-less, self-contained, transactional SQL database engine. To interact with data in SQLite databases, one has to first install the SQLite server and make sure that it is up and running and displaying a prompt like this:

  5. Now, the SQLite bindings for Julia can be installed through the following steps:

    1. Add the SQLite Julia package by running the following standard package installation command:

      Pkg.add("SQLite")
      

How to do it...

Here, you will learn about connecting to databases and executing queries to manipulate and analyze data. You will also learn about the various protocols and libraries in Julia that will help you interact with databases.

MySQL

A MySQL database can be connected by a simple command that takes in the host, username, password, and database name as parameters. Let's take a look at the following steps:

  1. First, import the MySQL package:

    using MySQL
    
  2. Set up the connection to a MySQL database by including all the required parameters to establish a connection:

    conn = mysql_connect(host, user_name, password, dbname)
    
  3. Now, let's write and run a basic table creation query:

    1. Assign the query statement to a variable.

      query = """ CREATE TABLE Student
                       (
                           ID INT NOT NULL AUTO_INCREMENT,
                           Name VARCHAR(255),
                           Attendance FLOAT,
                           JoinDate DATE,
                           Enrolments INT,
                           PRIMARY KEY (ID)
                       );"""
      
    2. Now to make sure that the query is successfully created, we can get back the response from the connection.

      response = mysql_query(conn, query)
      
    3. Check for a successful connection through conditional statements:

      if (response == 0)
              println("Connection successful. Table created")
      else
          println("Connection failed. Table not created.")
      end
      
  4. Queries on the database can be executed by the execute_query() command, which takes the connection variable and the query as parameters. A sample SELECT query can be executed through the following steps:

    query = """SELECT * FROM Student;"""
    data = execute_query(conn, query)
    
  5. To get the query results in the form of a Julia array, an extra parameter called opformat should be specified:

    data_array = execute_query(conn, query, opformat = MYSQL_ARRAY)
    
  6. Finally, to execute multiple queries at once, use the mysql_execute_multi_query() command:

    query = """INSERT INTO Student (Name) VALUES ('');
    UPDATE Student SET JoinDate = '08-07-15' WHERE LENGTH(Name) > 5;"""
    rows = mysql_execute_multi_query(conn, query)
    println("Rows updated by the query: $rows")
    

PostgreSQL

Data handling within a PostgreSQL database can be done by connecting to the database. Firstly, make sure that the database server is up and running. Now, the data in the database can be handled through the following procedure:

  1. Firstly, import the requisite packages, which are the DBI and the PostgreSQL databases, using the import statements:

    using DBI
    using PostgreSQL
    
  2. In addition, the required packages for the PostgreSQL library are as follows:

    • DataFrames.jl: This has already been installed previously.

    • DataArrays.jl: This can be installed by running the statement Pkg.add("DataArrays")).

  3. Make a connection to a PostgreSQL database of your choice. It is done through the connect function, which takes in the type of database, the username, the password, the port number, and the database name as input parameters. So, the connection can be established using the following statement:

    conn = connect(Postgres, "localhost", "password", "testdb", 5432)
    
  4. If the connection is successful, a message similar to this appears on the screen:

    PostgreSQL.PostgresDatabaseHandle(Ptr{Void}
    
            @0x00007  fa8a559f160,0x00000000,false)
    
  5. Now, prepare the query and tag it to the connection we prepared in the previous step. This can be done using the prepare function, which takes the connection and the query as parameters. So, the execution statement looks something like this:

    query = prepare(conn,  "SELECT 1::int, 2.0::double precision, 
            'name'::character varying, " *  "'name'::character(20);"))
    
  6. As the query is prepared, let's now execute it, just like we did for MySQL. To do this, we have to enter the query variable, which we created in the previous step, into the execute function. It is done as follows:

    result = execute(query)
    
  7. Now that the query execution is over, the connection can be disconnected using the finish and disconnect functions, which take the query and the connection variables as the input parameters, respectively. The statements can be executed as follows:

    finish(query)
    disconnect(conn)
    
  8. Now, the results of the query are in the result variable, which can be used for analytics by either moulding it into a dataframe or any other data structure of your choice. The same method can be used for all operations on PostgreSQL databases, which include addition, updating, and deleting.

  9. This resource would help you better understand the Database-Independent API (DBI), which we use to connect local PostgreSQL databases such as SQLite.

  10. Import the SQLite package into the current session and ensure that the SQLite server is up and running. The package can be imported by running the following command:

    using SQLite
    
  11. Now, a connection to any database can be made through the SQLiteDB() function in Julia Version 3 and the SQLite.DB() function in Julia Version 4.

  12. The connection can be made in Julia version 4 as follows:

    db = SQLite.DB("dbname.sqlite")
    
  13. The connection can be made in Julia version 3 as follows:

    db = SQLiteDB("dbname.sqlite")
    
  14. Now, as the connection is made, queries can be executed using the query() function in Version 3 and the SQLite.query() function in Version 4.

    • In Version 3:

      query(db, "A SQL query")
      
    • In Version 4:

      SQLite.query(db, "A SQL query")
      

The SQLite.jl package also allows the user to use macros and registers for manipulating and using data. However, the concepts are beyond the scope of this chapter.

So, these are some of the ways through which data can be handled in Julia. There are a lot of databases whose connectors directly connect to DBI, such as SQlite, MySQL, and so on, and through which queries and their execution can be carried out, as shown in the PostgreSQL section. Similarly, data can be scraped from the Internet and used for analytics, which can be achieved through a combination of Julia libraries, but that is beyond the scope of this book.

There's more...

MySQL

The following resource helps you learn more about its advanced features and provides information about the MySQL.jl library of Julia. This includes performance benchmarks and details, as well as information on CRUD and testing:

https://github.com/JuliaDB/MySQL.jl

PostgreSQL

Visit https://github.com/JuliaDB/DBI.jl to understand better the DBI we use to connect local PostgreSQL databases:

Visit https://github.com/JuliaDB/DBI.jl for extended and in-depth documentation on the PostgreSQL.jl library, which includes dealing with Amazon web services, and so on.

SQLite

Now, as you have learned the ways in which data can be extracted, manipulated, and worked on from various external sources, there are some more interesting things that the database drivers of Julia can do apart from just executing queries. You can find those at https://github.com/JuliaDB/SQLite.jl/blob/master/OLD_README.md#custom-scalar-functions .