Book Image

Clojure Data Analysis Cookbook - Second Edition

By : Eric Richard Rochester
Book Image

Clojure Data Analysis Cookbook - Second Edition

By: Eric Richard Rochester

Overview of this book

Table of Contents (19 chapters)
Clojure Data Analysis Cookbook Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Reading data from JDBC databases


Reading data from a relational database is only slightly more complicated than reading from Excel, and much of the extra complication involves connecting to the database.

Fortunately, there's a Clojure-contributed package that sits on top of JDBC (the Java database connector API, http://www.oracle.com/technetwork/java/javase/jdbc/index.html) and makes working with databases much easier. In this example, we'll load a table from an SQLite database (http://www.sqlite.org/), which stores the database in a single file.

Getting ready

First, list the dependencies in your Leiningen project.clj file. We will also need to include the database driver library. For this example, it is org.xerial/sqlite-jdbc:

(defproject getting-data "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [incanter "1.5.5"]
                 [org.clojure/java.jdbc "0.3.3"]
                 [org.xerial/sqlite-jdbc "3.7.15-M1"]])

Then, load the modules into your REPL or script file:

(require '[incanter.core :as i]
         '[clojure.java.jdbc :as j])

Finally, get the database connection information. I have my data in an SQLite database file named data/small-sample.sqlite, as shown in the following screenshot. You can download this from http://www.ericrochester.com/clj-data-analysis/data/small-sample.sqlite.

How to do it…

Loading the data is not complicated, but we'll make it easier with a wrapper function:

  1. We'll create a function that takes a database connection map and a table name and returns a dataset created from this table:

    (defn load-table-data
      "This loads the data from a database table."
      [db table-name]
      (i/to-dataset
      (j/query db (str "SELECT * FROM " table-name ";"))))
  2. Next, we define a database map with the connection parameters suitable for our database:

    (defdb {:subprotocol "sqlite"
             :subname "data/small-sample.sqlite"
             :classname "org.sqlite.JDBC"})
  3. Finally, call load-table-data with db and a table name as a symbol or string:

    user=> (load-table-data db 'people)
    
    |   :relation | :surname | :given_name |
    |-------------+----------+-------------|
    |      father |   Addams |       Gomez |
    |      mother |   Addams |    Morticia |
    |     brother |   Addams |     Pugsley |||
    …

How it works…

The load-table-data function passes the database connection information directly through to clojure.java.jdbc/query.query. It creates an SQL query that returns all of the fields in the table that is passed in. Each row of the result is a sequence of hashes mapping column names to data values. This sequence is wrapped in a dataset by incanter.core/to-dataset.

See also

Connecting to different database systems using JDBC isn't necessarily a difficult task, but it's dependent on which database you wish to connect to. Oracle has a tutorial for how to work with JDBC at http://docs.oracle.com/javase/tutorial/jdbc/basics, and the documentation for the clojure.java.jdbc library has some good information too (http://clojure.github.com/java.jdbc/). If you're trying to find out what the connection string looks like for a database system, there are lists available online. The list at http://www.java2s.com/Tutorial/Java/0340__Database/AListofJDBCDriversconnectionstringdrivername.htm includes the major drivers.