Haskell Data Analysis Cookbook

By : Nishant Shukla

Examining a JSON file with the aeson package


JavaScript Object Notation (JSON) is a way to represent key-value pairs in plain text. The format is described extensively in RFC 4627 (http://www.ietf.org/rfc/rfc4627).

In this recipe, we will parse a JSON description of a person. We often encounter JSON in the APIs of web applications.

Getting ready

Install the aeson library from Hackage using Cabal.

Prepare an input.json file representing data about a mathematician, such as the one in the following code snippet:

$ cat input.json

{"name":"Gauss", "nationality":"German", "born":1777, "died":1855}

We will be parsing this JSON and representing it as a usable data type in Haskell.

How to do it...

  1. Use the OverloadedStrings language extension so that string literals can stand for string types other than String, such as the key type that aeson expects, as shown in the following line of code:

    {-# LANGUAGE OverloadedStrings #-}
  2. Import aeson as well as some helper functions as follows:

    import Data.Aeson
    import Control.Applicative
    import qualified Data.ByteString.Lazy as B
  3. Create the data type corresponding to the JSON structure, as shown in the following code:

    data Mathematician = Mathematician 
                         { name :: String
                         , nationality :: String
                         , born :: Int
                         , died :: Maybe Int
                         } 
  4. Provide a FromJSON instance by implementing the parseJSON function, as shown in the following code snippet:

    instance FromJSON Mathematician where
      parseJSON (Object v) = Mathematician
                             <$> (v .: "name")
                             <*> (v .: "nationality")
                             <*> (v .: "born")
                             <*> (v .:? "died")
      -- Any JSON value that is not an object fails to parse.
      parseJSON _ = empty
  5. Define and implement main as follows:

    main :: IO ()
    main = do
  6. Read the input and decode the JSON, as shown in the following code snippet:

      input <- B.readFile "input.json"
    
      let mm = decode input :: Maybe Mathematician
    
      case mm of
        Nothing -> putStrLn "error parsing JSON"
        Just m -> (putStrLn.greet) m
  7. Define the greet function, which builds a sentence from the parsed record, as follows:

    greet :: Mathematician -> String
    greet m = (show.name) m ++ 
              " was born in the year " ++ 
              (show.born) m
  8. We can run the code to see the following output:

    $ runhaskell Main.hs
    
    "Gauss" was born in the year 1777
    

How it works...

Aeson takes care of the complications in representing JSON. It turns structured text into native Haskell values. In this recipe, we use the .: and .:? functions provided by the Data.Aeson module.

Aeson's object keys are not plain Strings (they are Text in older releases and a dedicated Key type since aeson 2.0), so it is very helpful to tell the compiler that characters between quotation marks should be treated as the appropriate data type. This is done in the first line of the code, which enables the OverloadedStrings language extension.
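To see what the extension does in isolation, here is a minimal sketch that uses only the base library. The Label type is hypothetical and merely stands in for a library string type; the point is that a literal takes whatever type the context asks for, provided that type has an IsString instance.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical wrapper type, standing in for a library string type.
newtype Label = Label String
  deriving (Eq, Show)

-- With OverloadedStrings, a literal "abc" means (fromString "abc"),
-- so any type with an IsString instance can be written as a literal.
instance IsString Label where
  fromString = Label

-- The literal below is a Label, not a String; the signature picks the type.
nameLabel :: Label
nameLabel = "name"

-- Ordinary Strings still work, because String has an IsString instance too.
plain :: String
plain = "name"

main :: IO ()
main = do
  print nameLabel
  putStrLn plain
```

Without the extension, the definition of nameLabel would be rejected, because a string literal would always have the type String.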

Tip

Language extensions such as OverloadedStrings are currently supported only by the Glasgow Haskell Compiler (GHC).

We use the decode function provided by Aeson to transform a lazy ByteString into a value of our data type. It has the type FromJSON a => B.ByteString -> Maybe a. Our Mathematician data type must have an instance of the FromJSON typeclass to use this function. Fortunately, the only function we need to implement for FromJSON is parseJSON. The syntax used in this recipe for implementing parseJSON is a little strange, but this is because we're leveraging applicative functors (the <$> and <*> operators), which are a more advanced Haskell topic.
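The applicative style may be easier to follow on a simpler type first. The following sketch uses Maybe in place of aeson's Parser and a list of key-value pairs in place of a JSON object; the Person type and the personFromPairs function are hypothetical names introduced only for this illustration.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record, used only for this illustration.
data Person = Person
  { personName :: String
  , personBorn :: Int
  } deriving (Eq, Show)

-- Build a Person from key-value pairs. Each field lookup may fail with
-- Nothing; <$> and <*> apply the Person constructor only if every field
-- is found, much as parseJSON in the recipe does with Parser.
personFromPairs :: [(String, String)] -> Maybe Person
personFromPairs kvs = Person
  <$> lookup "name" kvs
  <*> (lookup "born" kvs >>= readMaybe)  -- the year must also read as an Int

main :: IO ()
main = do
  -- Both fields present: a Person is built.
  print (personFromPairs [("name", "Gauss"), ("born", "1777")])
  -- "born" is missing: the whole result is Nothing.
  print (personFromPairs [("name", "Gauss")])
```

The shape is the same as in the recipe: a constructor on the left, one possibly failing lookup per field, and the operators threading the failure through.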

The .: function takes two arguments, an Object and a key (Text in older releases of aeson), and returns a value of type Parser a. As per the documentation, it retrieves the value associated with the given key of an object. Parsing fails if the key is absent or its value cannot be converted to the requested type. The .:? function also retrieves the value associated with the given key of an object, but the key is not required to exist: it returns a Parser (Maybe a) that yields Nothing when the key is missing or its value is null. So, we use .:? for optional key-value pairs in a JSON document.
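The difference between the two operators can be checked with a small experiment. This sketch assumes the aeson package is installed and a reasonably recent GHC; it writes the instance with withObject, another function exported by Data.Aeson, rather than matching on Object as the recipe does. The second and third JSON documents are made up for the example, and eitherDecode is used in addition to the recipe's decode to show why parsing failed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.ByteString.Lazy (ByteString)

data Mathematician = Mathematician
  { name :: String
  , nationality :: String
  , born :: Int
  , died :: Maybe Int
  } deriving (Eq, Show)

instance FromJSON Mathematician where
  parseJSON = withObject "Mathematician" $ \v -> Mathematician
    <$> (v .: "name")
    <*> (v .: "nationality")
    <*> (v .: "born")   -- required: parsing fails if the key is absent
    <*> (v .:? "died")  -- optional: an absent key yields Nothing

-- The document from the recipe, with every key present.
complete :: ByteString
complete = "{\"name\":\"Gauss\",\"nationality\":\"German\",\"born\":1777,\"died\":1855}"

-- Made-up document without the optional "died" key.
noDeath :: ByteString
noDeath = "{\"name\":\"Example\",\"nationality\":\"Unknown\",\"born\":1950}"

-- Made-up document without the required "born" key.
noBirth :: ByteString
noBirth = "{\"name\":\"Example\",\"nationality\":\"Unknown\"}"

main :: IO ()
main = do
  print (decode complete :: Maybe Mathematician)
  print (decode noDeath :: Maybe Mathematician)   -- succeeds, died is Nothing
  print (decode noBirth :: Maybe Mathematician)   -- fails, so prints Nothing
  -- eitherDecode returns a Left with an error message instead of Nothing.
  either putStrLn print (eitherDecode noBirth :: Either String Mathematician)
```

The exact wording of the error message depends on the aeson version, so it is not reproduced here.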

There's more…

If writing the FromJSON instance by hand is too involved, we can let GHC derive it automatically using the DeriveGeneric language extension. The following is a simpler rewrite of the code:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as B
import GHC.Generics

data Mathematician = Mathematician { name :: String
                                   , nationality :: String
                                   , born :: Int
                                   , died :: Maybe Int
                                   } deriving Generic

instance FromJSON Mathematician

main = do
  input <- B.readFile "input.json"
  let mm = decode input :: Maybe Mathematician
  case mm of
    Nothing -> putStrLn "error parsing JSON"
    Just m -> (putStrLn.greet) m
    
greet m = (show.name) m ++" was born in the year "++ (show.born) m

Although Aeson is powerful and generalizable, it may be overkill for some simple JSON interactions. Alternatively, if we wish to use a very minimal JSON parser and printer, we can use Yocto, which can be downloaded from http://hackage.haskell.org/package/yocto.