Haskell Data Analysis cookbook

By: Nishant Shukla

Overview of this book

Step-by-step recipes filled with practical code samples and engaging examples demonstrate Haskell in practice, and then the concepts behind the code. This book shows functional developers and analysts how to leverage their existing knowledge of Haskell specifically for high-quality data analysis. A good understanding of data sets and functional programming is assumed.

Understanding how to perform HTTP GET requests

The web is one of the richest sources of good data, and a GET request is the most common way to ask an HTTP web server for a page. In this recipe, we will grab all the links from a Wikipedia article and print them to the terminal. To do so, we will use a helpful library called HandsomeSoup, which lets us traverse and manipulate a web page through CSS selectors.

Getting ready

We will be collecting all the links from a Wikipedia web page. Make sure you have an Internet connection before running this recipe.

Install the HandsomeSoup CSS selector package, along with the HXT library if it is not already installed, using the following commands:

$ cabal install HandsomeSoup
$ cabal install hxt

How to do it...

  1. Import hxt for parsing HTML and HandsomeSoup for its easy-to-use CSS selectors, as shown in the following code snippet:
    import Text.XML.HXT.Core
    import Text.HandsomeSoup
  2. Define and implement main as follows:
    main :: IO ()
    main = do
  3. Pass in the URL as a string to HandsomeSoup's fromUrl function:
        let doc = fromUrl "http://en.wikipedia.org/wiki/Narwhal"
  4. Select all links within the element whose ID is bodyContent, extract the href attribute of each one, and print the resulting list as follows:
        links <- runX $ doc >>> css "#bodyContent a" ! "href"
        print links

How it works…

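The HandsomeSoup package lets us query an HTML document with CSS selectors. In this recipe, we run the #bodyContent a selector on a Wikipedia article web page. This matches every a (link) element that is a descendant of the element with the bodyContent ID. The ! "href" operator then extracts the href attribute of each match, and runX runs the whole pipeline and returns the results as a list of strings.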
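
To see what the selector matches without touching the network, the same pipeline can be run against an HTML string. The following is a minimal sketch, assuming HandsomeSoup's parseHtml function for parsing a string; the sampleHtml page and the hrefsMatching helper are hypothetical names made up for this illustration and are not part of the recipe:

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

-- Hypothetical sample page: one link outside #bodyContent, two inside it.
sampleHtml :: String
sampleHtml = concat
  [ "<html><body>"
  , "<a href=\"/outside\">outside</a>"
  , "<div id=\"bodyContent\">"
  , "<p><a href=\"/wiki/Arctic\">Arctic</a></p>"
  , "<a href=\"/wiki/Whale\">Whale</a>"
  , "</div>"
  , "</body></html>"
  ]

-- Hypothetical helper: parse an HTML string, run a CSS selector over it,
-- and collect the href attribute of every matching element.
hrefsMatching :: String -> String -> IO [String]
hrefsMatching selector html =
  runX $ parseHtml html >>> css selector ! "href"

main :: IO ()
main = do
  -- Same selector as in the recipe, applied to the local sample page.
  links <- hrefsMatching "#bodyContent a" sampleHtml
  mapM_ putStrLn links
```

The link to /outside should be skipped because it is not a descendant of the bodyContent element, while the link nested inside the paragraph should still be found, since the selector matches descendants at any depth and not only direct children.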

See also…

Another common way to obtain data online is through POST requests. To find out more, refer to the Learning how to perform HTTP POST requests recipe.
