Mastering Clojure Data Analysis

By: Eric Richard Rochester

Topic modeling descriptions


Another way to better understand the descriptions is topic modeling. We covered this text mining and machine learning technique in Chapter 3, Topic Modeling – Changing Concerns in the State of the Union Addresses. Here, we'll see whether we can use it to generate topics from these descriptions and to pull out the differences, trends, and patterns in this set of texts.

First, we'll create a new namespace to handle our topic modeling. We'll use the src/ufo_data/tm.clj file. The following is the namespace declaration for it:

(ns ufo-data.tm
  (:require [clojure.java.io :as io]
            [clojure.string :as str]
            [clojure.pprint :as pp])
  ;; Note: Clojure has no wildcard imports, so [cc.mallet.util.*] imports no classes.
  (:import [cc.mallet.util.*]
           [cc.mallet.types InstanceList]
           [cc.mallet.pipe
            Input2CharSequence TokenSequenceLowercase
            CharSequence2TokenSequence SerialPipes
            TokenSequenceRemoveStopwords
            TokenSequence2FeatureSequence]
   ...
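
The pipe classes imported above each handle one preprocessing step: tokenizing, lowercasing, removing stop words, and turning tokens into feature indices. To make those steps concrete, the following is a rough sketch of the same idea in plain Clojure, without MALLET. It is not how MALLET implements them. The ufo-data.tm-sketch namespace, the tiny stop-words set, and the function names are all hypothetical and chosen only for illustration.

```clojure
(ns ufo-data.tm-sketch
  (:require [clojure.string :as str]))

;; Hypothetical, deliberately tiny stop-word list for illustration only.
(def stop-words #{"a" "an" "the" "in" "of" "and" "was" "it"})

(defn tokenize
  "Splits text into runs of letters (roughly the tokenizing step)."
  [text]
  (re-seq #"\p{L}+" text))

(defn preprocess
  "Tokenizes, lowercases, and drops stop words."
  [text]
  (->> (tokenize text)
       (map str/lower-case)
       (remove stop-words)))

(defn add-tokens
  "Adds unseen tokens to vocab, numbering them in order of first appearance."
  [vocab tokens]
  (reduce (fn [v t]
            (if (contains? v t) v (assoc v t (count v))))
          vocab
          tokens))

(defn index-features
  "Returns the vocabulary map and each document as a vector of token IDs
  (roughly the feature-sequence step)."
  [docs]
  (reduce (fn [acc tokens]
            (let [vocab (add-tokens (:vocab acc) tokens)]
              {:vocab vocab
               :docs  (conj (:docs acc) (mapv vocab tokens))}))
          {:vocab {} :docs []}
          docs))
```

For example, calling (index-features (map preprocess ["A bright light in the sky" "The light was orange"])) yields a vocabulary of four words and the two documents as [0 1 2] and [1 3]. Integer sequences of this kind are the sort of input a topic model trains on.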