Index
A
- airplane analogy / The airplane analogy
- average function
- creating / Creating our average function
- average of list
- about / Computing the mean of a list
C
- call patterns
- about / Our first Haskell program
- column index
- finding, of specified column / Finding the column index of the specified column
- CSV
- CSV files
- working with / Working with csv files
- environment, preparing / Preparing our environment
- needs, describing / Describing our needs
- solution, crafting / Crafting our solution
- function, applying to specified column / Applying a function to a specified column
- converting, to SQLite3 / Converting csv files to the SQLite3 format
D
- data
- problem, solving / Welcome to Haskell and data analysis!
- filtering, regular expressions used / Filtering data using regular expressions
- plotting, with EasyPlot / Plotting data with EasyPlot
- plotting, from SQLite3 database / Plotting data from a SQLite3 database
- plotting, through function / Plotting data passed through a function
- data access
- simplifying, in SQLite3 / Simplifying access to data in SQLite3
- data analysis
- and pattern recognition, comparing / How data analysis differs from pattern recognition
- dataset
- Integer / Inspecting column information
- Decimal / Inspecting column information
- String / Inspecting column information
- ISO 8601 Timestamp Strings / Inspecting column information
- column headers, in csv file / Plotting data from a SQLite3 database
- data type information, earthquake dataset
- about / Inspecting column information
- data types, SQLite3
- data visualization
- about / Plotting data with EasyPlot
E
- earthquake dataset
- EasyPlot
- about / Welcome to Haskell and data analysis!
- data, plotting with / Plotting data with EasyPlot
- URL / Exploring the EasyPlot library
- EasyPlot library
- exploring / Exploring the EasyPlot library
- eigenvalue / Eigenvalues and eigenvectors
- eigenvalue decomposition / Eigenvalues and eigenvectors
- eigenvector / Eigenvalues and eigenvectors
- Either monad
- about / The Maybe and Either monads
- elinks
- about / Tmux
- empty fields
- locating, in CSV file / Locating empty fields in a csv file based on a regular expression
- Error Function (Erf) module / The confidence interval
F
- filter command
- Fractional type
- defining / Introducing the Fractional class
- frequency function / A frequency study of tweets
- fromIntegral function / The fromIntegral and realToFrac functions
G
- genericLength function / The genericLength function
- Glasgow Haskell Compiler (GHC)
- about / Computing the sum of a list
- gnuplot
- about / Welcome to Haskell and data analysis!, Gnuplot
- URL / Gnuplot
- Graph2D type
- about / Exploring the EasyPlot library
- Graph3D type
- about / Exploring the EasyPlot library
H
- HashMap object / Creating our feature vectors
- Haskell
- using / Why Haskell?
- features / Why Haskell?
- about / Type is king – the implications of strict types in Haskell
- mean of list, computing / Computing the mean of a list
- sum of list, computing / Computing the sum of a list
- length of list, computing / Computing the length of a list
- mean results in error, computing / Attempting to compute the mean results in an error
- Fractional type, defining / Introducing the Fractional class
- fromIntegral function / The fromIntegral and realToFrac functions
- realToFrac function / The fromIntegral and realToFrac functions
- average function, creating / Creating our average function
- genericLength function / The genericLength function
- metadata, defining / Metadata is just as important as data
- grep, creating / Creating a simplified version of grep in Haskell
- Haskell Database Connectivity (HDBC)
- about / Preparing our environment
- Haskell interactive command line
- about / Interactive Haskell
- introductory problem / An introductory problem
- Haskell platform
- installing / Getting ready
- installing, on Linux / Installing the Haskell platform on Linux
- Haskell program
- about / Our first Haskell program
- implementing / Our first Haskell program
- HDBC statements
- defining / Crafting our functions
- connectSqlite3 statement / Crafting our functions
- stmt <- prepare conn insertStatement statement / Crafting our functions
- executeMany stmt statement / Crafting our functions
- commit conn statement / Crafting our functions
- disconnect conn statement / Crafting our functions
- where clause / Crafting our functions
- home-field advantage
- about / Does a home-field advantage really exist?
- data, converting to SQLite3 / Converting the data to SQLite3
- data, exploring / Exploring the data
- data, plotting / Plotting what looks interesting
- sample mean, computing / Returning to our test
- standard deviation / The standard deviation
- standard error, computing / The standard error
- confidence interval / The confidence interval
- Error Function (Erf) module / An introduction to the Erf module
- Erf, using for testing claim / Using Erf to test the claim, A discussion of the test
- hypothesis testing, magic coin theory / Hypothesis test
I
- introductory problem, Haskell language
- about / An introductory problem
- inverse normal cumulative density function (invnormcdf) / An introduction to the Erf module
J
- JSON format
L
- Lambda / Eigenvalues and eigenvectors
- LAPACK
- length of list
- computing / Computing the length of a list
- linear algebra, performing
- about / Performing linear algebra in Haskell
- covariance matrix of dataset, computing / Computing the covariance matrix of a dataset
- eigenvalues, discovering / Discovering eigenvalues and eigenvectors in Haskell
- eigenvectors, discovering / Discovering eigenvalues and eigenvectors in Haskell
- Linux
- Haskell platform, installing on / Installing the Haskell platform on Linux
M
- magic coin theory
- data / Data in a coin
- hypothesis testing / Hypothesis test
- test, establishing / Establishing the magic coin test
- data variance / Understanding data variance
- probability mass function / Probability mass function
- test interval, determining / Determining our test interval
- parameters, establishing / Establishing the parameters of the experiment
- System.Random module, using / Introducing System.Random
- experiment, performing / Performing the experiment
- market capitalization
- Maybe monad
- about / The Maybe and Either monads
- mean of list
- computing / Computing the mean of a list
- mean results, in error
- metadata
- defining / Metadata is just as important as data
- moving average
- plotting / Plotting a moving average
- multiple datasets
- plotting / Plotting multiple datasets
- multivariate data
- working with / Working with multivariate data
- describing / Describing bivariate and multivariate data
- bivariate, describing / Describing bivariate and multivariate data
- eigenvectors / Eigenvalues and eigenvectors
- eigenvalues / Eigenvalues and eigenvectors
N
- Naive Bayes classification
- about / An introduction to Naive Bayes classification
- prior knowledge / Prior knowledge
- likelihood / Likelihood
- evidence / Evidence
- implementing / Putting the parts of the Bayes theorem together
- NaN (Not A Number)
- about / Creating our average function
- New York Stock Exchange (NYSE)
- about / Plotting a subset of a dataset
- normal cumulative density function (normcdf) / An introduction to the Erf module
- null hypothesis / Hypothesis test
- number of fields, in each record
- counting / Counting the number of fields in each record
O
- OAuth / Communicating with Twitter
- open source software packages
- using / The software used in addition to Haskell
- SQLite3 / SQLite3
- gnuplot / Gnuplot
- LAPACK / LAPACK
P
- pattern recognition
- and data analysis, comparing / How data analysis differs from pattern recognition
- about / How data analysis differs from pattern recognition
- Pearson r²
- Pearson r correlation coefficient
- percentChange function
- benefits / Plotting multiple datasets
- point cloud datasets
- about / Exploring the EasyPlot library
- population dataset / Does a home-field advantage really exist?
- Principal Component Analysis
- Principal Component Analysis (PCA)
- about / LAPACK
R
- realToFrac function / The fromIntegral and realToFrac functions
- Real World Haskell
- recommendation engine
- frequency of words, analyzing in tweets / Analyzing the frequency of words in tweets
- stop words, removing / A note on the importance of removing stop words
- multivariate data, working with / Working with multivariate data
- environment, preparing / Preparing our environment
- linear algebra, performing in Haskell / Performing linear algebra in Haskell
- building / Building a recommendation engine
- nearest neighbors, finding / Finding the nearest neighbors
- testing / Testing our recommendation engine
- regression analysis
- about / Regression analysis
- regression equation line / The regression equation line
- regression equation, estimating / Estimating the regression equation
- formulas, translating to Haskell / Translate the formulas to Haskell
- baseball analysis / Returning to the baseball analysis
- baseball analysis, plotting with regression line / Plotting the baseball analysis with the regression line
- pitfalls / The pitfalls of regression analysis
- regular expression
- fields, searching on / Searching fields based on a regular expression
- crafting, to match dates / Crafting a regular expression to match dates
- about / A crash course in regular expressions
- crash course / A crash course in regular expressions
- repetition modifiers / The three repetition modifiers
- anchors / Anchors
- the dot / The dot
- character class / Character classes
- groups / Groups
- alternations / Alternations
- note / A note on regular expressions
- regular expressions
- used, for filtering data / Filtering data using regular expressions
- about / Filtering data using regular expressions
- grep, creating in Haskell / Creating a simplified version of grep in Haskell
- customer database / Exhibit A – a horrible customer database
S
- sample dataset / Does a home-field advantage really exist?
- scatterplot
- plotting / Plotting a scatterplot
- scoring and winning, correlation
- about / Study – is there a connection between scoring and winning?
- consideration / A consideration before we dive in – do any games end in a tie?
- essential data, compiling / Compiling the essential data
- outliers, searching / Searching for outliers
- runs per game, versus win percentage of each team / Plot – runs per game versus the win percentage of each team
- correlation analysis, performing / Performing correlation analysis
- share history, Google
- share history, Microsoft
- simple linear regression / Regression analysis
- solution, crafting
- input parameters / Crafting our solution
- SQLite3
- about / Welcome to Haskell and data analysis!, SQLite3
- URL / SQLite3
- CSV files, converting to / Converting csv files to the SQLite3 format
- environment, preparing / Preparing our environment
- needs, describing / Describing our needs
- column information, inspecting / Inspecting column information
- functions, crafting / Crafting our functions
- data access, simplifying / Simplifying access to data in SQLite3
- SQLite3 database
- data, plotting from / Plotting data from a SQLite3 database
- EasyPlot library, exploring / Exploring the EasyPlot library
- subset of dataset, plotting / Plotting a subset of a dataset
- data, plotting through function / Plotting data passed through a function
- SQLite binaries
- stop words / A note on the importance of removing stop words
- structured data
- about / Structured versus unstructured datasets
- creating / Creating your own structured data
- structured dataset
- versus unstructured dataset / Structured versus unstructured datasets
- subset of dataset
- plotting / Plotting a subset of a dataset
- sum of list
- computing / Computing the sum of a list
- System.Random module
- using / Introducing System.Random
T
- terminology, correlation and regression
- about / The terminology of correlation and regression
- expectation of variable / The expectation of a variable
- variance of variable / The variance of a variable
- variable, normalizing / Normalizing a variable
- covariance, of two variables / The covariance of two variables
- Pearson r correlation coefficient, finding / Finding the Pearson r correlation coefficient
- Pearson r², finding / Finding the Pearson r² correlation coefficient
- formulas, expressing in Haskell / Translating what we've learned to Haskell
- Tmux
- about / Tmux
- tools
- tuple
- about / An introductory problem
- Twitter application
- creating / Creating a Twitter application
- Twitter, communicating with / Communicating with Twitter
- database, creating for collecting tweets / Creating a database to collect tweets
- frequency study, of tweets / A frequency study of tweets
- tweets, cleaning / Cleaning our tweets
- feature vectors, creating / Creating our feature vectors
- code, creating for Bayes theorem / Writing the code for the Bayes theorem
- Naive Bayes classifier, creating with multiple features / Creating a Naive Bayes classifier with multiple features
- classifier, testing / Testing our classifier
U
- unique identifier column
- United States Geological Survey (USGS)
- about / Describing our needs
- unstructured data
- unstructured dataset
- versus structured dataset / Structured versus unstructured datasets
V
- version control software
- about / Version control software – Git
W
- workflow, for branchless version control
- about / Version control software – Git
Y
- Yahoo! Finance