Index
A
- A/B test
- parameters, of web page / Defining A/B testing
- conducting / Conducting an A/B test
- experiment, planning / Planning the experiment
- statistics, framing / Framing the statistics
- experiment, building / Building the experiment
- test site, building / Building the test site
- results, viewing / Viewing the results
- results, viewing as user / Looking at A/B testing as a user
- results, analyzing / Analyzing the results
- results, testing / Testing the results
- A/B testing
- defining / Defining A/B testing
- implementing, on server / Implementing A/B testing on the server
- implementing / Implementing A/B testing
- adjacency list
- about / Implementing the graphs
- alternative hypothesis
- Amazon
- about / Improving the results
- American National Corpus (ANC) / Getting the data
- answers
- processing / Processing the answers
- accepted answer, predicting / Predicting the accepted answer
- Apache POI project
- URL / Parsing the Excel files
- ArcGIS
- about / Understanding GIS
- working with / Working with ArcGIS
- average path length metric, social network graphs / Average path length
B
- base map
- finding / Finding a base map
- basics, stock data modeling project
- library, setting up / Setting up the library
- data, obtaining / Getting the data
- Bayesian inference
- Benford's Law
- about / Learning about Benford's Law
- applying, to compound interest / Applying Benford's law to compound interest
- failing / Failing Benford's Law
- case studies / Case studies
- between-subjects experiment design
- about / Defining A/B testing
- betweenness centrality
- about / Centrality
- Big Data
- about / Getting the data
- bigrams / Preparing the data
- Bing Maps
- about / Understanding GIS
- BitTorrent
- URL / Getting the data
- Boudewijn F. Roukema
- URL / Case studies
- breadth-first function / Implementing the graphs, Paths
- breadth-first walk
- about / Implementing the graphs
- burglary rates
- about / Understanding burglary rates
- data, obtaining / Getting the data
- Excel files, parsing / Parsing the Excel files
- raw data, pulling out / Pulling out raw data
- data, exploring / Exploring the data
- summary statistics, generating / Generating summary statistics
- experiment, conducting / Conducting the experiment
- results, interpreting / Interpreting the results
C
- 0.circles
- about / Getting the data
- case studies, Benford's Law / Case studies
- centrality metric, social network graphs / Centrality
- CharSequence2TokenSequence / Loading the data into MALLET
- Class and Committees in a Norwegian Island Parish article
- classification algorithms
- error rates, calculating on / Calculating error rates
- classifier interface
- coding / Coding the classifier interface
- classifier interface, coding
- training / Training
- classifying / Classifying
- validating / Validating
- climate change, mapping
- about / Mapping the climate change
- data, downloading / Downloading and extracting the data
- data, extracting / Downloading and extracting the data
- files, downloading / Downloading the files
- files, extracting / Extracting the files
- data, transforming / Transforming the data – filtering
- averages, rolling / Rolling averages
- data, reading / Reading the data
- sample points, interpolating / Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
- heat maps, generating with inverse distance weighting (IDW) / Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
- Clojure library
- about / Implementing the graphs
- ClojureScript
- URL / Implementing the graphs
- setting up / Setting up ClojureScript
- data, visualizing with / Visualizing with D3 and ClojureScript
- ClojureScript support
- closeness centrality
- about / Centrality
- clustering coefficient metric, social network graphs / Clustering coefficient
- coin tosses
- testing / Testing coin tosses
- comma-separated values (CSV) format / Working with stock data
- Compojure
- compound interest
- Benford's Law, applying to / Applying Benford's law to compound interest
- conditional probability
- about / Classifying the data
- confirmatory data analysis
- confusion matrix
- about / Evaluating the outcome
- content distribution network (CDN) / Visualizing with D3 and ClojureScript
- control page / Planning the experiment
- CSV (comma-separated values) files / Preparing for visualizations
- CSV file
- about / Getting the data
- cumulative distribution
D
- D3
- URL / Visualizing the graph, Visualizing with D3 and ClojureScript, Visualizing UFO data
- about / Setting up ClojureScript
- data, visualizing with / Visualizing with D3 and ClojureScript
- data
- obtaining / Getting the data, Getting the data
- understanding, in SOTU addresses / Understanding data in the State of Union addresses
- loading, into MALLET / Loading the data into MALLET
- visualizing, with D3 / Visualizing with D3 and ClojureScript
- visualizing, with ClojureScript / Visualizing with D3 and ClojureScript
- data, burglary rates
- obtaining / Getting the data
- exploring / Exploring the data
- charts, generating / Generating more charts and graphs
- graphs, generating / Generating more charts and graphs
- data analysis
- data classification, hoaxes
- about / Classifying the data
- classifier interface, coding / Coding the classifier interface
- classifier, running / Running the classifier and examining the results
- results, examining / Running the classifier and examining the results
- data preparation, hoaxes
- about / Preparing the data
- data, reading into sequence of data records / Reading the data into a sequence of data records
- NUFORC comments, splitting out / Splitting the NUFORC comments
- documents, categorizing based on comments / Categorizing the documents based on the comments
- documents, partitioning into directories based on categories / Partitioning the documents into directories based on the categories
- data, dividing into training set / Dividing them into training and test sets
- data, dividing into test set / Dividing them into training and test sets
- degrees-between function / Degrees of separation
- degrees metric, social network graphs / Degrees
- degrees of separation metric, social network graphs / Degrees of separation
- density metric, social network graphs / Density
- depth-first walk
- about / Implementing the graphs
- description
- about / Description
- Dijkstra's algorithm / Paths
- dis legomena
- about / Hapax and Dis Legomena
- double-blind experiments
- about / Defining A/B testing
E
- 0.edges
- about / Getting the data
- 0.egofeat
- about / Getting the data
- Eckert IV projection / Working with map projections
- Edmunds
- Enclog
- URL / Setting up the library
- enlive/html-resource function / Getting the data
- Enlive library
- URL / Getting the data
- error rates
- calculating, on classification algorithms / Calculating error rates
- ESRI
- about / Understanding GIS
- experiment, A/B test
- planning / Planning the experiment
- building / Building the experiment
- options, for building site / Looking at options to build the site
- experiment, burglary rates
- conducting / Conducting the experiment
- initial hypothesis, formulating / Formulating an initial hypothesis
- alternative hypothesis, stating / Stating the null and alternative hypotheses
- null hypothesis, stating / Stating the null and alternative hypotheses
- statistical assumptions, identifying in sample / Identifying the statistical assumptions in the sample
- tests appropriateness, determining / Determining which tests are appropriate
- significance level, selecting / Selecting the significance level
- critical region, determining / Determining the critical region
- probability, calculating / Calculating the test statistic and its probability
- test statistic, calculating / Calculating the test statistic and its probability
- rejection, deciding for null hypothesis / Deciding whether to reject the null hypothesis or not
- exploration versus exploitation problem
- about / Defining A/B testing
- exploratory data analysis
- extract-text method / Getting the data
F
- 0.feat
- about / Getting the data
- 0.featnames
- about / Getting the data
- facebook.tar.gz file
- about / Getting the data
- features, GIS
- view-shed analysis / Understanding GIS
- topological modeling / Understanding GIS
- hydrological modeling / Understanding GIS
- geocoding / Understanding GIS
- feature vector functions
- feature vectors
- about / Creating feature vectors
- creating / Creating feature vectors
- FileListIterator function / Loading the data into MALLET
- financial data analysis
- financial modeling / Related to machine learning and market modeling in general
- First In, First Out (FIFO) queue / Implementing the graphs
- flipping coins, null hypothesis testing
- about / Flipping coins
- initial hypothesis, formulating / Formulating an initial hypothesis
- null hypothesis, stating / Stating the null and alternative hypotheses
- alternative hypothesis, stating / Stating the null and alternative hypotheses
- statistical assumptions, identifying in sample / Identifying the statistical assumptions in the sample
- tests appropriateness, determining / Determining appropriate tests
- force-directed layout
- about / A force-directed layout
- frequentist approach
- frequentist inference
- FS library / Extracting the files
- function words / Stop lists
- future prediction, stock data modeling project
- about / Predicting the future
- stock prices, loading / Loading stock prices
- news articles, loading / Loading news articles
- training sets, creating / Creating training and test sets
- test sets, creating / Creating training and test sets
- best parameter, finding of neural network / Finding the best parameters for the neural network
- neural network, training / Training and validating the neural network
- neural network, validating / Training and validating the neural network
- network, running on new data / Running the network on new data
G
- Gall-Peters projection / Working with map projections
- GDAL
- URL / Understanding GIS
- geocoding
- about / Understanding GIS
- GeoServer
- about / Understanding GIS
- URL / Understanding GIS
- GeoTIFF
- about / Finding a base map
- GeoTools
- URL / Understanding GIS
- get-edges function / Implementing the graphs
- get-index-links function / Getting the data
- GIS
- overview / Understanding GIS
- Global Summary of the Day
- GNI data
- summarizing / Summarizing World Bank land area and GNI data
- Goode homolosine projection / Working with map projections
- Google Finance
- URL / Getting the data
- Google Maps
- about / Understanding GIS
- GPS
- about / Understanding GIS
- graph
- visualizing / Visualizing the graph
- graph implementation
- data, loading / Loading the data
- graphs
- overview / Understanding graphs
- implementing / Implementing the graphs
- graph visualization
- about / Visualizing the graph
- ClojureScript, setting up / Setting up ClojureScript
- force-directed layout / A force-directed layout
- hive plot / A hive plot
- pie chart / A pie chart
- gzip utility / Extracting the files
H
- H2 embedded database
- hapax legomena
- about / Hapax and Dis Legomena
- heat map
- generating, inverse distance weighting (IDW) used / Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
- hive plot
- about / A hive plot
- hoaxes
- about / Hoaxes
- data, preparing / Preparing the data
- data, classifying / Classifying the data
- Homebrew
- URL / Getting the data
- hotel review data
- obtaining / Getting hotel review data
- exploring / Exploring the data
- preparing / Preparing the data
- tokenizing / Tokenizing
- feature vectors, creating / Creating feature vectors
- feature vector functions, creating / Creating feature vector functions and POS tagging
- POS tagging / Creating feature vector functions and POS tagging
- results, cross validating / Cross-validating the results
- experiment, running / Running the experiment
- results, examining / Examining the results
- results, improving / Improving the results
- HTML5 Boilerplate template
- URL / Setting up ClojureScript
- hydrological modeling
- about / Understanding GIS
I
- Incanter / Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
- Incanter library
- URL / Implementing A/B testing
- Infochimps
- URL / Getting the data
- about / Getting the data
- URL, for dataset / Getting the data
- InstanceList object
- creating / Creating the InstanceList object
- Internet Archive
- URL / Getting the data
- inverse distance weighting (IDW)
J
- Johnson's algorithm / Paths
- jQuery
K
- K-fold cross validation
- about / Cross-validating the results
- knowledge-based social networks
- about / Understanding social network data, Understanding knowledge-based social networks
- StackExchange / Understanding social network data
- StackOverflow / Understanding social network data
- Quora / Understanding social network data
- Korma
L
- LDA
- lein-cljsbuild plugin / Setting up ClojureScript
- Leiningen 2
- Leiningen 2 project.clj file
- about / Implementing the graphs
- LIFO (Last In, First Out) queue / Implementing the graphs
- load-topic-dists function / Visualizing with D3 and ClojureScript
- Luminus
- Luminus web framework
M
- machine learning / Related to machine learning and market modeling in general
- MALLET
- data, loading into / Loading the data into MALLET
- about / Hoaxes, Predicting the accepted answer
- map projections
- working with / Working with map projections
- maximum entropy (maxent) classifiers
- me.raynes file utility library / Implementing the graphs
- Mechanical Turk
- URL / Preparing the data
- Mercator projection / Working with map projections
- messy data
- dealing with / Dealing with messy data
- metrics, social network graphs
- density / Density
- degrees / Degrees
- paths / Paths
- average path length / Average path length
- network diameter / Network diameter
- clustering coefficient / Clustering coefficient
- centrality / Centrality
- degrees of separation / Degrees of separation
- monotonic function
N
- naive Bayesian classifiers
- network diameter metric, social network graphs / Network diameter
- networking-oriented social networks
- about / Understanding social network data
- Facebook / Understanding social network data
- LinkedIn / Understanding social network data
- Twitter / Understanding social network data
- Sina Weibo / Understanding social network data
- neural nets
- text, analyzing / Analyzing both text and stock features together with neural nets
- stock features, analyzing / Analyzing both text and stock features together with neural nets
- about / Understanding neural nets
- setting up / Setting up the neural net
- training / Training the neural net
- running / Running the neural net
- validating / Validating the neural net
- best parameters, finding / Finding the best parameters
- news articles
- working with / Working with news articles
- loading / Loading news articles
- noir
- NUFORC
- NUFORC comments
- splitting out / Splitting the NUFORC comments
- null-hypothesis test
- about / Framing the statistics
- null hypothesis
- null hypothesis process
- using / Understanding the process
- initial hypothesis, formulating / Formulating an initial hypothesis
- tests appropriateness, determining / Determining appropriate tests
- significance level, selecting / Selecting the significance level
- critical region, determining / Determining the critical region
- probability, calculating / Calculating the test statistics and its probability
- test statistic, calculating / Calculating the test statistics and its probability
- rejection, deciding / Deciding whether to reject the null hypothesis or not
- null hypothesis testing
- about / Introducing confirmatory data analysis, Understanding null hypothesis testing
- flipping coins / Flipping coins
O
- online-controlled experiments
- about / Defining A/B testing
- Open ANC (OANC)
- about / Getting the data
- URL / Getting the data
- OpenNLP library
- URL / Tokenizing
- OpinRank Review dataset
P
- p-value
- Pareto Principle
- about / Introducing the 80/20 rule
- part-of-speech annotated unigrams / Preparing the data
- partition-all function
- about / Cross-validating the results
- partition-spread function
- about / Cross-validating the results
- partition function
- about / Cross-validating the results
- paths metric, social network graphs / Paths
- perform-test function
- about / Viewing the results
- pie chart / A pie chart
- POS
- POS tagging / Creating feature vector functions and POS tagging
- prior or assumed probability / Classifying the data
- process-speech-page function / Getting the data
- project
- setting up, for topic modeling / Setting up the project
Q
- Quantum GIS
- about / Understanding GIS
- URL / Understanding GIS
- quintiles
- about / Matching the 80/20 rule
- Quora
R
- 80/20 rule
- about / Introducing the 80/20 rule
- data, obtaining / Getting the data
- looking, at data amount / Looking at the amount of data
- looking, at data format / Looking at the data format
- data, defining / Defining and loading the data
- data, loading / Defining and loading the data
- frequencies, counting / Counting frequencies
- data, sorting / Sorting and ranking
- data, ranking / Sorting and ranking
- patterns, finding of participation / Finding the patterns of participation
- matching / Matching the 80/20 rule
- random-controlled experiments
- about / Defining A/B testing
- ranks, combining
- about / Combining ranks
- looking at those who only post questions / Looking at those who only post questions
- looking at those who only post answers / Looking at those who only post answers
- looking at those who post both questions and answers / Looking at those who post both questions and answers
- raw data, burglary rates
- about / Pulling out raw data
- data tree, building / Growing a data tree
- data tree, cutting down / Cutting down the data tree
- implementing / Putting it all together
- data, transforming / Transforming the data
- data sources, joining / Joining the data sources
- data, pivoting / Pivoting the data
- missing data, filtering / Filtering the missing data
- wrapper function, creating / Putting it all together
- read-eval-print loop (REPL) / Loading the data
- reducers / Reading the data
- results, A/B test
- viewing / Viewing the results
- viewing, as user / Looking at A/B testing as a user
- analyzing / Analyzing the results
- testing / Testing the results
- results, burglary rates
- interpreting / Interpreting the results
- results, hotel review data
- cross validating / Cross-validating the results
- examining / Examining the results
- error rates, combining / Combining the error rates
- improving / Improving the results
S
- scaffolded site
- select-keys function
- about / Implementing A/B testing
- Selmer
- sentiment analysis
- overview / Understanding sentiment analysis
- server
- A/B testing, implementing on / Implementing A/B testing on the server
- Simple Logging Facade for Java library
- URL / Implementing the graphs
- about / Implementing the graphs
- Sina Weibo
- single-blind experiments
- about / Defining A/B testing
- Six Degrees of Kevin Bacon game
- about / Analyzing social networks
- Slate
- URL / Getting the data
- social data participation analysis project
- about / Setting up the project
- analyses, understanding / Understanding the analyses
- social network data, understanding / Understanding social network data
- knowledge-based social networks, understanding / Understanding knowledge-based social networks
- 80/20 rule, introducing / Introducing the 80/20 rule
- 80/20 rule, matching / Matching the 80/20 rule
- looking, for 20 percent of questioners / Looking for the 20 percent of questioners
- looking, for 20 percent who answer questions / Looking for the 20 percent of respondents
- ranks, combining / Combining ranks
- up-voted answers, finding / Finding the up-voted answers
- answers, processing / Processing the answers
- setting up / Setting up
- InstanceList object, creating / Creating the InstanceList object
- training sets / Training sets and Test sets, Training
- test sets / Training sets and Test sets, Testing
- outcome, evaluating / Evaluating the outcome
- social network graphs
- measuring / Measuring social network graphs
- social networks
- analyzing / Analyzing social networks
- SOTU
- SOTU address
- data, understanding / Understanding data in the State of Union addresses
- about / Understanding data in the State of Union addresses
- graph, for increase in word count / Understanding data in the State of Union addresses
- Spearman's rank correlation coefficient
- StackExchange
- URL / Understanding social network data
- URL, for periodic data dump / Getting the data
- StackOverflow
- Stanford Large Network Dataset Collection
- about / Getting the data
- URL / Getting the data
- statistics, A/B test
- framing / Framing the statistics
- stock data
- working with / Working with stock data
- stock data modeling
- data, preparing / Getting prepared with data
- working, with news articles / Working with news articles
- stock data modeling project
- basics, setting up / Setting up the basics
- text, analyzing / Analyzing the text
- stock prices, inspecting / Inspecting the stock prices
- text, and stock features merging / Merging text and stock features
- text, analyzing with neural nets / Analyzing both text and stock features together with neural nets
- stock features, analyzing with neural nets / Analyzing both text and stock features together with neural nets
- future, predicting / Predicting the future
- weakness / Related to this project
- stock features
- analyzing, with neural nets / Analyzing both text and stock features together with neural nets
- stock prices
- inspecting / Inspecting the stock prices
- loading / Loading stock prices
- stop lists / Stop lists
- subdirectories, Luminus project
- resources / Understanding the scaffolded site
- src / Understanding the scaffolded site
- src/web_ab/models/ / Understanding the scaffolded site
- src/web_ab/routes/ / Understanding the scaffolded site
- src/web_ab/views/templates/ / Understanding the scaffolded site
- test/web_ab/test/ / Understanding the scaffolded site
- sum of squared errors (SSE)
- about / Validating the neural net
T
- µTorrent
- URL / Getting the data
- t-test
- overview / Understanding the t-test
- coin tosses, testing / Testing coin tosses
- t-test function
- about / Implementing A/B testing
- tab-separated values (TSV)
- about / Preparing the data
- test-on utility
- about / Validating the neural net
- test page / Planning the experiment
- tests appropriateness, flipping coins
- significance level, selecting / Selecting the significance level
- critical region, determining / Determining the critical region
- test statistic, calculating / Calculating the test statistic and its probability
- probability, calculating / Calculating the test statistic and its probability
- rejection, deciding for null hypothesis / Deciding whether to reject the null hypothesis or not
- test site, A/B test
- building / Building the test site
- text
- analyzing, with neural nets / Analyzing both text and stock features together with neural nets
- text, and stock features
- merging / Merging text and stock features
- text, stock data modeling project
- graph, viewing of frequencies / Hapax and Dis Legomena
- text analysis, stock data modeling project
- about / Analyzing the text
- vocabulary, analyzing / Analyzing vocabulary
- stop lists / Stop lists
- graph, viewing of frequencies / Hapax and Dis Legomena
- hapax legomena / Hapax and Dis Legomena
- dis legomena / Hapax and Dis Legomena
- TF-IDF / TF-IDF
- tf-idf
- about / Analyzing the text
- TF-IDF
- about / TF-IDF
- tf-idf-freqs function / TF-IDF
- tokenizing / Tokenizing
- about / Analyzing vocabulary
- TokenSequence2FeatureSequence / Loading the data into MALLET
- TokenSequenceLowercase / Loading the data into MALLET
- TokenSequenceRemoveStopwords / Loading the data into MALLET
- tools, for GIS specialists
- ArcGIS / Understanding GIS
- Quantum GIS / Understanding GIS
- GeoServer / Understanding GIS
- GDAL / Understanding GIS
- GeoTools / Understanding GIS
- topic 26
- exploring / Exploring topic 26
- topic 42
- exploring / Exploring topic 42
- topic 43
- exploring / Exploring topic 43
- topic model
- about / Understanding topic modeling
- topic modeling
- overview / Understanding topic modeling
- URLs, for articles / Understanding topic modeling
- project, setting up for / Setting up the project
- topic modeling descriptions
- about / Topic modeling descriptions
- topics
- exploring / Exploring the topics
- topological modeling
- about / Understanding GIS
- trigrams / Preparing the data
- TripAdvisor
- TSV file
- about / Getting the data
- type one error
- about / Testing the results
U
- UFO data
- visualizing / Visualizing UFO data
- UFO sightings
- data, obtaining / Getting the data
- data, extracting / Extracting the data
- dealing, with messy data / Dealing with messy data
- unigrams / Preparing the data
- United Nations Office on Drugs and Crime
- URL / Getting the data
- UNODC crime data
- summarizing / Summarizing UNODC crime data
- up-voted answers
- finding / Finding the up-voted answers
- US National Oceanic and Atmospheric Administration (NOAA) / Mapping the climate change
- US Topo Maps / Working with ArcGIS
V
- view-shed analysis
- about / Understanding GIS
- visualizations
- preparing for / Preparing for visualizations
- vocabulary
- analyzing / Analyzing vocabulary
W
- Weka, and cross validation
- connecting / Connecting Weka and cross-validation
- Weka machine learning library
- when-is-over function
- about / Viewing the results
- World Bank land area
- summarizing / Summarizing World Bank land area and GNI data
- World DataBank
- URL / Looking at the world population data
- downloading / Looking at the world population data
- world population data
- viewing / Looking at the world population data
Y
- Yahoo Answers
Z
- 7-zip site
- URL / Getting the data