Index
A
- actions / Actions
- asynchronous actions / Asynchronous actions
- actors
- as people / Actors as people
- constructing / Actor construction, Anatomy of an actor, Follower network crawler
- fetcher / Fetcher actors
- aggregate functions
- URL / Aggregation operations
- aggregation operations
- about / Aggregation operations
- aggregations
- with Group by / Aggregations with "Group by"
- Akka documentation / What we have not talked about
- Akka library / Futures example – stock price fetcher
- Amazon Web Services (AWS)
- Apache Parquet
- about / Parquet files
- APIs
- creating, with Play / Creating APIs with Play: a summary
- application
- building / Building an application
- applications
- Bootstrapping / Bootstrapping the applications
- Arrays / A whirlwind tour of JSON
- arrays
- authentication
- HTTP headers, adding / Authentication – adding HTTP headers
B
- backend
- need for / Do I need a backend?
- BinaryClassificationMetrics instance
- URL / Evaluation
- BLAS library / Basic Breeze data types
- Body Mass Index (BMI) / DataFrames – a whirlwind introduction
- BooleanColumnExtensionMethods class
- URL / Operations on columns
- Bootstrap layouts
- Breeze
- code, examples / Code examples
- installing / Installing Breeze
- help, getting / Getting help on Breeze
- Wiki page, on GitHub / Getting help on Breeze
- data types / Basic Breeze data types
- alternatives / Alternatives to Breeze
- URL / References
- API documents, URL / References
- diving into / Diving into Breeze
- Breeze-viz
- about / Managing without documentation
- URL / Managing without documentation
- reference / Breeze-viz reference
C
- Casbah
- URL / Casbah query DSL, References
- about / Beyond Casbah
- Casbah query DSL
- about / Casbah query DSL
- case classes
- used, for pattern matching / JSON in Scala – an exercise in pattern matching
- used, for extraction / Extraction using case classes
- as messages / Case classes as messages
- client-server applications
- about / Client-server applications
- client-side program
- architecture / Client-side program architecture
- model, designing / Designing the model
- event bus / The event bus
- AJAX calls, through JQuery / AJAX calls through JQuery
- response views / Response views
- collision / Transformers
- complex queries / Complex queries
- configuration options
- Connection class
- API documentation, URL / References
- context bound / Coding against type classes
- cross-validation
- and model selection / Cross-validation and model selection
- custom supervisor strategies / Custom supervisor strategies
- custom type serialization
- about / Custom type serialization
D
- data access layer
- about / Creating a data access layer
- database metadata
- accessing / Accessing database metadata
- DataFrames
- about / DataFrames – a whirlwind introduction
- joining, together / Joining DataFrames together
- custom functions / Custom functions on DataFrames
- immutability / DataFrame immutability and persistence
- persistence / DataFrame immutability and persistence
- SQL statements / SQL statements on DataFrames
- data mapper pattern
- URL / References
- data science
- about / Data science
- programming in / Programming in data science
- dataset
- data shuffling
- about / Data shuffling and partitions
- data sources
- interacting with / Interacting with data sources
- JSON files / JSON files
- Parquet files / Parquet files
- data types
- data types, Breeze
- about / Basic Breeze data types
- vectors / Vectors
- matrices / Matrices
- vectors, building / Building vectors and matrices
- matrices, building / Building vectors and matrices
- indexing / Advanced indexing and slicing
- slicing / Advanced indexing and slicing
- vectors, mutating / Mutating vectors and matrices
- matrices, mutating / Mutating vectors and matrices
- matrix multiplication / Matrix multiplication, transposition, and the orientation of vectors
- matrix transposition / Matrix multiplication, transposition, and the orientation of vectors
- vectors, orientation / Matrix multiplication, transposition, and the orientation of vectors
- data preprocessing / Data preprocessing and feature engineering
- feature engineering / Data preprocessing and feature engineering
- function optimization / Breeze – function optimization
- numerical derivatives / Numerical derivatives
- regularization / Regularization
- DenseVector or DenseMatrix
- URL / Vectors
- directed acyclic graph (DAG) / Lifting the hood
- documents
- inserting / Inserting documents
- drivers
- URL / Importing Slick
- dynamic routing
- about / Dynamic routing
E
- element-wise operators
- pitfalls / Vectors
- estimators
- about / Estimators
- evaluation
- about / Evaluation
- event bus / The event bus
- example data
- acquiring / Acquiring the example data
- execution contexts
- parallel execution, controlling with / Controlling parallel execution with execution contexts
- extraction
- used, for case classes / Extraction using case classes
F
- Federal Election Commission (FEC)
- Federal Election Commission (FEC) data
- about / FEC data
- URL / FEC data
- Slick, importing / Importing Slick
- schema, defining / Defining the schema
- database, connecting to / Connecting to the database
- tables, creating / Creating tables
- inserting / Inserting data
- querying / Querying data
- floating point format
- URL / Defining the schema
- follower network crawler / Follower network crawler, Fault tolerance
- function optimization / Breeze – function optimization
- futures
- about / Futures
- URL / Futures, References
- result, using / Future composition – using a future's result
- blocking until completion / Blocking until completion
- parallel execution, controlling with execution contexts / Controlling parallel execution with execution contexts
- stock price fetchers example / Futures example – stock price fetcher
- concurrency and exception handling / Concurrency and exception handling with futures
G
- GitHub
- follower's graph / GitHub follower graph
- URL / JavaScript dependencies through web-jars
- GitHub API
- URL / References
- GitHub servers
- GitHub user data
- about / GitHub user data
- URL / GitHub user data
- Group by
- aggregations with / Aggregations with "Group by"
H
- HashingTF / Transformers
- headers
- adding, to HTTP requests in Scala / Adding headers to HTTP requests in Scala
- Hello world
- with Akka / Hello world with Akka
- HTML templates
- HTTP
- about / HTTP – a whirlwind overview
- HTTP headers
- adding / Authentication – adding HTTP headers
I
- indexing / Advanced indexing and slicing
- invokers
- about / Invokers
J
- java.sql.Types package
- API documentation, URL / JDBC summary
- JavaScript dependencies
- through web-jars / JavaScript dependencies through web-jars
- JDBC
- about / Interacting with JDBC
- first steps / First steps with JDBC
- database server, connecting to / Connecting to a database server
- tables, creating / Creating tables
- data, inserting / Inserting data
- data, reading / Reading data
- summary / JDBC summary
- functional wrappers / Functional wrappers for JDBC
- connections, with loan pattern / Safer JDBC connections with the loan pattern
- connections enriching, with pimp my library pattern / Enriching JDBC statements with the "pimp my library" pattern
- result sets in stream, wrapping / Wrapping result sets in a stream
- API documentation, URL / References
- versus Slick / Slick versus JDBC
- JFreeChart documentation
- URL / Customizing plots
- JSON
- about / A whirlwind tour of JSON
- interacting with / Interacting with JSON
- external APIs, querying / Querying external APIs and consuming JSON
- consuming / Querying external APIs and consuming JSON
- parsing / Parsing JSON
- JSON4S types / JSON4S types
- JSON files
- about / JSON files
- JSON in Scala
- about / JSON in Scala – an exercise in pattern matching
- JSON4S types / JSON4S types
- fields extracting, XPath used / Extracting fields using XPath
K
- k-fold cross-validation / Cross-validation and model selection
L
- L-BFGS method / Breeze – function optimization
- LAPACK library / Basic Breeze data types
- lazy computation
- about / Towards re-usable code
- LET IT CRASH blog
- URL / References
- life-cycle hooks
- about / Life-cycle hooks
- line type
- customizing / Customizing the line type
- Ling-Spam dataset
- Ling-Spam email dataset
- loan pattern / Reading data
- JDBC connections with / Safer JDBC connections with the loan pattern
- logistic regression
- about / An example – logistic regression, Beyond logistic regression
- regularization / Regularization in logistic regression
- looser coupling
- with type classes / Looser coupling with type classes
- type classes / Type classes
- coding, against type classes / Coding against type classes
- type classes, using / When to use type classes
- type classes, benefits / Benefits of type classes
M
- Machine Learning course
- URL / References
- maps
- about / Maps
- matrices
- about / Matrices
- building / Building vectors and matrices
- mutating / Mutating vectors and matrices
- message
- passing, between actors / Message passing between actors
- message sender
- accessing / Accessing the sender of a message
- MLlib / Breeze – function optimization
- spam classification / Introducing MLlib – Spam classification
- Model-View-Controller (MVC)
- architecture / Model-View-Controller architecture
- modular JavaScript
- through RequireJS / Modular JavaScript through RequireJS
- MongoDB
- about / MongoDB
- manual installation, URL / MongoDB
- connecting, with Casbah / Connecting to MongoDB with Casbah
- authentication, connecting with / Connecting with authentication
- reference documentation, URL / Complex queries
- MTable instances
- Mutual Information (MI) / Spam filtering
N
- NumericColumnExtensionMethods class
- URL / Operations on columns
- NVD3
- used, for drawing plots / Drawing plots with NVD3
- URL / Drawing plots with NVD3
O
- object-oriented design patterns
- URL / References
- objects
- extracting, from database / Extracting objects from the database
- Objects / A whirlwind tour of JSON
- operations
- on columns / Operations on columns
- Ordering
P
- package.scala source file
- URL / Breeze-viz reference
- PaintScale.scala source file
- parallel collections
- about / Parallel collections
- limitations / Limitations of parallel collections
- error handling / Error handling
- parallelism level, setting / Setting the parallelism level
- cross-validation with / An example – cross-validation with parallel collections
- parallel execution
- controlling, with execution contexts / Controlling parallel execution with execution contexts
- Parquet files
- URL / References
- parsers
- pattern matching
- case classes used / JSON in Scala – an exercise in pattern matching
- Pattern matching
- for comprehensions / Pattern matching in for comprehensions
- internals / Pattern matching internals
- URL / Reference
- permanence spectrum / Programming in data science
- persistence level
- URL / Persisting RDDs
- Pimp my Library pattern
- URL / References
- pimp my library pattern
- used, for enriching JDBC statements / Enriching JDBC statements with the "pimp my library" pattern
- pipeline
- about / Pipeline components
- transformers / Transformers
- estimators / Estimators
- pipeline API
- URL / References
- Play framework / Futures example – stock price fetcher
- about / The Play framework
- URL / Dynamic routing
- plots
- customizing / Customizing plots
- drawing, with NVD3 / Drawing plots with NVD3
- PreparedStatement API documentation
- URL / Inserting data
- PreparedStatement class
- API documentation, URL / References
Q
- queue control
- and pull pattern / Queue control and the pull pattern
R
- receiver operating characteristic (ROC) curve / Evaluation
- regularization / Regularization
- in logistic regression / Regularization in logistic regression
- request
- parsing / Understanding and parsing the request
- RequireJS
- modular JavaScript through / Modular JavaScript through RequireJS
- resilient applications
- building / Futures
- Resilient distributed datasets (RDD)
- about / Resilient distributed datasets
- immutability / RDDs are immutable
- operations, executing / RDDs are lazy
- constructing / RDDs know their lineage
- resiliency / RDDs are resilient
- distribution / RDDs are distributed
- transformations / Transformations and actions on RDDs
- actions / Transformations and actions on RDDs
- operations, URL / Transformations and actions on RDDs
- persisting / Persisting RDDs
- Key-value / Key-value RDDs
- double / Double RDDs
- response
- composing / Composing the response
- response views / Response views
- Rest APIs
- about / Rest APIs: best practice
- results
- URL / Composing the response
- ResultSet interface
- API documentation, URL / References
- routing
- about / Routing
S
- Scala
- and data science / Data science
- uses / Why Scala?, Scala encourages immutability, Easier parallelism
- static typing and type inference / Static typing and type inference
- and functional programs / Scala and functional programs
- null pointer uncertainty / Null pointer uncertainty
- interoperability, with Java / Interoperability with Java
- drawbacks / When not to use Scala
- references / References
- URL / References
- Scala constructs
- URL / Reference
- scatter plot matrix plots
- scatter plots
- about / More advanced scatter plots
- schema
- defining / Defining the schema
- semantic URLs / Dynamic routing, References
- sequences
- extracting / Extracting sequences
- shuffling / Data shuffling and partitions
- single page applications
- about / Single page applications
- slicing / Advanced indexing and slicing
- Slick
- importing / Importing Slick
- arguments, URL / Defining the schema
- joins, URL / Invokers
- versus JDBC / Slick versus JDBC
- URL / References
- spam filtering
- about / Spam filtering
- Spark
- installing / Installing Spark
- URL / Installing Spark, SQL statements on DataFrames
- on EC2, URL / Running Spark applications on EC2
- data shuffling / Data shuffling and partitions
- Web UI, URL / Reference
- internals, URL / Reference
- Spark applications
- running, locally / Running Spark applications locally
- URL / Running Spark applications locally
- running, on EC2 / Running Spark applications on EC2
- Spark notebooks
- SQL statements
- on DataFrames / SQL statements on DataFrames
- stand-alone programs
- building / Building and running standalone programs
- standalone programs
- about / Standalone programs
- Stanford NLP toolkit
- URL / Spam filtering
- stateful actors / Stateful actors
- StringColumnExtensionMethods class
- URL / Operations on columns
- structs
- about / Structs
T
- tokenization
- about / Transformers
- tokens
- transformations
- URL / Key-value RDDs
- transformers
- about / Transformers
- URL / References
- try/catch statements
- versus Try type / Error handling
- Try type
- versus try/catch statements / Error handling
- URL / References
- tuning memory usage
- URL / Persisting RDDs
- type classes
- loose coupling with / Looser coupling with type classes
- about / Type classes
- coding against / Coding against type classes
- usage / When to use type classes
- benefits / Benefits of type classes
- URL / References
- Typesafe activators
- about / The Play framework
- URL / The Play framework
U
- URL design / Dynamic routing
- user-defined function (UDF) / Custom functions on DataFrames
- user-defined functions (UDFs) / Custom functions on DataFrames
V
- vectors
- about / Vectors
- dense / Dense and sparse vectors and the vector trait
- sparse / Dense and sparse vectors and the vector trait
- trait / Dense and sparse vectors and the vector trait
- building / Building vectors and matrices
- mutating / Mutating vectors and matrices
W
- web-jars
- JavaScript dependencies through / JavaScript dependencies through web-jars
- web APIs
- querying / Querying web APIs
- web application
- web frameworks
- about / Introduction to web frameworks
- web services
- external web services, calling / Calling external web services
X
- XPath
- used, for extracting fields / Extracting fields using XPath
- XPath DSL / Extracting fields using XPath