Index
A
- Active donor
- about / How it works..., How to do it...
- aggregate
- used, for creating classification tree financial summary / Creating a classification tree financial summary using aggregate and an Excel Export node, How to do it..., How it works...
- aggregation
- flag variables, creating for / Creating flag variables for aggregation, How to do it..., There's more...
- allbutfirst function / How to do it...
- analytical results
- specific objectives, connecting to / Connecting specific objectives to analytical results
- ANN
- Association Rules
- used, for interaction detection/feature creation / Using Association Rules for interaction detection/feature creation, How to do it..., How it works..., There's more...
- assumptions
- reviewing / Reviewing requirements, assumptions, and constraints
- AUC / Success criteria for classification
- Auto Classifier
- balancing, evaluating with / Evaluating balancing with Auto Classifier, How to do it..., How it works...
- used, to tune models / Using Auto Classifier to tune models, How to do it..., How it works...
B
- bagged logistic regression models
- balancing
- evaluating, with Auto Classifier / Evaluating balancing with Auto Classifier, How to do it..., How it works...
- benefits
- evaluating / Evaluating costs and benefits
- BI (Business Intelligence) / Automating HTML reports and graphs
- bootstrap sample
- business goals
- about / Understanding the goals of the business
- commercial goals / Understanding the goals of the business
- service goals / Understanding the goals of the business
- scientific goals / Understanding the goals of the business
- business objectives
- about / Define business objectives by Tom Khabaza
- in data mining / The importance of business objectives in data mining
- business goals, understanding / Understanding the goals of the business
- client objectives, understanding / Understanding the objectives of your client
- specific objectives, connecting to analytical results / Connecting specific objectives to analytical results
- data mining goals, specifying / Specifying your data mining goals
- business understanding
- about / Introduction, Business understanding
C
- Cartesian product merge
- performing / Cartesian product merge using key-less merge by key, How to do it..., How it works...
- performing, dummy keys used / Multiplying out using Cartesian product merge, user source, and derive dummy, How to do it..., There's more...
- categorical values
- grouping / Grouping categorical values, How to do it..., There's more...
- CDF (cumulative distribution function) / Rolling your own modeling algorithm – Weibull analysis
- CHAID Modeling node
- used, for selecting variables / Selecting variables using the CHAID Modeling node, How to do it..., How it works..., There's more...
- CHAID stumps
- champion/challenger model
- classification tree financial summary
- creating, aggregate used / Creating a classification tree financial summary using aggregate and an Excel Export node, How to do it..., How it works...
- creating, Excel Export node used / Creating a classification tree financial summary using aggregate and an Excel Export node, How to do it..., How it works...
- classification trees
- used, to explore predictions of Neural Net / Using classification trees to explore the predictions of a Neural Network, How to do it..., How it works...
- clean downstream
- performing, Filter node used / Performing clean downstream of a calculation using a Filter node
- CLEM expression language
- functions / How it works...
- CLEM scripting
- about / Introduction
- best practices / CLEM scripting best practices
- shortcomings / CLEM scripting shortcomings
- cluster centers
- writing, to Excel for conditional formatting / Using aggregate to write cluster centers to Excel for conditional formatting, How to do it..., How it works...
- commercial goals / Understanding the goals of the business
- conditional formatting
- cluster centers, writing to Excel for / Using aggregate to write cluster centers to Excel for conditional formatting, How to do it..., How it works...
- constraints
- reviewing / Reviewing requirements, assumptions, and constraints
- contingencies
- defining / Identifying risks and defining contingencies
- conversion rates
- correlation matrices
- used, for removing redundant variables / Removing redundant variables using correlation matrices, How to do it..., How it works...
- costs
- evaluating / Evaluating costs and benefits
D
- data
- reformatting, for reporting with Transpose node / Reformatting data for reporting with a Transpose node, How to do it..., How it works..., There's more...
- preparing / Data preparation
- data integration
- missing data, evaluating / Running a Statistics node on anti-join to evaluate the potential missing data, Getting ready, How to do it..., How it works...
- Data miners
- about / Introduction
- data mining goals
- specifying / Specifying your data mining goals
- data understanding
- about / Data understanding
- date arithmetic
- performing / How to do it..., How it works...
- datetime_date() function / How it works...
- deployment
- about / Deployment
- Derive Count nodes / How it works...
- Derive node
- functions, nesting into / Nesting functions into one Derive node
- Derive State nodes / How it works...
- Dorian Pyle / Using the Feature Selection node creatively to remove or decapitate perfect predictors
- dummy keys
- used, for performing Cartesian product merge / Multiplying out using Cartesian product merge, user source, and derive dummy, How to do it..., There's more...
E
- else branch
- about / How it works...
- elseif branch
- about / How it works...
- empty aggregate
- used, to evaluate sample size / Using an empty aggregate to evaluate sample size, How to do it..., How it works..., A modified version
- evaluation
- about / Evaluation
- Excel Export node
- used, for creating classification tree financial summary / Creating a classification tree financial summary using aggregate and an Excel Export node, How to do it..., How it works...
F
- >< function / How it works...
- @FIELD function / How it works...
- Feature Selection node
- used, for detecting model instability / Detecting potential model instability early using the Partition node and Feature Selection node, How to do it..., How it works...
- used, to remove perfect predictors / Using the Feature Selection node creatively to remove or decapitate perfect predictors, How to do it..., There's more...
- field formatting
- changing, in Table node / Changing formatting of fields in a Table node, How to do it..., How it works..., There's more...
- Filter node
- used, for performing clean downstream / Performing clean downstream of a calculation using a Filter node
- First time donor
- about / How it works..., How to do it...
- flag variables
- creating, for aggregation / Creating flag variables for aggregation, How to do it..., There's more...
- full data model
- used, to address missing data / Using a full data model/partial data model approach to address missing data, How to do it..., How it works...
- functions
- nesting, into one Derive node / Nesting functions into one Derive node
G
- @GLOBAL variable / Imputing in-stream mean or median
- generated filters
- combining / Combining generated filters, How to do it..., How it works...
- graphs
- Grow button / How to do it...
H
- high skew variable
- transforming, with multiple Derive node / Transforming high skew and kurtosis variables with a multiple Derive node, How to do it..., How it works..., There's more...
- HTML reports
I
- IBM SPSS Modeler / Introduction
- if branch
- about / How it works...
- imbalanced target variable
- prior probabilities, incorporating for / How to do it..., How it works..., There's more...
- in-stream mean
- Inactive donor
- about / How it works..., How to do it...
- initial data
- interaction detection/feature creation
- Association Rules, using for / Using Association Rules for interaction detection/feature creation, How to do it..., How it works..., There's more...
- intof() function / How it works...
- iterative Neural network forecasts
J
- jackknife method
- Outliers, detecting with / Detecting outliers with the jackknife method, How to do it..., How it works..., Script section 1, Script section 2, Script section 3
K
- K-means cluster
- using, as alternative to anomaly detection / Using a single cluster K-means as an alternative to anomaly detection, How to do it..., How it works...
- K-means clustering
- K-means cluster solutions
- KNN
- used, to match similar cases / Using KNN to match similar cases, How to do it..., How it works...
- KPIs
- about / Understanding the objectives of your client
- kurtosis variable
- transforming, with multiple Derive node / Transforming high skew and kurtosis variables with a multiple Derive node, How to do it..., How it works..., There's more...
L
- @LAST_NON_BLANK field / How it works...
- Lapsing donor
- about / How it works..., How to do it...
- large datasets
- locchar function / How to do it...
- log transform / How it works...
- look-up table
- merging / Merging a lookup table, How to do it..., How it works...
M
- mean absolute percent error (MAPE) / Implementing champion/challenger model management
- Mean button / There's more...
- Means node
- used, for selecting variables / Selecting variables using the Means node, How to do it..., How it works..., There's more...
- Mean Squared Error (MSE) / Success criteria for estimation
- median
- Merge node
- speeding up, cache used / Speeding up merge with caching and optimization settings, How to do it..., How it works...
- speeding up, optimization settings used / Speeding up merge with caching and optimization settings, How to do it..., How it works...
- missing data
- exploring, @NULL multiple Derive used / Using an @NULL multiple Derive to explore missing data, How to do it..., How it works...
- evaluating, during data integration / Running a Statistics node on anti-join to evaluate the potential missing data, Getting ready, How to do it..., How it works...
- addressing, for binning scale variables / Binning scale variables to address missing data, How to do it..., How it works...
- addressing, full data model used / Using a full data model/partial data model approach to address missing data, How to do it..., How it works...
- addressing, partial data model used / Using a full data model/partial data model approach to address missing data, How to do it..., How it works...
- missing values
- imputing, from uniform distribution / Imputing missing values randomly from uniform or normal distributions, How to do it..., There's more...
- imputing, from normal distribution / Imputing missing values randomly from uniform or normal distributions, How to do it..., There's more...
- Modeling
- about / Introduction, Modeling
- model instability
- detecting, Partition node used / Detecting potential model instability early using the Partition node and Feature Selection node, How to do it..., How it works...
- detecting, Feature Selection node used / Detecting potential model instability early using the Partition node and Feature Selection node, How to do it..., How it works...
- models
- building, with outliers / Building models with and without outliers, How to do it..., How it works...
- building, without outliers / Building models with and without outliers, How to do it..., How it works...
- tuning, Auto Classifier used / Using Auto Classifier to tune models, How to do it..., How it works...
- Monte Carlo Simulation
- variable importance, quantifying with / Quantifying variable importance with Monte Carlo simulation, How to do it..., How it works..., Script section 2, There's more...
- MTTF (mean time to failure) / Rolling your own modeling algorithm – Weibull analysis
- multiple Derive nodes
- transformations, building with / Building transformations with multiple Derive nodes, How it works..., There's more...
- high skew variable, transforming with / Transforming high skew and kurtosis variables with a multiple Derive node, How to do it..., How it works..., There's more...
- kurtosis variable, transforming with / Transforming high skew and kurtosis variables with a multiple Derive node, How to do it..., How it works..., There's more...
N
- @NULL multiple Derive
- used, to explore missing data / Using an @NULL multiple Derive to explore missing data, How to do it..., How it works...
- Neural Net predictions
- classification trees, using / Using classification trees to explore the predictions of a Neural Network, How to do it..., How it works...
- neural network
- used, for searching similar records / Searching for similar records using a Neural Network for inexact matching, How to do it..., How it works...
- Neural Network Feature Selection
- neuro-fuzzy searching
- used, to find similar names / Using neuro-fuzzy searching to find similar names, How to do it..., There's more...
- about / Using neuro-fuzzy searching to find similar names
- New donor
- about / How it works..., How to do it...
- non-standard aggregation
- performing / Shuffle-down (nonstandard aggregation), How to do it..., How it works...
- nonstandard dates
- parsing / Parsing nonstandard dates, How to do it..., How it works...
- normal distribution
- missing values, imputing from / Imputing missing values randomly from uniform or normal distributions, How to do it..., There's more...
O
- OK button / How to do it..., How to do it..., How to do it...
- outlier report
- creating / How to do it..., How it works...
- Outliers
- detecting, with jackknife method / Detecting outliers with the jackknife method, How to do it..., How it works..., Script section 1, Script section 2, Script section 3
- outliers
- models, building with / Building models with and without outliers, How to do it..., How it works...
- models, building without / Building models with and without outliers, How to do it..., How it works...
P
- parameters
- used, in calculations / Using parameters instead of constants in calculations
- partial data model
- used, to address missing data / Using a full data model/partial data model approach to address missing data, How to do it..., How it works...
- Partition node
- used, for detecting model instability / Detecting potential model instability early using the Partition node and Feature Selection node, How to do it..., How it works...
- PCC / Success criteria for classification
- Percent Correct Classification (PCC) / There's more...
- perfect predictors
- removing, Feature Selection node used / Using the Feature Selection node creatively to remove or decapitate perfect predictors, How to do it..., There's more...
- Predictors button / How to do it...
- Preview button / How to do it...
- prior probabilities
- incorporating, for imbalanced target variable / How to do it..., How it works..., There's more...
Q
- quirk reports / Creating an Outlier report to give to SMEs
R
- random() / How it works...
- random() function / There's more...
- random imputation
- used, to match variable distribution / Using random imputation to match a variable's distribution, How to do it..., How it works..., There's more...
- Read Values button / How to do it..., How to do it..., How to do it..., How to do it..., How to do it...
- Records field / How to do it...
- redundant variables
- removing, correlation matrices used / Removing redundant variables using correlation matrices, How to do it..., How it works...
- requirements
- reviewing / Reviewing requirements, assumptions, and constraints
- resources inventory
- taking / Taking inventory of resources
- Return On Investment (ROI) / Other customized success criteria
- risks
- identifying / Identifying risks and defining contingencies
- Run button / How to do it..., How to do it...
S
- sample size
- evaluating, empty aggregate used / Using an empty aggregate to evaluate sample size, How to do it..., How it works..., A modified version
- sampling
- need for, evaluating / Evaluating the use of sampling for speed, How to do it..., How it works..., There's more...
- scale variables
- binning, to address missing data / Binning scale variables to address missing data, How to do it..., How it works...
- scientific goals / Understanding the goals of the business
- Select button / How to do it..., How to do it...
- Sequence function @DIFF1 / How it works...
- Sequence function @OFFSET / How it works...
- Sequence function @SINCE / How it works...
- sequence processing
- using / Sequence processing, How to do it...
- working / How it works..., There's more...
- service goals / Understanding the goals of the business
- similar cases
- matching, KNN used / Using KNN to match similar cases, How to do it..., How it works...
- similar names
- finding, neuro-fuzzy searching used / Using neuro-fuzzy searching to find similar names, How to do it..., There's more...
- similar records
- searching, neural network used / Searching for similar records using a Neural Network for inexact matching, How to do it..., How it works...
- Single-Antecedent association rules
- used, for selecting variables / Selecting variables using single-antecedent Association Rules, How to do it..., How it works..., There's more...
- Soundex codes
- producing / Producing longer Soundex codes, How to do it..., How it works...
- specific objectives
- connecting, to analytical results / Connecting specific objectives to analytical results
- Star donor
- about / How it works..., How to do it...
- Subject Matter Experts (SMEs) / Introduction
- substring() function / How it works...
- substring_between() function / There's more...
- success criteria
- for classification / Success criteria for classification
- for estimation / Success criteria for estimation
- SuperNode button / How to do it...
- Support Vector Machines (SVMs) / Using Auto Classifier to tune models
T
- Table node
- field formatting, changing / Changing formatting of fields in a Table node, How to do it..., How it works..., There's more...
- target variables
- specifying / The key to the translation – specifying target variables
- terminology
- defining / Defining terminology
- time-aligned cohorts
- TIMELAG_null variable / How it works...
- time series forecasts
- timestamp variable
- Tom Khabaza
- URL / Introduction
- to_integer() function / There's more..., How it works...
- to_string() function / How it works...
- transformations
- building, with multiple Derive nodes / Building transformations with multiple Derive nodes, How it works..., There's more...
- Transpose node
- data reformatting, for reporting with / Reformatting data for reporting with a Transpose node, How to do it..., How it works..., There's more...
U
- uniform distribution
- missing values, imputing from / Imputing missing values randomly from uniform or normal distributions, How to do it..., There's more...
V
- validation
- variable distribution
- matching, random imputation used / Using random imputation to match a variable's distribution, How to do it..., How it works..., There's more...
- variable importance
- quantifying, with Monte Carlo Simulation / Quantifying variable importance with Monte Carlo simulation, How to do it..., How it works..., Script section 2, There's more...
- variable names
- changing, Derive node used / Changing large numbers of variable names without scripting, How to do it..., How it works...
- variables
- selecting, CHAID Modeling node used / Selecting variables using the CHAID Modeling node, How to do it..., There's more...
- selecting, Means node used / Selecting variables using the Means node, How to do it..., How it works..., There's more...
- selecting, Single-Antecedent association rules used / Selecting variables using single-antecedent Association Rules, How to do it..., How it works..., There's more...
W
- Wealth2_null variable / How it works...
- Weibull analysis