Predictive Analytics Using Rattle and Qlik Sense

By: Ferran Garcia Pagans, Fernando G Pagans

Table of Contents (16 chapters)
Predictive Analytics Using Rattle and Qlik Sense
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Index

A

  • analytics / Analytics, predictive analytics, and data visualization
  • association analysis
    • about / Association analysis
  • associative logic
    • about / Associative logic
    • working / Associative logic
  • Attribute-Relation File Format (ARFF) / Loading data

B

  • bar chart
    • creating / Creating a bar chart
    • personalizing / Creating a bar chart
  • Bike Sharing Dataset
    • reference / Understanding the bike rental problem
  • bike sharing system
    • bike rental problem / Understanding the bike rental problem
    • data, exploring with Qlik Sense / Exploring the data with Qlik Sense
  • binning
    • about / Binning
  • boosting method / Boosting
  • Business Intelligence (BI)
    • about / In-memory analysis

C

  • casual users / Understanding the bike rental problem
  • categorical variable
    • about / Datasets, observations, and variables
  • categorical variables
    • about / Categorical variables
    • bar chart / Bar plots
    • mosaic plot / Mosaic plots
  • centroid / Centroid-based clustering using the K-means algorithm
  • charts
    • creating / Creating charts
    • pie chart, creating / Creating charts
    • bar chart, creating / Creating charts
    • Data menu / The Data menu
    • Sorting menu / The Sorting menu
    • Add-ons menu / The Add-ons menu
    • Appearance menu / The Appearance menu
  • classifiers performance
    • measuring / Measuring the performance of classifiers
    • confusion matrix / Confusion matrix, accuracy, sensitivity, and specificity
    • accuracy / Confusion matrix, accuracy, sensitivity, and specificity
    • sensitivity / Confusion matrix, accuracy, sensitivity, and specificity
    • specificity / Confusion matrix, accuracy, sensitivity, and specificity
    • types of predictions / Confusion matrix, accuracy, sensitivity, and specificity
    • Risk Chart, obtaining / Risk Chart
    • ROC Curve / ROC Curve
  • cleanup options
    • about / Cleaning up
  • cluster analysis
    • about / Cluster analysis
    • centroid-based clustering, K-means algorithm used / Centroid-based clustering using the K-means algorithm
    • customer segmentation, with K-means clustering / Customer segmentation with K-means clustering
    • data, preparing in Qlik Sense / Preparing the data in Qlik Sense
    • customer segmentation sheet, creating in Qlik Sense / Creating a customer segmentation sheet in Qlik Sense
  • Comma Separated Value (CSV) / Loading data
  • Complexity Parameter (CP) / Cross-validation
  • Comprehensive R Archive Network (CRAN) / Downloading and installing R
  • correlation, among input variables
    • about / Correlations among input variables
  • correlation analysis, with Rattle / Correlation Analysis with Rattle
  • correlation coefficient
    • about / Correlations among input variables
  • credit risks
    • classifying, with Decision Tree / Using a Decision Tree to classify credit risks
  • cross-validation
    • about / Cross-validation
    • implementing / Cross-validation
  • CSV file
    • loading / Loading a CSV File
  • customer buying behavior / Customer segmentation and customer buying behavior
  • customer segmentations
    • about / Customer segmentation and customer buying behavior
    • types / Customer segmentation and customer buying behavior

D

  • DAR methodology
    • reference link / Further learning
  • Dashboard Analysis and Reporting (DAR)
    • about / In-memory analysis, The DAR approach
  • dashboards
    • about / Data analysis, data applications, and dashboards, Data applications and dashboards
  • data
    • loading / Loading data, Loading data and creating a data model
    • rescaling / Rescaling data
    • Impute option, used for dealing with missing values / Using the Impute option to deal with missing values
    • exporting / Exporting data
    • preparing / Preparing the data
    • analyzing / Analyzing your data
  • data analysis, Qlik Sense
    • about / Qlik Sense data analysis
    • in-memory analysis / In-memory analysis
    • associative logic / Associative experience
  • data applications
    • about / Data analysis, data applications, and dashboards, Data applications and dashboards
  • data exploring, with Qlik Sense
    • about / Exploring the data with Qlik Sense
    • temporal patterns, checking / Checking for temporal patterns
    • visual correlation analysis / Visual correlation analysis
  • data model
    • creating / Loading data and creating a data model, Preparing the data
    • checking / Preparing the data
  • Data Science
    • reference URL / Further learning
  • data science / Datasets, observations, and variables
  • dataset
    • about / Datasets, observations, and variables
    • variable description / Datasets, observations, and variables
    • reference / Customer segmentation with K-means clustering
    • instant / Understanding the bike rental problem
    • dteday / Understanding the bike rental problem
    • season / Understanding the bike rental problem
    • yr / Understanding the bike rental problem
    • mnth / Understanding the bike rental problem
    • hr / Understanding the bike rental problem
    • weekday / Understanding the bike rental problem
    • workingday / Understanding the bike rental problem
    • weathersit / Understanding the bike rental problem
    • temp / Understanding the bike rental problem
    • atemp / Understanding the bike rental problem
    • hum / Understanding the bike rental problem
    • windspeed / Understanding the bike rental problem
    • casual / Understanding the bike rental problem
    • registered / Understanding the bike rental problem
    • cnt / Understanding the bike rental problem
  • datasets
    • partitioning / Partitioning datasets and model optimization
  • data storytelling, Qlik Sense
    • about / Data storytelling with Qlik Sense
    • audience / Data storytelling with Qlik Sense
    • objective / Data storytelling with Qlik Sense
    • key messages / Data storytelling with Qlik Sense
    • story / Data storytelling with Qlik Sense
    • links / Data storytelling with Qlik Sense
    • reviewing / Data storytelling with Qlik Sense
    • new story, creating / Creating a new story
  • data transformation
    • about / Transforming data
    • Rattle, used / Transforming data with Rattle
    • variables, recoding / Recoding variables
    • binning / Binning
    • indicator variables / Indicator variables
  • data visualization / Analytics, predictive analytics, and data visualization
    • books, for references / Further learning
  • data visualization, Qlik Sense
    • about / Data visualization in Qlik Sense
    • visualization toolbox / Visualization toolbox
    • bar chart, creating / Creating a bar chart
  • Decision Tree
    • creating / Entropy and information gain
    • using, for credit risk classification / Using a Decision Tree to classify credit risks
    • URL / Using a Decision Tree to classify credit risks
    • loan applications, scoring with Rattle / Using Rattle to score new loan applications
    • Qlik Sense application, creating / Creating a Qlik Sense application to predict credit risks
  • Decision Tree Learning
    • about / Decision Tree Learning
    • advantages / Decision Tree Learning
    • disadvantages / Decision Tree Learning
  • Default? Attribute / Confusion matrix, accuracy, sensitivity, and specificity
  • default charts, Qlik Sense
    • Bar chart / Visualization toolbox
    • Combo chart / Visualization toolbox
    • Filter pane / Visualization toolbox
    • Line chart / Visualization toolbox
    • Map / Visualization toolbox
    • Pie chart / Visualization toolbox
    • Scatter plot / Visualization toolbox
    • Table / Visualization toolbox
    • Pivot Table / Visualization toolbox
    • Text & image / Visualization toolbox
    • Treemap / Visualization toolbox
    • Extensions / Visualization toolbox
  • dendrogram
    • about / Hierarchical clustering
  • descriptive analytics
    • about / Machine learning – unsupervised and supervised learning
  • disadvantages, Decision Tree Learning
    • unstable / Decision Tree Learning
    • overfitting / Decision Tree Learning
  • distributions
    • visualizing / Visualizing distributions
    • numeric variables / Numeric variables
    • categorical variables / Categorical variables

E

  • Ensemble methods
    • about / Ensemble classifiers
    • URL / Ensemble classifiers
    • boosting / Boosting
    • Random Forest / Random Forest
    • Support Vector Machine (SVM) / Support Vector Machines
  • entropy
    • about / Entropy and information gain
  • environment
    • installing / Installing the environment
  • error rate / Confusion matrix, accuracy, sensitivity, and specificity
  • Explore Missing option
    • about / The Explore Missing and Hierarchical options

F

  • fact table
    • about / Associative experience

G

  • GNU General Public License (GPL) / Introducing R, Rattle, and Qlik Sense Desktop
  • Graphical User Interface (GUI) / Introducing R, Rattle, and Qlik Sense Desktop

H

  • hierarchical clustering
    • about / Hierarchical clustering
  • Hierarchical option
    • about / The Explore Missing and Hierarchical options

I

  • indicator variables
    • about / Indicator variables
    • Join Categories option / Join Categories
    • As Category option / As Category
    • As Numeric option / As Numeric
  • information gain
    • about / Entropy and information gain
  • input variables
    • about / Datasets, observations, and variables

K

  • Kaggle
    • about / Datasets, observations, and variables
    • URL / Datasets, observations, and variables, Regression performance
  • Key Performance Indicator (KPI)
    • about / Data analysis, data applications, and dashboards
  • Key Performance Indicators (KPIs) / Exploring Qlik Sense Desktop
  • kurtosis
    • about / Measures of the shape of the distribution – skewness and kurtosis
    • URL / Measures of the shape of the distribution – skewness and kurtosis

L

  • labeled dataset
    • about / Machine learning – unsupervised and supervised learning
  • Logistic Regression / Linear and Logistic Regression
  • Lower Confidence Level / Measures of the shape of the distribution – skewness and kurtosis

M

  • Machine Learning (ML)
    • about / Machine learning – unsupervised and supervised learning
    • supervised learning / Machine learning – unsupervised and supervised learning
    • unsupervised learning / Machine learning – unsupervised and supervised learning
    • cluster analysis / Cluster analysis
    • hierarchical clustering / Hierarchical clustering
    • association analysis / Association analysis
  • measures of central tendency
    • mean / Measures of central tendency – mean, median, and mode
    • median / Measures of central tendency – mean, median, and mode
    • mode / Measures of central tendency – mean, median, and mode
  • measures of dispersion
    • about / Measures of dispersion – range, quartiles, variance, and standard deviation
    • range / Range
    • quartiles / Quartiles
    • variance / Variance
    • standard deviation / Standard deviation
  • menus, charts
    • Data menu / The Data menu
    • Sorting menu / The Sorting menu
    • Add-ons menu / The Add-ons menu
    • Appearance menu / The Appearance menu
  • model evaluation
    • about / Model evaluation
    • performing / Model evaluation
    • new data, scoring / Scoring new data
  • model optimization / Partitioning datasets and model optimization
  • models
    • Linear Regression / Linear and Logistic Regression
    • Logistic Regression / Linear and Logistic Regression
    • Neural Networks / Neural Networks
  • MOOC course
    • URL / Further learning
  • Multiple Linear Regression / Linear and Logistic Regression

N

  • Neural Network model
    • about / Neural Networks
    • input layer / Neural Networks
    • hidden layer / Neural Networks
    • output layer / Neural Networks
  • nominal categorical variable
    • about / Datasets, observations, and variables
  • numeric variable
    • about / Datasets, observations, and variables
  • numeric variables
    • about / Numeric variables
    • Box Plot / Box plots
    • histogram / Histograms
    • cumulative plot / Cumulative plots

O

  • Open Database Connectivity (ODBC) / Loading data
  • ordinal categorical variable
    • about / Datasets, observations, and variables
  • output variables
    • about / Datasets, observations, and variables
  • overfitting / Underfitting and overfitting

P

  • predictive analytics / Analytics, predictive analytics, and data visualization
    • about / Machine learning – unsupervised and supervised learning
  • predictive analytics process
    • steps / Analytics, predictive analytics, and data visualization

Q

  • Qlik
    • about / Visualization toolbox
  • Qlik Branch
    • URL / Visualization toolbox
  • Qlik Community
    • URL / Visualization toolbox
  • Qlik home page
    • URL / Installing Qlik Sense Desktop
  • Qlik Market
    • URL / Visualization toolbox
  • Qlik Sense
    • data visualization / Data visualization in Qlik Sense
    • default charts / Visualization toolbox
    • data analysis / Qlik Sense data analysis
    • data storytelling / Data storytelling with Qlik Sense
    • about / Scoring new data
    • references / Further learning
  • Qlik Sense application
    • creating, for predicting credit risks / Creating a Qlik Sense application to predict credit risks
    • creating / Creating a Qlik Sense App to control the activity
  • Qlik Sense Desktop
    • ways of using / Purpose of the book
    • about / Introducing R, Rattle, and Qlik Sense Desktop
    • installing / Installing Qlik Sense Desktop
    • exploring / Exploring Qlik Sense Desktop
    • URL / Further learning
  • Qlik Sense Desktop Tutorials
    • about / Visualization toolbox
  • quartiles
    • about / Quartiles
    • URL / Quartiles

R

  • R
    • about / Introducing R, Rattle, and Qlik Sense Desktop, Scoring new data
    • downloading / Downloading and installing R
    • installing / Downloading and installing R
    • installation, testing with R Console / Starting the R Console to test your R installation
  • R-Square / Predicted versus Observed Plot
  • Random Forest / Random Forest
  • range
    • about / Range
  • Rattle
    • about / Introducing R, Rattle, and Qlik Sense Desktop, Scoring new data
    • downloading / Downloading and installing Rattle
    • installing / Downloading and installing Rattle
    • used, for scoring loan applications / Using Rattle to score new loan applications
    • models / Other models
  • Rattle, using, for forecasting
    • about / Using Rattle to forecast the demand
    • correlation analysis / Correlation Analysis with Rattle
    • model, creating / Building a model
    • performance, improving / Improving performance
  • R Console
    • starting, for testing R installation / Starting the R Console to test your R installation
  • registered users / Understanding the bike rental problem
  • regression performance
    • measuring / Regression performance
    • predicted, versus observed plot / Predicted versus Observed Plot
  • rescaling
    • about / Rescaling data
    • data / Rescaling data
  • Risk Chart
    • about / Risk Chart
    • obtaining / Risk Chart
  • ROC Curve
    • about / ROC Curve
  • roles, variable
    • input / Loading data
    • target / Loading data
    • risk / Loading data
    • identifier / Loading data
    • ident / Loading data
    • Ignore / Loading data

S

  • simple data app
    • creating / Creating a simple data app
  • Simple Linear Regression / Linear and Logistic Regression
  • skewness
    • about / Measures of the shape of the distribution – skewness and kurtosis
    • URL / Measures of the shape of the distribution – skewness and kurtosis
  • standard deviation
    • about / Standard deviation
  • Standard Error / Measures of the shape of the distribution – skewness and kurtosis
  • summary reports
    • about / Summary reports
    • measures of central tendency / Measures of central tendency – mean, median, and mode
    • measures of dispersion / Measures of dispersion – range, quartiles, variance, and standard deviation
    • measures of shape of distribution / Measures of the shape of the distribution – skewness and kurtosis
  • supervised learning
    • about / Machine learning – unsupervised and supervised learning
  • Support Vector Machine (SVM) / Support Vector Machines

T

  • target variables
    • about / Datasets, observations, and variables
  • text summaries
    • about / Text summaries
    • summary reports / Summary reports
    • missing values, displaying / Showing missing values
  • training dataset
    • about / Machine learning – unsupervised and supervised learning
  • types of predictions, classifiers performance
    • True Positive / Confusion matrix, accuracy, sensitivity, and specificity
    • False Positive / Confusion matrix, accuracy, sensitivity, and specificity
    • True Negative / Confusion matrix, accuracy, sensitivity, and specificity
    • False Negative / Confusion matrix, accuracy, sensitivity, and specificity

U

  • UCI Machine Learning Repository
    • reference / Regression performance
  • underfitting / Underfitting and overfitting
  • unlabeled dataset
    • about / Association analysis
  • unlabeled datasets
    • about / Machine learning – unsupervised and supervised learning
  • unsupervised learning
    • about / Machine learning – unsupervised and supervised learning
  • Upper Confidence Level / Measures of the shape of the distribution – skewness and kurtosis
  • user groups
    • executive management / Data analysis, data applications, and dashboards
    • middle managers / Data analysis, data applications, and dashboards
    • analysts / Data analysis, data applications, and dashboards

V

  • variable
    • about / Loading data
  • variance
    • about / Variance
  • visualization toolbox
    • about / Visualization toolbox

W

  • Weka / Loading data