Data Analysis with IBM SPSS Statistics

By: Ken Stehlik-Barry, Anthony Babinec

Overview of this book

SPSS Statistics is a software package used for batched and non-batched statistical analysis. Analytical tools such as SPSS can readily provide even a novice user with an overwhelming amount of information and a broad range of options for analyzing patterns in the data. The journey starts with installing and configuring SPSS Statistics for first use and exploring the data to understand its potential (as well as its limitations). You will then choose the right statistical analysis technique, such as regression or classification, for your data, and work with graphs and charts to visualize your findings. With this information in hand, the discovery of patterns within the data can be undertaken. Finally, the high-level objective of developing predictive models that can be applied to other situations is addressed. By the end of this book, you will have a firm understanding of the various statistical analysis techniques offered by SPSS Statistics and be able to use them for data analysis.

Preface

What this book covers

What you need for this book

Who this book is for

Reader feedback

Customer support

Installing and Configuring SPSS

The SPSS installation utility

Launching and using SPSS

Setting parameters within the SPSS software

Executing a basic SPSS session

Accessing and Organizing Data

Accessing and organizing data overview

Reading Excel files

Reading delimited text data files

Saving IBM SPSS Statistics files

Reading IBM SPSS Statistics files

Demo - first look at the data - frequencies

Variable properties

Statistics for Individual Data Elements

Getting the sample data

Descriptive statistics for numeric fields

Discovering coding issues using frequencies

Explore procedure

Dealing with Missing Data and Outliers

Visually Exploring the Data

Graphs available in SPSS procedures

Sampling, Subsetting, and Weighting

Select cases dialog box

Selecting a random sample of cases

Creating New Data Elements

Transforming fields in SPSS

The RECODE command

The COMPUTE command

The DO IF/ELSE IF command

General points regarding SPSS transformation commands

Adding and Matching Files

SPSS Statistics commands to merge files

Example of one-to-many merge - Northwind database

One-to-one merge - two data subsets from GSS2016

Example of combining cases using ADD FILES

Aggregating and Restructuring Data

Using aggregation to add fields to a file

Aggregating up one level

Second level aggregation

Matching the aggregated file back to find specific records

Restructuring rows to columns

Crosstabulation Patterns for Categorical Data

Percentages in crosstabs

Comparing Means and ANOVA

SPSS procedures for comparing Means

Post hoc comparisons

Correlations

Pearson correlations

Listwise versus pairwise missing values

Pivoting table editing to enhance correlation matrices

Visualizing correlations with scatterplots

Rank order correlations

Partial correlations

Linear Regression

Assumptions of the classical linear regression model

Example - motor trend car data

Multiple regression - Model-building strategies

Principal Components and Factor Analysis

Choosing between principal components analysis and factor analysis

PCA example - violent crimes

Factor analysis - abilities

Clustering

Overview of cluster analysis

Overview of SPSS Statistics cluster analysis procedures

Hierarchical cluster analysis example

K-means cluster analysis example

Twostep cluster analysis example

Discriminant Analysis

Descriptive discriminant analysis

Predictive discriminant analysis

Assumptions underlying discriminant analysis

Statistical and graphical summary of the data

Discriminant analysis setup - key decisions

Examining the results

Scoring new observations

Assumptions of the classical linear regression model

Multiple regression fits a linear model by relating the predictors to the target variable. The model has the following form:

Y = B0 + B1 * X1 + B2 * X2 + … + Bp * Xp + e

Here, Y is the target variable, the Xs are the predictors, and the e term is the random disturbance. The Bs are capitalized to indicate that they are population parameters. Estimates of the Bs are found from the sample such that the sum of squares of the sample errors is minimized. The term ordinary least squares regression captures this feature.
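To make the least squares idea concrete outside of SPSS, the following is a minimal Python sketch rather than the book's own procedure. It estimates the Bs by solving the normal equations for a small made-up dataset. The function names fit_ols and sum_squared_errors and the data values are hypothetical, chosen only for illustration.

```python
def fit_ols(X, y):
    """Return sample estimates [b0, b1, ..., bp] that minimize the
    sum of squared errors, by solving the normal equations."""
    # Prepend a column of ones so that b0 acts as the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in X]
    k = len(rows[0])
    # Build (A^T A) and (A^T y) for the normal equations (A^T A) b = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Solve with Gauss-Jordan elimination and partial pivoting.
    aug = [ata[i] + [aty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def sum_squared_errors(X, y, b):
    """Sum of squared differences between y and the fitted linear function."""
    total = 0.0
    for row, yi in zip(X, y):
        predicted = b[0] + sum(bj * xj for bj, xj in zip(b[1:], row))
        total += (yi - predicted) ** 2
    return total


# Hypothetical sample: two predictors, with Y = 1 + 2*X1 + 3*X2 plus a small disturbance.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
disturbance = [0.1, -0.1, 0.0, -0.1, 0.1]
y = [1 + 2 * x1 + 3 * x2 + e for (x1, x2), e in zip(X, disturbance)]

b_hat = fit_ols(X, y)
print("estimates:", b_hat)
print("sum of squared errors:", sum_squared_errors(X, y, b_hat))
```

The estimates should land close to, but not exactly on, the values used to generate the data. No other choice of coefficients gives a smaller sum of squared errors on this sample, which is the property the name ordinary least squares refers to.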

The assumptions of the classical linear regression model are as follows:

The target variable can be calculated as a linear function of a specific set of predictor variables plus a disturbance term. The coefficients in this linear function are constant.
The expected value of the disturbance term is zero.
The disturbance...