Data Analysis with IBM SPSS Statistics

By: Ken Stehlik-Barry, Anthony Babinec

Overview of this book

SPSS Statistics is a software package used for logical batched and non-batched statistical analysis. Analytical tools such as SPSS can readily provide even a novice user with an overwhelming amount of information and a broad range of options for analyzing patterns in the data. The journey starts with installing and configuring SPSS Statistics for first use and exploring the data to understand its potential (as well as its limitations). You will then choose the right statistical analysis technique, such as regression or classification, for your data, and work with graphs and charts to visualize your findings. With this information in hand, the discovery of patterns within the data can be undertaken. Finally, the high-level objective of developing predictive models that can be applied to other situations will be addressed. By the end of this book, you will have a firm understanding of the various statistical analysis techniques offered by SPSS Statistics and be able to use them for data analysis with confidence.
Table of Contents (17 chapters)
4. Dealing with Missing Data and Outliers
10. Crosstabulation Patterns for Categorical Data

Selecting a random sample of cases

SAMPLE permanently draws a random sample of cases, so all subsequent procedures process only the sampled cases.

SAMPLE accepts two different specifications. The first is a decimal value between 0 and 1 that gives the approximate fraction of cases you would like to see in the sample. The second selects an exact-size random sample: specify a positive integer that is less than the file size, followed by the keyword FROM and the size of the active dataset.
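As a minimal sketch, the two specifications might be written as follows. The values 500 and 2867 are assumptions chosen for illustration: 2867 stands in for the number of cases in the active dataset and should be replaced with your actual file size. Run only one of the two commands, because sampling is permanent and a second SAMPLE would draw from the already reduced set of cases.

```
* Form 1: approximate fraction of cases (about 30 percent).
SAMPLE .30.

* Form 2: exact-size sample (500 and 2867 are assumed values).
SAMPLE 500 FROM 2867.
```

With the first form the number of cases selected varies from run to run around the requested fraction, whereas the second form is the one to use when a specific sample size is required.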

To illustrate sampling, suppose you want to draw an approximately 30 percent sample from the GSS2016 active file. We will demonstrate the effect of sampling by obtaining statistics on age before and after sampling.

Here is the SPSS code:

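* Baseline statistics on age for the full file.
DESCRIPTIVES VARIABLES=age
  /STATISTICS=MEAN STDDEV MIN MAX.
* Clear any filter and case range, then draw an approximately 30 percent sample.
FILTER OFF.
USE ALL.
SAMPLE .30.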
DESCRIPTIVES VARIABLES...