IBM SPSS Modeler Cookbook

Overview of this book

IBM SPSS Modeler is a data mining workbench that enables you to explore data, identify important relationships that you can leverage, and build predictive models quickly, allowing your organization to base its decisions on hard data, not hunches or guesswork. IBM SPSS Modeler Cookbook takes you beyond the basics and shares the tips, the timesavers, and the workarounds that experts use to increase productivity and extract maximum value from data. The authors of this book are among the very best of these exponents, gurus who, in their brilliant and imaginative use of the tool, have pushed back the boundaries of applied analytics. By reading this book, you are learning from practitioners who have helped define the state of the art. Follow the industry-standard data mining process, gaining new skills at each stage, from loading data to integrating results into everyday business practices. Get a handle on the most efficient ways of extracting data from your own sources and preparing it for exploration and modeling. Master the best methods for building models that will perform well in the workplace. Go beyond the basics and get the full power of your data mining workbench with this practical guide.

Calculating and comparing conversion rates


There are times when you need to transform a variable to better answer a question or to gain additional insight. In this recipe, we will calculate the conversion rate: the ratio of donors to total prospective donors. The data set already has a donate/not donate variable in the form of TARGET_B. We will calculate something similar for each of the campaigns, which lets us present the results on a line chart and look at trends.
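As a rough illustration of the ratio described above, the following Python sketch computes a conversion rate from a list of 0/1 donation flags. The function name `conversion_rate` and the sample values are hypothetical and are not taken from the stream or the data set; the only point is that the mean of a 0/1 flag equals the share of prospects who donated.

```python
def conversion_rate(flags):
    """Return donors / total prospects, given one 0/1 flag per prospect."""
    flags = list(flags)
    if not flags:
        raise ValueError("need at least one prospect")
    # The mean of a 0/1 flag is the proportion of 1s.
    return sum(flags) / len(flags)


# Hypothetical TARGET_B-style values for eight prospects (1 = donated).
target_b = [0, 1, 0, 0, 1, 0, 0, 0]
rate = conversion_rate(target_b)
print(f"{rate:.1%}")  # 2 donors out of 8 prospects -> 25.0%
```

This is why requesting only the mean of such a flag is enough to read off a conversion rate.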

Getting ready

We will start with the Conversion Rates.str stream.

How to do it...

  1. Open the stream and edit the Derive node. Note that it is a multiple derive, so it produces several new variables.

  2. Edit the Statistics node, verify that it is requesting Mean only, and run it.

  3. Add an Aggregate node with no key variables, but with all of the new campaign date variables from RDATE_7_CRate through RDATE_24_CRate.

  4. Add a Transpose node. We will use the prefix CRate and we only need one new variable.

  5. Add a Derive node with @INDEX as the formula. We will call...
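The data flow in the steps above can be sketched outside Modeler in plain Python. This is only an approximation under stated assumptions: the recipe does not show the Derive formula, so the sketch assumes each `_CRate` field is a 0/1 flag that is 1 when the matching RDATE field is populated. The sample records, the three campaigns shown, and the output column names (`position`, `campaign`, `CRate1`) are all hypothetical.

```python
# Hypothetical records: one gift-received date per campaign, None if no gift.
records = [
    {"RDATE_7": "9601", "RDATE_8": None, "RDATE_9": "9512"},
    {"RDATE_7": None, "RDATE_8": None, "RDATE_9": "9511"},
    {"RDATE_7": None, "RDATE_8": "9602", "RDATE_9": None},
    {"RDATE_7": None, "RDATE_8": None, "RDATE_9": "9512"},
]
campaigns = ["RDATE_7", "RDATE_8", "RDATE_9"]

# Multiple derive (assumed formula): one 0/1 flag per campaign field.
derived = [
    {f"{c}_CRate": 0 if row[c] is None else 1 for c in campaigns}
    for row in records
]

# Aggregate with no key fields: a single row holding the mean of each flag.
aggregated = {
    name: sum(row[name] for row in derived) / len(derived)
    for name in derived[0]
}

# Transpose: each campaign field becomes a row with one value column.
# The running index (1-based, like @INDEX) gives the line chart an x-axis.
transposed = [
    {"position": i, "campaign": name, "CRate1": value}
    for i, (name, value) in enumerate(aggregated.items(), start=1)
]

for row in transposed:
    print(row["position"], row["campaign"], row["CRate1"])
```

Aggregating without a key collapses the whole file into one row of means, and transposing turns that wide row into one row per campaign, which is the shape a line chart needs.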