IBM SPSS Modeler Essentials

By: Jesus Salcedo, Keith McCormick

Overview of this book

IBM SPSS Modeler lets you apply predictive analytics quickly and efficiently and gain insights from your data. With almost 25 years of history, Modeler is the most established and comprehensive data mining workbench available. Since it is popular in corporate settings, widely available in university settings, and highly compatible with all the latest technologies, it is the perfect way to start your data science and machine learning journey. This book takes a detailed, step-by-step approach to introducing data mining using the de facto standard process, CRISP-DM, and Modeler’s easy-to-learn “visual programming” style. You will learn how to read data into Modeler, assess data quality, prepare your data for modeling, find interesting patterns and relationships within your data, and export your predictions. Using a single case study throughout, this intentionally short and focused book sticks to the essentials. The authors have drawn upon their decades of teaching thousands of new users to choose those aspects of Modeler that you should learn first, so that you get off to a good start using proven best practices. This book provides an overview of various popular data modeling techniques and presents a detailed case study of how to use CHAID, a decision tree model. Assessing a model’s performance is as important as building it, and this book will also show you how to do that. Finally, you will see how you can score new data and export your predictions. By the end of this book, you will have a firm understanding of the basics of data mining and how to effectively use Modeler to build predictive models.
Table of Contents (19 chapters)
Title Page
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Customer Feedback
Dedication
Preface

Modeler stream rules


You may have noticed that in the previous example, we connected the Var. File node to the Table node and this worked fine. However, what if instead we tried to connect the Table node to the Var. File node? Let's try it:

  1. Right-click the Table node.
  2. Look for the Connect option in the Context menu (notice that the Connect option does not exist).

Let's try something different:

  1. Bring a Statistics File node onto the canvas.
  2. Right-click on the Var. File node.
  3. Select Connect from the Context menu.
  4. Click the Statistics File node (notice that you get an error message when you try to connect these two nodes).

The reason we are experiencing these issues is that there are rules for creating Modeler streams.

Modeler streams are typically composed of three types of nodes: Source, Process, and Terminal nodes. Some connections between these nodes make sense in the context of Modeler, and others are not allowed.
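The two failed attempts above suggest a simple way to think about these rules, which can be sketched in a few lines of Python. This is only an illustrative model, not Modeler's own scripting API: the `NodeKind` enum and the `can_connect` function are hypothetical names, and the sketch assumes just two restrictions, namely that nothing can be connected from a Terminal node and nothing can be connected into a Source node.

```python
from enum import Enum


class NodeKind(Enum):
    """Hypothetical labels for the three node types described in the text."""
    SOURCE = "source"      # reads data, e.g. Var. File or Statistics File
    PROCESS = "process"    # works on data in the middle of a stream
    TERMINAL = "terminal"  # ends a branch of the stream, e.g. Table


def can_connect(upstream: NodeKind, downstream: NodeKind) -> bool:
    """Return True if data may flow from `upstream` into `downstream`.

    Assumed rules (a simplification for illustration only):
    - a Terminal node ends a branch, so it has no outgoing connection;
    - a Source node starts a stream, so it accepts no incoming connection.
    """
    if upstream is NodeKind.TERMINAL:
        return False
    if downstream is NodeKind.SOURCE:
        return False
    return True


# The three attempts from the walkthrough above.
print(can_connect(NodeKind.SOURCE, NodeKind.TERMINAL))  # Var. File -> Table
print(can_connect(NodeKind.TERMINAL, NodeKind.SOURCE))  # Table -> Var. File
print(can_connect(NodeKind.SOURCE, NodeKind.SOURCE))    # Var. File -> Statistics File
```

Under these assumptions, only the first connection is accepted, which matches what happened on the canvas: the Table node offered no Connect option, and linking the Var. File node to the Statistics File node produced an error.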

In terms of general rules, streams always start with a Source node (a node from the Sources...