Mastering SQL Server 2014 Data Mining

By: Amarpreet Singh Bassan, Debarchan Sarkar

Overview of this book

Whether you are new to data mining or are a seasoned expert, this book will provide you with the skills you need to successfully create, customize, and work with Microsoft Data Mining Suite. Starting with the basics, this book will cover how to clean the data, design the problem, and choose a data mining model that will give you the most accurate prediction.

Next, you will be taken through the various classification models such as the decision tree data model, neural network model, as well as Naïve Bayes model. Following this, you'll learn about the clustering and association algorithms, along with the sequencing and regression algorithms, and understand the data mining expressions associated with each algorithm. With ample screenshots that offer a step-by-step account of how to build a data mining solution, this book will ensure your success with this cutting-edge data mining system.

The Microsoft Clustering algorithm


Data mining models based on the Microsoft Clustering algorithm are aimed at identifying the relationships between the entities in a dataset and dividing them into logically related groups. This algorithm differs from the others in that it does not require a predictable column, because its primary purpose is to identify groups in the data rather than to predict the value of an attribute. These groupings can then be used to make predictions, identify exceptions, and so on. The algorithm is therefore used mainly in the data analysis phase, where the focus is on the existing data: we test our hypotheses about the relationships between entities in the data and look for exceptions (hidden relationships).
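To make the idea of grouping without a predictable column more concrete, here is a minimal Python sketch. It uses plain k-means with fixed starting centroids as a stand-in and is not the Microsoft implementation. The customer records (age, yearly income in thousands), the function names, and the starting centroids are all hypothetical values chosen for illustration. Once the groups are found, the record farthest from its own cluster center is flagged as a possible exception.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the per-attribute mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical input: (age, yearly income in thousands). No column is marked
# as a prediction target; every attribute is only an input to the grouping.
customers = [(25, 30), (27, 32), (30, 35), (52, 90), (55, 95), (58, 88), (40, 200)]

# Assumed starting centroids, fixed so that the run is deterministic.
centroids, labels = kmeans(customers, [(25.0, 30.0), (55.0, 95.0)])

# The record farthest from its own cluster center is a candidate exception.
distances = [dist(p, centroids[label]) for p, label in zip(customers, labels)]
exception = customers[distances.index(max(distances))]

print(labels)
print(exception)
```

The attributes are left unscaled here to keep the sketch short, so income dominates the distance. In practice you would normally normalize the inputs first.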

The following screenshot shows a data mining model based on the Microsoft Clustering algorithm. This can be seen in the SSDT Mining Models tab.

An important observation regarding the preceding screenshot...