Silhouette information is a measure used to validate a clustering of data. In the previous recipe, we mentioned that evaluating a clustering involves measuring how tightly the data is grouped within each cluster and how far apart the different clusters are from each other. The silhouette coefficient combines these two measurements, the intracluster and the intercluster distance, into a single value. The output value ranges from -1 to 1: a value close to 1 means the observation sits well inside its cluster, a value near 0 means it lies on the boundary between two clusters, and a negative value suggests it may have been assigned to the wrong cluster. In this recipe, we will introduce how to compute silhouette information.
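To make the combination of intracluster and intercluster distance concrete, here is a minimal sketch in base R that computes the silhouette width by hand. It assumes the standard definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from observation i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster. The function name `silhouette_widths`, the toy data, and the labels are all made up for illustration and are not taken from the recipe.

```r
# Silhouette width for every observation, computed from scratch in base R.
# `x` is a numeric matrix (rows = observations); `labels` holds one cluster id
# per row. Assumes at least two distinct clusters.
silhouette_widths <- function(x, labels) {
  d <- as.matrix(dist(x))          # pairwise Euclidean distances
  n <- nrow(d)
  clusters <- unique(labels)
  sapply(seq_len(n), function(i) {
    own <- labels == labels[i]
    if (sum(own) == 1) return(0)   # convention: a singleton cluster scores 0
    # a(i): mean distance to the other members of i's own cluster
    # (d[i, i] is 0, so dividing by sum(own) - 1 excludes i itself)
    a <- sum(d[i, own]) / (sum(own) - 1)
    # b(i): smallest mean distance to the members of any other cluster
    b <- min(sapply(setdiff(clusters, labels[i]),
                    function(k) mean(d[i, labels == k])))
    (b - a) / max(a, b)
  })
}

# Toy data (hypothetical): two well-separated groups in two dimensions.
toy <- rbind(c(1, 1), c(1, 2), c(2, 1),
             c(8, 8), c(8, 9), c(9, 8))
toy_labels <- c(1, 1, 1, 2, 2, 2)

widths <- silhouette_widths(toy, toy_labels)
avg_width <- mean(widths)   # average silhouette width of the whole clustering

print(round(widths, 3))
print(round(avg_width, 3))
```

In practice you would rarely write this yourself: the `silhouette` function in the `cluster` package takes a vector of cluster assignments and a distance object and returns the same per-observation widths, which can then be summarized or plotted.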
Machine Learning with R Cookbook