R for Data Science

By: Dan Toomey

Questions


Factual

  • Try a range of iteration counts when determining the clusters present.

  • Try using some of the other, non-default methods to determine clusters.

  • Which clustering method would work best with your data?
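
One possible starting point for the first two exercises is sketched below. It is only an illustration under stated assumptions: it uses base R's kmeans() on the built-in iris measurements as a stand-in for the chapter's data, and the iteration caps, the choice of three centers, and the variable names are all hypothetical. It loops over several values of iter.max and over the three algorithm families that kmeans() accepts ("Hartigan-Wong" is the default), recording the total within-cluster sum of squares for each combination.

```r
# Stand-in data (an assumption): the four numeric columns of the built-in iris set.
x <- scale(iris[, 1:4])

set.seed(42)  # k-means starts from random centers, so fix the seed for repeatability

iterations <- c(2, 5, 10, 50)                        # hypothetical iteration caps
methods    <- c("Hartigan-Wong", "Lloyd", "MacQueen") # the default plus two alternatives

# One row per (iteration cap, method) combination.
results <- expand.grid(iter.max = iterations, method = methods,
                       stringsAsFactors = FALSE)
results$tot.withinss <- NA_real_

for (i in seq_len(nrow(results))) {
  # Small iteration caps can trigger "did not converge" warnings; silence them here.
  fit <- suppressWarnings(
    kmeans(x, centers = 3, iter.max = results$iter.max[i],
           nstart = 5, algorithm = results$method[i])
  )
  results$tot.withinss[i] <- fit$tot.withinss
}

print(results)
```

Comparing the tot.withinss column across rows gives a rough sense of whether more iterations or a different method changes the fit for a given data set.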

When, how, and why?

  • From package to package, we arrived at a different number of proposed clusters. How would you decide the number of clusters to use with your data?

  • Several of the methods appeared to be overwhelmed by the contributions of the various data points in the wine data (as can be seen by many of the subgraphs that are nearly completely filled in). Is there a way to make the clustering more discriminatory?

Challenges

  • Many of the clustering methods are memory-intensive. It was necessary to store the data on disk in R format and reload it in order to free up memory. R does have memory management functions available that might have made that process easier. Investigate whether you can work from the raw CSV file instead.

  • With such an array of values available for...