Andrews curve
Let's start with something easy. We will see how to use Andrews curves to visualize our Iris data. The Andrews curve is a very simple visualization method, yet it is sometimes an effective way to spot clusters in multi-dimensional data.
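The way it works is straightforward: each row is plotted as a separate curve. Suppose we have a row $x = (x_1, x_2, \ldots, x_d)$, where $x_i$ is the value of the $i$th attribute for that row. Then, the curve corresponding to this row is given here:
$f_x(t) = \frac{x_1}{\sqrt{2}} + x_2 \sin(t) + x_3 \cos(t) + x_4 \sin(2t) + x_5 \cos(2t) + \cdots$, where $-\pi < t < \pi$.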
Each row therefore defines a finite Fourier series, which is then plotted as a curve. There are as many curves as there are rows in our file, so in our case there will be 150 curves.
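To make the formula concrete, here is a minimal sketch that evaluates the curve for a single row without any charting library. It assumes the standard Andrews definition: the first attribute divided by the square root of 2, followed by alternating sine and cosine terms of increasing frequency. The class name `AndrewsCurveSketch`, the methods `andrews` and `sample`, and the sample row are illustrative only and are not taken from the implementation that follows.

```java
public class AndrewsCurveSketch {

    // Evaluates the Andrews curve of one row at angle t (hypothetical helper).
    // Terms: x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    public static double andrews(double[] row, double t) {
        double sum = row[0] / Math.sqrt(2.0);
        for (int i = 1; i < row.length; i++) {
            int k = (i + 1) / 2; // harmonic number: 1, 1, 2, 2, 3, 3, ...
            if (i % 2 == 1) {
                sum += row[i] * Math.sin(k * t); // odd index: sine term
            } else {
                sum += row[i] * Math.cos(k * t); // even index: cosine term
            }
        }
        return sum;
    }

    // Samples the curve at n evenly spaced points from -pi to pi.
    // The endpoints are included for simplicity; n must be at least 2.
    public static double[] sample(double[] row, int n) {
        double[] values = new double[n];
        for (int j = 0; j < n; j++) {
            double t = -Math.PI + 2.0 * Math.PI * j / (n - 1);
            values[j] = andrews(row, t);
        }
        return values;
    }

    public static void main(String[] args) {
        // One row with four numeric attributes, as in the Iris data (illustrative values).
        double[] row = {5.1, 3.5, 1.4, 0.2};
        for (double y : sample(row, 9)) {
            System.out.println(y);
        }
    }
}
```

Each row of the data set would yield one such array of sampled values, and plotting every array as a line gives one curve per row.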
Rows that form clusters in the data produce curves that group together in the plot as well. This makes the plot useful in exploratory data analysis, where it may allow one to identify clusters in the data. It is especially useful because it works well with large numbers of attributes. The following is our implementation of the Andrews curve:
import org.jfree.chart...