Histograms are one of the best ways to visualize a dataset and to estimate its three main summary statistics: the mean, median, and mode. They also give analysts a clear picture of how the data is distributed. Histograms can plot categorical data as well as numerical data, which sets them apart from many other plot types.
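To make the link between a histogram and these statistics concrete, here is a minimal sketch in plain Julia. It uses only the Statistics standard library. The sample values, the bin width of 2, and the helper names bin_counts and most_frequent are made up for illustration and are not part of Gadfly or of this recipe.

```julia
using Statistics  # standard library: provides mean and median

# Hypothetical sample values, chosen only for illustration.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]

# Count how many values fall into each bin of width `binwidth`.
# Each entry maps the left edge of a bin to the height of its bar.
function bin_counts(data, binwidth)
    counts = Dict{Int,Int}()
    for v in data
        edge = binwidth * fld(v, binwidth)  # left edge of the bin holding v
        counts[edge] = get(counts, edge, 0) + 1
    end
    return counts
end

# Return the most frequent value (the mode).
# If several values tie, any one of them may be returned.
function most_frequent(data)
    freq = Dict{eltype(data),Int}()
    for v in data
        freq[v] = get(freq, v, 0) + 1
    end
    best, best_count = first(data), 0
    for (v, c) in freq
        if c > best_count
            best, best_count = v, c
        end
    end
    return best
end

binwidth = 2
counts = bin_counts(data, binwidth)

println("mean   = ", mean(data))
println("median = ", median(data))
println("mode   = ", most_frequent(data))

# Print a rough text histogram, one row per bin.
for edge in sort(collect(keys(counts)))
    println("[", edge, ", ", edge + binwidth, "): ", "*"^counts[edge])
end
```

The tallest row of the text histogram is the bin that contains the mode, and the mean and median both fall in or next to that bin, which is the kind of reading a plotted histogram supports at a glance.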
We will use the Gadfly library, which we used for understanding and plotting data in the preceding recipes. To install the library, follow the installation steps given in those recipes.
A basic histogram is a simple series of bars that shows the distribution of a particular feature in a dataset. It can be plotted using the plot() function, with Geom.histogram passed as the geometry. We will use the diamonds dataset for this purpose. This can be done as follows:
using Gadfly, RDatasets
plot(dataset("ggplot2", "diamonds"), x = "Price", Geom.histogram)
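The number of bars has a strong effect on how the distribution reads, so it is often worth setting it explicitly. The following is a sketch rather than part of the original recipe: it assumes the Gadfly and RDatasets packages are installed, and the bin count of 30, the axis labels, and the output file name price_histogram.svg are arbitrary choices made for illustration.

```julia
using Gadfly, RDatasets

# Load the diamonds dataset used in this recipe.
diamonds = dataset("ggplot2", "diamonds")

# bincount fixes the number of bars; 30 is an arbitrary illustrative value.
p = plot(diamonds, x = "Price",
         Geom.histogram(bincount = 30),
         Guide.xlabel("Price"),
         Guide.ylabel("Count"))

# Save the plot as an SVG file (the file name is a placeholder).
draw(SVG("price_histogram.svg", 6inch, 4inch), p)
```

Fewer bins give a smoother but coarser picture, while more bins expose finer structure at the cost of noisier bars, so it can help to try a few values before settling on one.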
As with earlier plots, color aesthetics can be used to differentiate...