Most collections of data are not as pristine as one would like them to be. Inconsistencies, poorly designed structures, and even mysterious extra whitespace can make it a nightmare for a developer to extract even a small amount of useful information. While not required, it is a best practice to verify that the data is formatted as expected before visualizing it.
Tip
Best practice
Before visualizing data, be sure it is arranged as expected and is free from inconsistencies.
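As a rough illustration of what such a check might look like, the following Python sketch scans CSV text for three common problems: an unexpected header, rows with the wrong number of fields, and values with stray leading or trailing whitespace. The function name find_problems, the column names, and the sample values are all hypothetical and chosen only for this example. It is a minimal sketch under those assumptions, not a replacement for a dedicated cleaning tool.

```python
import csv
import io


def find_problems(csv_text, expected_header):
    """Return a list of (line_number, message) tuples for suspicious rows.

    Hypothetical helper for illustration; assumes one record per line.
    """
    problems = []
    reader = csv.reader(io.StringIO(csv_text))

    header = next(reader, None)
    if header is None:
        return [(1, "input is empty")]
    if [name.strip() for name in header] != expected_header:
        problems.append((1, "unexpected header: %r" % (header,)))

    # Data rows start on line 2, directly after the header.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(expected_header):
            problems.append(
                (line_number,
                 "expected %d fields, found %d" % (len(expected_header), len(row)))
            )
        for value in row:
            if value != value.strip():
                problems.append((line_number, "stray whitespace in %r" % value))

    return problems


# Made-up sample data: line 3 has padded whitespace, line 4 is missing a field.
sample = "name,score\nalpha,10\n beta ,20\ngamma\n"
for line_number, message in find_problems(sample, ["name", "score"]):
    print("line %d: %s" % (line_number, message))
```

A check like this only reports problems. Deciding how to repair them (trimming, merging, or discarding rows) still depends on the dataset.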
Google's answer to the prevailing issue of dirty data is the free software tool Google Refine. At the time of publication, Google Refine was in the process of transitioning from a project hosted on Google Code to a project hosted on GitHub called OpenRefine. In either case, the capabilities and intentions of the tool remain the same. The application is a downloadable installation for Mac, Windows, and Linux operating systems...