Exploring String Values
String values pose particular challenges for data exploration because they can take on almost any value. This is particularly true for free-form strings, such as addresses and names, which may not be cleaned. This section looks at exploring the length and characters in strings.
Histogram of Length
A simple way to get familiar with string values is to do a histogram of the length of the values. What is the length of values in the City column in the Orders table?
SELECT LEN(City) as length, COUNT(*) as numorders, MIN(City), MAX(City) FROM Orders o GROUP BY LEN(City) ORDER BY length
This query provides not only a histogram of the lengths, but also examples of two values—the minimum and maximum values for each length. For the City column, there are lengths from 0 to 20, which is the maximum length the column stores.
Strings Starting or Ending with Spaces
Spaces at the beginning of string values can cause unexpected problems. The value “ NY”...