Tabular data is, by nature, two-dimensional, so only a limited amount of information can be presented in a single cell. As a workaround, you will occasionally see datasets that store more than one value in the same cell. Tidy data allows exactly one value per cell. To rectify these situations, you will typically need to parse the string data into multiple columns with the methods of the str Series accessor.
In this recipe, we examine a dataset that has a column containing several different variables in each cell. We use the str accessor to parse these strings into separate columns and tidy the data.
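Before turning to the recipe's dataset, the general idea can be sketched on a small made-up example. The DataFrame below, its column names (name, geo, city, state, country), and its values are hypothetical and are not taken from the recipe; the sketch assumes the packed values are separated by a comma and a space.

```python
import pandas as pd

# Hypothetical data: each cell of "geo" packs city, state, and country together.
df = pd.DataFrame({
    "name": ["Ana", "Ben"],
    "geo": ["Houston, TX, USA", "Boston, MA, USA"],
})

# str.split with expand=True returns a DataFrame with one column per piece.
parts = df["geo"].str.split(", ", expand=True)
parts.columns = ["city", "state", "country"]

# Replace the packed column with the parsed columns.
tidy = pd.concat([df.drop(columns="geo"), parts], axis=1)
print(tidy)
```

Splitting on a fixed delimiter works only when every cell follows the same pattern. For less regular strings, str.extract with a regular expression containing capture groups is a common alternative.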