Some datasets are easy to read but complicated to process further. Take a look at the matches file we saw in Chapter 3:
Match Date;Home Team;Away Team;Result
02/06;Italy;France;2-1
02/06;Argentina;Hungary;2-1
06/06;Italy;Hungary;3-1
06/06;Argentina;France;2-1
10/06;France;Hungary;3-1
10/06;Italy;Argentina;1-0
...
Imagine you want to answer these questions:
How many teams played?
Which team scored the most goals?
Which team won all matches it played?
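In its current layout, the dataset does not make these questions easy to answer: each row describes a match, not a team, so every team appears sometimes in the Home Team column and sometimes in the Away Team column, and the goals are packed into a single Result field. To answer the questions in a simple way, you first have to normalize the data, that is, convert it to a suitable format before proceeding. Let's work on it.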
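To make the idea of normalizing concrete, here is a minimal Python sketch, offered only as an illustration. It assumes that the Result field holds the home team's goals followed by the away team's goals, and it uses only the six rows visible in the sample (the real file continues). The names `normalize`, `scored`, and `conceded` are hypothetical, chosen for this example.

```python
import csv
import io
from collections import defaultdict

# Only the six rows visible in the sample; the real file has more.
SAMPLE = """Match Date;Home Team;Away Team;Result
02/06;Italy;France;2-1
02/06;Argentina;Hungary;2-1
06/06;Italy;Hungary;3-1
06/06;Argentina;France;2-1
10/06;France;Hungary;3-1
10/06;Italy;Argentina;1-0
"""


def normalize(text):
    """Turn each match row into two rows, one per participating team."""
    rows = []
    for match in csv.DictReader(io.StringIO(text), delimiter=";"):
        # Assumption: Result is "<home goals>-<away goals>".
        home_goals, away_goals = (int(g) for g in match["Result"].split("-"))
        rows.append({"date": match["Match Date"], "team": match["Home Team"],
                     "scored": home_goals, "conceded": away_goals})
        rows.append({"date": match["Match Date"], "team": match["Away Team"],
                     "scored": away_goals, "conceded": home_goals})
    return rows


rows = normalize(SAMPLE)

# How many teams played?
teams = sorted({r["team"] for r in rows})

# Which team scored the most goals?
goals = defaultdict(int)
for r in rows:
    goals[r["team"]] += r["scored"]
top_scorer = max(goals, key=goals.get)

# Which team won all the matches it played?
won_all = [t for t in teams
           if all(r["scored"] > r["conceded"] for r in rows if r["team"] == t)]

print(len(teams), top_scorer, won_all)
```

Once every row describes a single team in a single match, each of the three questions becomes a simple grouping or filtering step over one column, with no need to look at two team columns at once.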