5.1 THE STORY SO FAR
To recap, we are working our way through the Data Science Methodology.
- In Chapter 3, we discussed the importance of the Problem Understanding Phase.
- Also in Chapter 3, we dealt with several issues regarding the Data Preparation Phase.
- In Chapter 4, we covered some important topics in the Exploratory Data Analysis Phase.
- Now, here in Chapter 5, we are ready to tackle the Setup Phase.
The Setup Phase consists of several tasks that must be completed before we can begin modeling the data. These include:
- Partitioning the data
- Validating the data partition
- Balancing the data
- Establishing baseline model performance
We cover each of these topics in turn in this chapter.
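As a preview, the four tasks can be sketched in a few lines of Python. This is only a minimal illustration under assumed conditions: the toy data set, the 75/25 split, the 30% balancing target, and every variable name are hypothetical, and the simple comparison of proportions stands in for a more formal validation of the partition.

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical toy data: each record is (feature, label); "yes" is the rare class.
records = [(random.gauss(0, 1), "yes" if random.random() < 0.2 else "no")
           for _ in range(1000)]

# 1. Partition the data: shuffle, then hold out 25% as a test set.
shuffled = records[:]
random.shuffle(shuffled)
cut = int(0.75 * len(shuffled))
train, test = shuffled[:cut], shuffled[cut:]

# 2. Validate the partition: the rare-class share should be similar in both sets.
def share(rows, label="yes"):
    """Proportion of rows carrying the given label."""
    return sum(1 for _, y in rows if y == label) / len(rows)

partition_gap = abs(share(train) - share(test))

# 3. Balance the training data only: resample rare records (with replacement)
#    until the rare class makes up roughly 30% of the training set.
target = 0.30
rare = [r for r in train if r[1] == "yes"]
n_rare, n_total = len(rare), len(train)
# Solve (n_rare + x) / (n_total + x) = target for x, the number of resampled records.
extra = max(0, round((target * n_total - n_rare) / (1 - target)))
balanced_train = train + random.choices(rare, k=extra)

# 4. Establish a baseline: always predict the most common label in the
#    original (unbalanced) training set, and score that rule on the test set.
majority_label = Counter(y for _, y in train).most_common(1)[0][0]
baseline_accuracy = sum(1 for _, y in test if y == majority_label) / len(test)

print(f"Rare-class share, train vs. test gap: {partition_gap:.3f}")
print(f"Rare-class share after balancing:     {share(balanced_train):.3f}")
print(f"Baseline accuracy on the test set:    {baseline_accuracy:.3f}")
```

Note that only the training set is rebalanced in this sketch; the test set is left untouched so that it still reflects the original class proportions. Any model we build later should beat the baseline accuracy to be worth keeping.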