4.1 EDA VERSUS HT
Clients or analysts often have a priori hypotheses that they would like to test against the data. An example of the kind of question behind such a hypothesis is: Do cellphone users have a higher rate of positive responses than landline users? The resulting hypothesis test (HT) could be carried out using either classical statistical methods or the cross‐validation methods of data science (Chapter 5).
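As a rough illustration of the classical route, the cellphone versus landline question can be framed as a one-sided two-proportion z-test. The sketch below is only one reasonable choice of test, not necessarily the method used later in the book. The function name `two_proportion_z_test` and the counts passed to it are hypothetical and chosen purely for illustration; they are not taken from any data set discussed here.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """One-sided z-test of H0: p_a <= p_b against Ha: p_a > p_b.

    Returns the z statistic and the upper-tail p-value, using the
    pooled proportion to estimate the standard error under H0.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts (NOT real data): positive responses and group
# sizes for cellphone users (group a) and landline users (group b).
z, p_value = two_proportion_z_test(150, 1000, 100, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A small p-value would count as evidence that the cellphone group responds positively at a higher rate. The normal approximation used here assumes reasonably large counts in both groups.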
On the other hand, the client or the analyst may not have any salient a priori notions about what the data might uncover. In such cases, they may prefer to use exploratory data analysis (EDA), also known as graphical data analysis. EDA allows the user to:
- Use graphics to explore the relationship between the predictor variables and the target variable.
- Use graphics and tables to derive new variables that will increase predictive value.
- Use binning productively, to increase predictive value.
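To make the binning idea concrete, the following minimal sketch groups a numeric predictor into bins and reports the proportion of positive responses in each bin. It uses only the Python standard library so that it stays self-contained; with pandas, `pd.cut` followed by `pd.crosstab` would be a more usual route. The helper `response_rate_by_bin`, the bin edges, and the toy `ages` and `responses` values are all hypothetical and invented for illustration; they are not drawn from the bank_marketing_training data set.

```python
from bisect import bisect_right
from collections import defaultdict


def response_rate_by_bin(values, responses, edges, labels):
    """Return the proportion of positive responses within each bin.

    `edges` are ascending cut points; a value v falls in bin i when
    edges[i-1] <= v < edges[i], so len(labels) must be len(edges) + 1.
    `responses` are 0/1 flags aligned with `values`.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [positives, total]
    for value, response in zip(values, responses):
        label = labels[bisect_right(edges, value)]
        counts[label][0] += response
        counts[label][1] += 1
    # Keep the original bin order and skip bins that received no records.
    return {
        label: counts[label][0] / counts[label][1]
        for label in labels
        if label in counts
    }


# Toy values (NOT real data): a numeric predictor and a 0/1 target.
ages = [22, 25, 35, 41, 47, 58, 63, 70]
responses = [1, 0, 0, 0, 1, 0, 1, 1]

rates = response_rate_by_bin(
    ages, responses,
    edges=[30, 60],
    labels=["under 30", "30 to 59", "60 and over"],
)
for label, rate in rates.items():
    print(f"{label}: {rate:.2f}")
```

If the response rate differs noticeably across bins, the binned variable may carry predictive value that the raw numeric variable hides. The choice of cut points matters and would normally be guided by plots of the actual data.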
In this chapter, we will continue to explore the bank_marketing_training data set from Chapter 3. We begin by using...