Correlation Matrix and Visualization
Correlation, as you know, is a measure of how two variables fluctuate together. A correlation value close to 1 indicates a strong positive linear relationship, a value close to -1 indicates a strong negative one, and a value near 0 indicates little or no linear relationship. Highly correlated variables can undermine the reliability of models, so in many circumstances we either eliminate one of them or combine them to form composite or interaction variables.
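As a minimal sketch of this idea, the snippet below builds a small synthetic DataFrame (not the bank dataset) and lists the column pairs whose absolute Pearson correlation exceeds a cutoff. The column names, the helper `highly_correlated_pairs`, and the 0.9 threshold are assumptions made purely for illustration; a suitable cutoff depends on the model and the data.

```python
import numpy as np
import pandas as pd

# Synthetic, hypothetical data used only to illustrate the idea
rng = np.random.default_rng(seed=0)
x = rng.normal(size=500)
df = pd.DataFrame({
    "x": x,
    "x_scaled": 2.0 * x + rng.normal(scale=0.1, size=500),  # near-linear copy of x
    "x_negated": -x + rng.normal(scale=0.1, size=500),      # near-linear mirror of x
    "noise": rng.normal(size=500),                          # unrelated to x
})

# DataFrame.corr() computes pairwise Pearson correlation by default
corr = df.corr()


def highly_correlated_pairs(corr_matrix, threshold=0.9):
    """Return (col_a, col_b, r) for each distinct pair with |r| >= threshold."""
    cols = corr_matrix.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, skips the diagonal
            r = corr_matrix.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


pairs = highly_correlated_pairs(corr)
print(corr.round(2))
print(pairs)
```

Pairs flagged this way are candidates for removal or for merging into a single composite feature. Note that the check uses the absolute value, so strongly negative correlations are flagged as well.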
Let's look at how a correlation matrix can be generated and then visualized in the following exercise.
Exercise 3.05: Finding the Correlation in Data to Generate a Correlation Plot Using Bank Data
In this exercise, we will create a correlation plot for the bank dataset and analyze the results.
The following steps will help you to complete the exercise:
- Open a new Colab notebook, import the pandas package, and load the banking data:
  import pandas as pd
  file_url = 'https://raw.githubusercontent.com/PacktWorkshops...