Practical Exercises for Chapter 8
Exercise 1: Understanding the Importance of EDA
Load a dataset of your choice. Run initial explorations such as .head(), .info(), and .describe() to understand its structure, column types, and summary statistics.
# Example Solution:
import pandas as pd
df = pd.read_csv('your_dataset.csv')
print(df.head())
df.info()  # info() prints its summary directly and returns None, so no print() is needed
print(df.describe())
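If you do not have a CSV file at hand, the same first look can be tried on a small in-memory table. The following is only a sketch: the column names and values are made up for illustration and are not part of any real dataset. It also checks the shape, missing values, and column types, which are common companions to the three calls above.

```python
import pandas as pd

# Hypothetical toy data standing in for 'your_dataset.csv'
df = pd.DataFrame({
    "Age": [34, 45, None, 29],
    "Country": ["US", "DE", "US", None],
    "Income": [52000.0, 61000.0, 48000.0, 55000.0],
})

# Number of rows and columns
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# Count of missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Data type pandas inferred for each column
print(df.dtypes)
```

Swapping the toy DataFrame for pd.read_csv('your_dataset.csv') gives the same checks on your own data.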
Exercise 2: Identifying Types of Data
Identify at least two columns in your dataset that contain categorical data and two that contain numerical data.
# Example Solution:
# Categorical: 'Gender', 'Country'
# Numerical: 'Age', 'Income'
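Rather than reading the column types off by eye, you can let pandas split the columns for you. The sketch below assumes a toy DataFrame with the hypothetical columns named in the example solution, and it treats every non-numeric column as categorical. That is a simplification: a numeric code such as a postal code is really categorical, and date or boolean columns would also land in the non-numeric group.

```python
import pandas as pd

# Hypothetical toy data using the column names from the example solution
df = pd.DataFrame({
    "Gender": ["F", "M", "F"],
    "Country": ["US", "DE", "US"],
    "Age": [34, 45, 29],
    "Income": [52000.0, 61000.0, 48000.0],
})

# Columns with an integer or float dtype
numerical_cols = df.select_dtypes(include="number").columns.tolist()

# Everything else is treated as categorical here (a simplifying assumption)
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("Numerical:", numerical_cols)
print("Categorical:", categorical_cols)
```

Always check the result against what the columns actually mean, since the dtype only reflects how the values are stored.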
Exercise 3: Calculating Descriptive Statistics
Calculate the mean, median, and standard deviation of a numerical column in your dataset.
# Example Solution:
mean_age = df['Age'].mean()
median_age = df['Age'].median()
std_age = df['Age'].std()  # sample standard deviation (pandas uses ddof=1 by default)
print(f"Mean Age: {mean_age}")
print(f"Median Age: {median_age...