Mastering Exploratory Analysis with pandas

By: Harish Garg

Overview of this book

pandas is a Python library that lets you manipulate, transform, and analyze data. It is a popular framework for exploratory data visualization and for analyzing datasets and data pipelines based on their properties.

This book will be your practical guide to exploring datasets using pandas. You will start by setting up Python, pandas, and Jupyter Notebooks, and learn how to use Jupyter Notebooks to run Python code. We then show you how to get data into pandas and do some exploratory analysis, before you learn how to manipulate and reshape data using pandas methods. You will also learn how to deal with missing data in your datasets, how to draw charts and plots using pandas and Matplotlib, and how to create effective visualizations for your audience. Finally, you will wrap up your newly gained pandas knowledge by learning how to export data out of pandas into some popular file formats.

By the end of this book, you will have a better understanding of exploratory analysis and how to build exploratory data pipelines with Python.
Removing columns from a pandas DataFrame

In this section, we'll look at how to remove columns or rows from a dataset in pandas. We'll examine the drop() method and what each of its parameters does.

First, we import the pandas module into our Jupyter notebook:

import pandas as pd

After this, we read our CSV dataset using the following code:

data = pd.read_csv('data-titanic.csv', index_col=3)
data.head()

The dataset should look something like the following:

To remove a single column from our dataset, we use the pandas drop() method. Here we pass it two main arguments. The first is the name of the column to remove. The second is axis, which tells drop() whether to drop a row (axis=0) or a column (axis=1). We also set inplace to True, which tells the method to drop it from...
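As a minimal sketch of the drop() call described above, the following uses a small hand-built DataFrame instead of data-titanic.csv so that it runs on its own. The column names, index labels, and values are assumptions made for illustration and are not taken from the book's dataset.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data; column names and values are
# assumptions chosen only to make the example self-contained.
data = pd.DataFrame(
    {
        "Survived": [0, 1, 1],
        "Pclass": [3, 1, 3],
        "Ticket": ["A/5 21171", "PC 17599", "STON/O2. 3101282"],
    },
    index=["Braund", "Cumings", "Heikkinen"],
)

# axis=1 targets columns. inplace=True modifies `data` itself,
# so the call returns None rather than a new DataFrame.
data.drop("Ticket", axis=1, inplace=True)
columns_after_drop = list(data.columns)

# axis=0 targets rows by index label. Without inplace=True, drop()
# returns a new DataFrame and leaves `data` unchanged.
without_row = data.drop("Braund", axis=0)
rows_after_drop = list(without_row.index)
```

With inplace=True there is nothing to assign, because the original object is changed directly. Leaving inplace at its default and assigning the result to a new name keeps the original DataFrame available for comparison.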