Mastering Python for Data Science

By: Samir Madhavan


Studying the Titanic


To perform the data analysis, we'll be using the Titanic dataset from Kaggle.

This dataset is simple to understand and does not require any domain knowledge to derive insights.

It contains the details of each passenger on the Titanic, including whether or not they survived.

The following are the field descriptions:

Field      Description
survival   Survival (0 = No, 1 = Yes)
pclass     Passenger class (1 = 1st, 2 = 2nd, 3 = 3rd)
name       Name of the passenger
sex        Gender of the passenger
age        Age of the passenger
sibsp      Number of siblings/spouses aboard
parch      Number of parents/children aboard
ticket     Ticket number
fare       Passenger fare
cabin      Cabin
embarked   Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
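As a rough illustration of how the coded fields above can be read, here is a minimal sketch that uses only the Python standard library. It assumes the column names match the field names listed above; the header of the file you download from Kaggle may be spelled or capitalized differently, so adjust the keys accordingly. The two rows in SAMPLE_CSV are made-up placeholders, not real passenger records, and decode_row is a hypothetical helper introduced only for this example.

```python
import csv
import io

# Code-to-label mappings taken from the field descriptions above.
SURVIVAL = {"0": "No", "1": "Yes"}
PCLASS = {"1": "1st", "2": "2nd", "3": "3rd"}
EMBARKED = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def decode_row(row):
    """Return a copy of one passenger record with the coded fields spelled out."""
    decoded = dict(row)
    # Any code outside the documented values (including blanks) becomes "Unknown".
    decoded["survival"] = SURVIVAL.get(row["survival"], "Unknown")
    decoded["pclass"] = PCLASS.get(row["pclass"], "Unknown")
    decoded["embarked"] = EMBARKED.get(row["embarked"], "Unknown")
    return decoded


# Made-up placeholder rows (NOT real passengers), used only to show the layout.
# Column names are assumed to match the field list above.
SAMPLE_CSV = """survival,pclass,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked
0,3,Passenger A,male,30,0,0,T-0001,8.05,,S
1,1,Passenger B,female,40,1,0,T-0002,80.00,C10,C
"""

# csv.DictReader yields one dict per row, keyed by the header line.
passengers = [decode_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

for p in passengers:
    print(p["name"], p["survival"], p["pclass"], p["embarked"])
```

To run this against the real data, replace io.StringIO(SAMPLE_CSV) with an open file handle for the downloaded CSV. Note that csv.DictReader returns every value as a string, which is why the mappings are keyed by "0" and "1" rather than by integers.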

Since the data is quite simple to understand, we'll make survival the main theme of the analysis and attach specific questions to that theme.

These are the questions that we...