Become a Python Data Analyst

By: Alvaro Fuentes

Overview of this book

Python is one of the most popular languages among data analysts and statisticians for working with massive datasets and complex data visualizations. Become a Python Data Analyst introduces the essential Python tools and libraries for the data analysis process, from preparing data to performing simple statistical analyses and creating meaningful data visualizations. The book covers Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and applies them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to use the Jupyter Notebook efficiently and to manipulate data with NumPy and the pandas library. In the concluding chapters, you will gain experience building simple predictive models and carrying out statistical computation and analysis using proven data analysis techniques. By the end of this book, you will have hands-on experience performing data analysis with Python.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

The Anaconda Distribution and Jupyter Notebook

The Anaconda distribution

Jupyter Notebook

Using the Jupyter Notebook

Summary

Vectorizing Operations with NumPy

Introduction to NumPy

NumPy arrays

Using NumPy for simulations

Summary

Pandas - Everyone's Favorite Data Analysis Library

Introduction to the pandas library

Operations and manipulations of pandas

Answering simple questions about a dataset

Answering further questions

Summary

Visualization and Exploratory Data Analysis

Introducing Matplotlib

Introduction to pyplot

Object-oriented interface

Common customizations

EDA with seaborn and pandas

Analyzing variables individually

Relationships between variables

Summary

Statistical Computing with Python

Introduction to SciPy

Hypothesis testing

Summary

Introduction to Predictive Analytics Models

Predictive analytics and machine learning

Understanding the scikit-learn library

Building a regression model using scikit-learn

Regression model to predict house prices

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Using NumPy for simulations

Now let's learn how to use NumPy in a real-world scenario. Here, we will cover two examples of simulations using NumPy, and in the process, we will also learn about other operations that we can do with arrays.

Coin flips

We will simulate a coin flip, or coin toss, using NumPy. For this purpose, we will use the randint function from NumPy's random submodule. This function takes the low, high, and size arguments: low and high set the range of random integers that we want for the output, with low included and high excluded. In this case, we want the output to be either 0 or 1, so the value for low will be 0 and the value for high will be 2. Here, the size argument will define the number of random...
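As a minimal sketch of the randint call described above, the snippet below draws ten coin flips. The seed, the sample size of 10, the variable names, and the convention that 1 means heads are assumptions made for illustration and do not come from the book.

```python
import numpy as np

# Arbitrary seed so the sketch is reproducible (an assumption, not from the book).
np.random.seed(0)

# low is inclusive and high is exclusive, so every draw is either 0 or 1.
# size=10 is an illustrative choice for the number of flips.
flips = np.random.randint(low=0, high=2, size=10)

# Treating 1 as heads and 0 as tails is our own convention here.
n_heads = int(flips.sum())
n_tails = flips.size - n_heads

print("flips:", flips)
print("heads:", n_heads, "tails:", n_tails)
```

Because the flips are stored as 0s and 1s, summing the array counts the heads directly, which is one reason this encoding is convenient for simulations.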