Become a Python Data Analyst

By: Alvaro Fuentes

Overview of this book

Python is one of the most popular languages among data analysts and statisticians for working with large datasets and complex data visualizations. Become a Python Data Analyst introduces the essential Python tools and libraries for the data analysis process, from preparing data to performing simple statistical analyses and creating meaningful data visualizations. In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to use the Jupyter Notebook efficiently to manipulate data with NumPy and pandas. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques. By the end of this book, you will have hands-on experience performing data analysis with Python.
Table of Contents (8 chapters)

What this book covers

Chapter 1, The Anaconda Distribution and Jupyter Notebook, covers the most important libraries for data science with Python. It gives an overview of the main objects, attributes, methods, and functions that we will use for doing predictive analytics with Python.

Chapter 2, Vectorizing Operations with NumPy, explores NumPy, the library on which almost all other scientific computing projects in Python are based. Learning how to handle NumPy arrays is crucial for doing anything related to data science in Python.

Chapter 3, Pandas - Everyone's Favorite Data Analysis Library, gives an overview of pandas, a library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. We data scientists love it, and it is one of the key reasons behind Python’s popularity in the data science community. In this chapter, we show by example how to perform descriptive analysis with pandas.

Chapter 4, Visualization and Exploratory Data Analysis, explains why visualization is a key topic for data science. Python provides many options for creating visualizations for different purposes. In this chapter, we learn about two of the most popular libraries, matplotlib and seaborn, and perform exploratory data analysis on real-world datasets.

Chapter 5, Statistical Computing with Python, explains how to perform common statistical computations with Python and use them to make sense of a dataset that contains information about the alcohol consumption of teenagers.

Chapter 6, Introduction to Predictive Analytics Models, gives a brief introduction to predictive analytics and builds a model to predict the drinking habits of teenagers.