Become a Python Data Analyst

By: Alvaro Fuentes

Overview of this book

Python is one of the most popular languages among data analysts and statisticians for working with large datasets and complex data visualizations. Become a Python Data Analyst introduces Python’s most essential tools and libraries for the data analysis process, from preparing data to performing simple statistical analyses and creating meaningful data visualizations. In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to use the Jupyter Notebook efficiently and to manipulate data with NumPy and pandas. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques. By the end of this book, you will have hands-on experience performing data analysis with Python.
Table of Contents (8 chapters)

Introduction to NumPy

NumPy, also known as Python's vectorization solution, is the fundamental package for scientific computing with Python. It gives us the ability to create multidimensional array objects and to perform mathematical operations on them faster than we can with base Python. It is the basis of most of Python's data science ecosystem: most of the other libraries that we use in data analytics with Python, such as scikit-learn and pandas, rely on NumPy. Some advanced features of NumPy are as follows:

  • It provides sophisticated (broadcasting) functions
  • It provides tools for integrating with lower-level languages such as C, C++, and Fortran
  • It supports linear algebra and other mathematical operations such as Fourier transforms (FT) and random number generation (RNG)
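
The features listed above can be illustrated with a short sketch. The array values and variable names here are made up for illustration and do not come from the book; the function calls are standard NumPy.

```python
import numpy as np

# Vectorized arithmetic: one expression operates on every element,
# with no explicit Python loop.
values = np.array([1.0, 2.0, 3.0, 4.0])
squares = values ** 2                  # [1., 4., 9., 16.]

# Broadcasting: the 1-D row is applied to each row of the 2-D array.
grid = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])
shifted = grid + row                   # [[10, 21, 32], [13, 24, 35]]

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)              # approximately [0.8, 1.4]

# Fourier transform of a short, hypothetical signal.
signal = np.array([1.0, 0.0, -1.0, 0.0])
spectrum = np.fft.fft(signal)          # approximately [0, 2, 0, 2]

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=42)
samples = rng.normal(size=5)           # five draws from a standard normal

print(squares, shifted, x, spectrum, samples, sep="\n")
```

Each operation works on whole arrays at once, which is where NumPy's speed advantage over element-by-element loops in base Python comes from.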

So, if you need to do some really high-performance data analysis at scale and you...