Hands-On Data Analysis with NumPy and Pandas

By: Curtis Miller

Overview of this book

Python, a multi-paradigm programming language, has become the language of choice for data scientists for visualization, data analysis, and machine learning. Hands-On Data Analysis with NumPy and Pandas starts by guiding you through setting up the right environment for data analysis with Python, including installing a suitable Python distribution. You will also work with the Jupyter Notebook and set up a database. Once you have covered Jupyter, you will dig deep into Python’s NumPy package, a powerful extension with advanced mathematical functions. You will then move on to creating NumPy arrays and employing different array methods and functions. Next, you will explore Python’s pandas extension, which will help you get to grips with data mining and learn to subset your data. Last but not least, you will learn how to manage your datasets by sorting and ranking them. By the end of this book, you will have learned to index and group your data for sophisticated data analysis and manipulation.
Chapter 2. Diving into NumPy

By now you should have installed everything you need to use Python for data analysis. Let's now begin discussing NumPy, an important package for managing data and performing calculations. Much of Python's data analysis ecosystem, including pandas, is built on NumPy, so understanding it is critical. Our key objective in this chapter is to learn to use the tools NumPy provides.

In this chapter, the following topics will be covered:

  • NumPy data types
  • Creating arrays
  • Slicing arrays
  • Mathematics
  • Methods and functions

We begin by discussing data types, which are conceptually important when handling NumPy arrays. NumPy data types are controlled by dtype objects, which determine how NumPy stores and manages data. We'll also briefly introduce NumPy's array type, ndarray, and discuss what it does.
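
As a quick preview of these two ideas, the following minimal sketch creates a few arrays and inspects their type and dtype. The variable names and sample values are made up for illustration and are not taken from the book's own examples.

```python
import numpy as np

# Build small arrays; NumPy infers a dtype from the values supplied.
ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.5, 3.0])

print(type(ints))    # the array class is numpy.ndarray
print(ints.dtype)    # an integer dtype; the exact width depends on the platform
print(floats.dtype)  # float64

# A dtype can also be requested explicitly when the array is created.
small = np.array([1, 2, 3], dtype=np.int8)
print(small.dtype, small.itemsize)  # int8, stored in 1 byte per element

# Every element of an array shares one dtype, so mixed input is coerced.
mixed = np.array([1, 2.5])
print(mixed.dtype)   # float64: the integer 1 is stored as 1.0
```

Because the dtype applies to the whole array, choosing it deliberately (for example, int8 instead of the default integer type) affects both memory use and the range of values the array can hold.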