Hands-On Data Analysis with NumPy and Pandas

By: Curtis Miller

Overview of this book

Python, a multi-paradigm programming language, has become the language of choice for data scientists for visualization, data analysis, and machine learning. Hands-On Data Analysis with NumPy and Pandas starts by guiding you through setting up the right environment for data analysis with Python and helping you install the correct Python distribution. You will also work with the Jupyter notebook and set up a database. Once you have covered Jupyter, you will dig deep into Python’s NumPy package, a powerful extension with advanced mathematical functions. You will then move on to creating NumPy arrays and employing different array methods and functions. You will explore Python’s pandas extension, which will help you get to grips with data mining and learn to subset your data. Last but not least, you will grasp how to manage your datasets by sorting and ranking them. By the end of this book, you will have learned to index and group your data for sophisticated data analysis and manipulation.
Table of Contents (12 chapters)

Exploring series and DataFrame objects


In this section, we'll get familiar with pandas series and DataFrame objects by looking at how they are created. We'll start with series, since they are the building blocks of DataFrames. A series is a one-dimensional, array-like object containing data of a single type. From this fact alone, you'd rightly conclude that series are very similar to one-dimensional NumPy arrays, but series have additional methods that make them better suited to managing data. A series can be created with an index, which is metadata identifying the contents of the series. Series can also handle missing data, which they represent with NumPy's NaN.
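
As a minimal sketch of the properties just described, the following builds a small series with a string index and one missing value. The values and index labels here are made up for illustration and do not come from the book.

```python
import numpy as np
import pandas as pd

# A series holds one-dimensional data of a single type, with an optional index.
# The labels "a", "b", "c" and the values are illustrative assumptions.
s = pd.Series([1.5, np.nan, 3.0], index=["a", "b", "c"])

print(s.dtype)    # all elements share one dtype (float64 here)
print(s.isna())   # the NaN entry is flagged as missing
print(s["c"])     # elements can be looked up by index label
```

Because the missing entry is stored as NaN, which is a floating-point value, the series keeps a floating-point dtype.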

Creating series

We can create series from array-like objects; these include lists, tuples, and NumPy ndarray objects. We can also create a series from a Python dict. Another way to add an index to a series is to create one by passing either an index or...
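
The sources mentioned above (lists, tuples, NumPy arrays, and dicts) can be sketched as follows. This is an illustrative example rather than the book's own listing, and the variable names and data are assumptions.

```python
import numpy as np
import pandas as pd

# From array-like objects: pandas assigns a default integer index 0, 1, 2, ...
from_list = pd.Series([10, 20, 30])
from_tuple = pd.Series((10, 20, 30))
from_array = pd.Series(np.array([10, 20, 30]))

# From a dict: the keys become the index and the values become the data.
from_dict = pd.Series({"x": 10, "y": 20, "z": 30})

print(from_list)
print(from_dict)
```

With a dict, no separate index argument is needed, since the keys already label each value.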