Python: Real-World Data Science

By: Fabrizio Romano, Dusty Phillips, Phuong Vo.T.H, Martin Czygan, Robert Layton, Sebastian Raschka

Overview of this book

The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, and each module is a mini course in its own right; as you complete each one, you'll have gained key skills and be ready for the material in the next module. The course begins with getting your Python fundamentals nailed down. Once you are familiar with core Python concepts, it's time to dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical and example-driven way. The third module will teach you how to design and develop data mining applications using a variety of datasets, moving from basic classification and affinity analysis to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncovering data gold mines. In the final module, we'll discuss the necessary details regarding machine learning concepts, offering intuitive yet informative explanations of how machine learning algorithms work, how to use them, and, most importantly, how to avoid the common pitfalls.
Table of Contents (12 chapters)
Python: Real-World Data Science
Meet Your Course Guide
What's so cool about Data Science?
Course Structure
Course Journey
The Course Roadmap and Timeline

Chapter 3. Data Analysis with pandas

In this chapter, we will explore another data analysis library called pandas. The goal of this chapter is to give you some basic knowledge and concrete examples for getting started with pandas.

An overview of the pandas package

pandas is a Python package that provides fast, flexible, and expressive data structures, as well as computing functions for data analysis. The following are some of its prominent features:

  • Data structures with labeled axes. Labels keep code clean and clear and help avoid common errors caused by misaligned data.
  • Flexible handling of missing data.
  • Intelligent label-based slicing, fancy indexing, and subset creation for large datasets.
  • Powerful arithmetic operations and statistical computations along a chosen axis, selected by axis label.
  • Robust input and output support for loading and saving data from and to files, databases, or the HDF5 format.
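
The features listed above can be illustrated with a minimal sketch. The data, labels, and variable names here (sales, apples, pears, and so on) are hypothetical and chosen only for illustration, and the CSV round trip uses an in-memory buffer rather than a real file:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data: the labels and values are made up for illustration.
sales = pd.DataFrame(
    {"apples": [10.0, 12.0, np.nan], "pears": [4.0, 5.0, 6.0]},
    index=["mon", "tue", "wed"],
)

# Labeled axes: arithmetic aligns on index labels, not on position.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "x"])
aligned_sum = a + b  # x: 1 + 30, y: 2 + 20, z: 3 + 10

# Missing data: count the gaps, then fill them with a chosen default.
missing_per_column = sales.isna().sum()
filled = sales.fillna(0.0)

# Label-based slicing: .loc includes both endpoints of a label slice.
first_two_days = sales.loc["mon":"tue", ["apples"]]

# Statistics along a chosen axis; NaN values are skipped by default.
column_means = sales.mean(axis=0)  # one mean per column
row_totals = sales.sum(axis=1)     # one total per row

# Input/output: round-trip through CSV using an in-memory buffer.
buffer = io.StringIO()
sales.to_csv(buffer)
buffer.seek(0)
reloaded = pd.read_csv(buffer, index_col=0)
```

Because the two Series are aligned by label, the element labeled x in one is added to the element labeled x in the other even though they sit at different positions. This is the kind of misalignment error that labeled axes help prevent.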

For installation, we recommend the easy route, which is to install pandas as a part...