Python for Data Science For Dummies - Second Edition

By: John Paul Mueller, Luca Massaron

Overview of this book

Python is a general-purpose programming language created in the late 1980s and named after Monty Python. Thousands of people use it for tasks that range from testing microchips at Intel to powering Instagram to building video games with the pygame library. The book begins by discussing how Python can make data science easy. You’ll learn how to work with the Anaconda tool suite, which simplifies coding in Python, and how to write code using Google Colab. As you progress, you’ll discover how to perform interesting calculations and data manipulations using Python libraries such as pandas and NumPy, and how to create data visualizations with Matplotlib. As you move into the advanced concepts, you’ll learn how to wrangle data using techniques such as hierarchical clustering. Finally, you’ll learn how to work with decision trees and use machine learning to make predictions. By the end of the book, you’ll have the skills and knowledge needed to write code in Python and extract information from data.

Chapter 7

Conditioning Your Data

IN THIS CHAPTER

• Working with NumPy and pandas

• Working with symbolic variables

• Considering the effect of dates

• Fixing missing data

• Slicing, combining, and modifying data elements

The characteristics, content, type, and other elements that define your data in its entirety make up the data shape. The shape of your data determines the kinds of tasks you can perform with it. To make your data amenable to certain types of analysis, you must shape it into a different form. Think of the data as clay and yourself as the potter, because that’s the sort of relationship that exists. However, instead of using your hands to shape the data, you rely on functions and algorithms to perform the task. This chapter helps you understand the tools you have available to shape data and the ramifications of shaping it.
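To make the idea of shaping data with functions concrete, here is a minimal sketch that uses NumPy and pandas, the two libraries this chapter works with. The sample values and the column names (`a` through `d`) are hypothetical and chosen only for illustration. The sketch takes a flat sequence of twelve values, reshapes it into three rows of four columns, and then performs a row-wise calculation that only makes sense once the data has that form.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, used only for illustration.
values = np.arange(12)         # one-dimensional array of 12 values
table = values.reshape(3, 4)   # the same values arranged as 3 rows x 4 columns

# Wrap the reshaped array in a DataFrame with made-up column names.
df = pd.DataFrame(table, columns=["a", "b", "c", "d"])

print(values.shape)  # (12,)
print(table.shape)   # (3, 4)
print(df.shape)      # (3, 4)

# A per-row mean becomes possible once the data has rows and columns.
row_means = df.mean(axis=1)
print(row_means.tolist())  # [1.5, 5.5, 9.5]
```

The values themselves never change here. Only their arrangement does, and that arrangement is what decides which operations are available.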

Also in this chapter, you consider the problems associated with shaping. For example, you need to know what to do when data is missing from a dataset...
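
As one illustration of the missing-data problem, the following sketch shows two common pandas options: dropping the missing entries or filling them with a substitute value. The series contents are hypothetical, and filling with the mean is only one possible choice, not necessarily the approach the chapter recommends.

```python
import numpy as np
import pandas as pd

# Hypothetical series with two missing entries (NaN).
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

# Find out how many values are missing.
missing_count = int(s.isnull().sum())
print(missing_count)  # 2

# Option 1: drop the missing entries entirely.
dropped = s.dropna()
print(dropped.tolist())  # [1.0, 3.0, 5.0]

# Option 2: fill the gaps with the mean of the values that are present.
filled = s.fillna(s.mean())
print(filled.tolist())  # [1.0, 3.0, 3.0, 3.0, 5.0]
```

Dropping keeps only observed values but shrinks the dataset, while filling preserves its length at the cost of introducing estimated values.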