Python for Data Science For Dummies - Second Edition

By: John Paul Mueller, Luca Massaron

Overview of this book

Python is a general-purpose programming language created in the late 1980s and named after Monty Python. Thousands of people use it for tasks ranging from testing microchips at Intel to powering Instagram to building video games with the PyGame library. The book begins by discussing how Python can make data science easy. You’ll learn how to work with the Anaconda tool suite, which simplifies coding in Python, and how to write code using Google Colab. As you progress, you’ll discover how to perform interesting calculations and data manipulations using Python libraries such as pandas and NumPy, and how to create data visualizations with Matplotlib. In the more advanced chapters, you’ll learn how to wrangle data by using techniques such as hierarchical clustering. Finally, you’ll learn how to work with decision trees and use machine learning to make predictions. By the end of the book, you’ll have the skills and knowledge needed to write code in Python and extract information from data.

Chapter 8

Shaping Data


- Manipulating HTML data
- Manipulating raw text
- Discovering the bag of words model and other techniques
- Manipulating graph data

Chapter 7 demonstrates techniques for working with data as an entity, as something you work with in Python. However, data doesn’t exist in a vacuum. It doesn’t just suddenly appear within Python for no reason at all. As demonstrated in Chapter 6, you load the data. However, loading may not be enough: you may have to shape the data as part of loading it. That’s the purpose of this chapter. You discover how to work with a variety of complex container types, such as HTML pages, so that you can load the data they hold. In fact, you even work with graphics, images, and sounds.
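To make the idea of shaping data while loading it a little more concrete, here is a minimal sketch that pulls table cells out of an HTML fragment and converts them into typed records. It relies only on Python's standard `html.parser` module, which may not be the library the chapter itself uses. The `TableRowParser` class, the sample HTML, and the column meanings are all hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Keep only text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical HTML fragment standing in for a downloaded page.
html = """
<table>
  <tr><td>apples</td><td>3</td></tr>
  <tr><td>pears</td><td>5</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(html)
parser.close()

# Loading gives strings only; shaping turns them into (name, count) records.
records = [(name, int(count)) for name, count in parser.rows]
print(records)
```

The parser only loads the raw cell text. The final list comprehension is the shaping step, where strings become values with the types an analysis would need.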

Remember: As you progress through the book, you discover that data takes all kinds of forms and shapes. As far as the computer is concerned, data consists of 0s and 1s. Humans...