Jupyter for Data Science

By: Dan Toomey

Overview of this book

Jupyter Notebook is a web-based environment that enables interactive computing in notebook documents. It allows you to create documents that contain live code, equations, and visualizations. This book is a comprehensive guide to getting started with data science using the popular Jupyter Notebook. If you are familiar with Jupyter Notebook and want to learn how to use its capabilities to perform various data science tasks, this is the book for you! From data exploration to visualization, this book takes you through every step of implementing an effective data science pipeline using Jupyter. You will also see how to use Jupyter's features to share your documents and code with your colleagues. The book also explains how Python 3, R, and Julia can be integrated with Jupyter for various data science tasks. By the end of this book, you will be able to use Jupyter comfortably for a range of data science tasks.
Table of Contents (17 chapters)
Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface

Converting JSON to CSV


For this chapter, we will use the Yelp data available from the challenge at https://www.yelp.com/dataset/challenge. This section uses the dataset from round 9 of the challenge. For background, Yelp is a site where people rate products and services, and Yelp publishes those ratings to its users.

The dataset is a very large collection of ratings, a few gigabytes in size. The download contains several sets of rating information: business ratings, reviews, tips (as in, this would be a nice place to visit), and a user set. We are interested in the review data.

When dealing with such large files, it may be useful to find and use a large-file editor so you can poke into the data file. On Windows, most of the standard editors are limited to a few megabytes. I used the Large Text File Viewer program to open these JSON files.
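If no large-file viewer is at hand, a few lines of Python can serve the same purpose from a notebook cell. The following is a minimal sketch, not the approach used in this book: the `peek` helper, the stand-in file, and the field names in the sample record are all hypothetical and chosen only for illustration. It reads a fixed number of characters from the start of a file, so the rest of the file is never loaded into memory.

```python
import os
import tempfile


def peek(path, n_chars=500, encoding="utf-8"):
    """Return the first n_chars characters of a text file.

    Only the requested prefix is read, so this stays cheap
    even when the file is several gigabytes in size.
    """
    with open(path, "r", encoding=encoding) as f:
        return f.read(n_chars)


# Build a small stand-in file so the sketch runs anywhere.
# The record below is made up; it is not real Yelp data.
record = '{"review_id": "r1", "stars": 4, "text": "Nice place"}\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(record * 1000)
    sample_path = tmp.name

size_bytes = os.path.getsize(sample_path)  # file size without opening it
head = peek(sample_path, n_chars=120)      # first 120 characters only

print(size_bytes, "bytes on disk")
print(head)

os.remove(sample_path)  # clean up the stand-in file
```

To try this on the real download, pass the path of one of the extracted JSON files to `peek` instead of `sample_path`. Reading by character count rather than by line is a deliberate choice here, since it makes no assumption about how the records are laid out inside the file.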

All of the files are in JSON format. JSON is a human-readable format with structured elements, for example, a city object containing street objects. While...