Jupyter Cookbook

By: Dan Toomey

Overview of this book

Jupyter has garnered strong interest in the data science community of late, as it makes common data processing and analysis tasks much simpler. This book is for data science professionals who want to master various tasks related to Jupyter to create efficient, easy-to-share scientific applications. The book starts with recipes on installing and running the Jupyter Notebook system on various platforms and configuring the various packages that can be used with it. You will then see how you can use different programming languages and frameworks, such as Python, R, Julia, JavaScript, Scala, and Spark, in your Jupyter Notebook. This book contains intuitive recipes on building interactive widgets to manipulate and visualize data in real time, sharing your code, creating a multi-user environment, and organizing your notebook. You will then get hands-on experience with JupyterLab, microservices, and deploying them on the web. By the end of this book, you will have taken your knowledge of Jupyter to the next level to perform all key tasks associated with it.

Adding the Spark engine


Spark is an Apache project that provides an open source computing framework specially geared toward cluster computing. For our purposes, it provides a programming interface, usable from languages such as Scala and Python, that can be used to access Hadoop data sets.

How to do it...

We install the Spark engine and then run a Spark script in Jupyter to show that it works, as follows.

Installing the Spark engine

Generally, installing Spark involves two steps:

  • Installing Spark (for your environment)
  • Connecting Spark to your environment (whether standalone or clustered)
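Before moving on, it can help to confirm that both steps took effect. The following is a minimal sketch, not part of the original recipe, that reports whether the `SPARK_HOME` environment variable points at a real directory and whether the `pyspark` package can be imported from the current Python. The function name `spark_environment_status` is hypothetical, and a Spark install that has not added its Python directory to the path may report `pyspark` as missing even though Spark itself is present.

```python
import importlib.util
import os


def spark_environment_status():
    """Report basic facts about the local Spark setup (hypothetical helper)."""
    # SPARK_HOME is the conventional variable pointing at the Spark install directory.
    spark_home = os.environ.get("SPARK_HOME")
    return {
        "spark_home": spark_home,
        # True only if SPARK_HOME is set and names an existing directory.
        "spark_home_exists": bool(spark_home) and os.path.isdir(spark_home),
        # True if the pyspark package is visible to this Python interpreter.
        "pyspark_importable": importlib.util.find_spec("pyspark") is not None,
    }


if __name__ == "__main__":
    for key, value in spark_environment_status().items():
        print(f"{key}: {value}")
```

Running this in a notebook cell gives a quick indication of which of the two steps, installing or connecting, still needs attention.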

Spark installation is environment-specific. I've included the steps to install Spark (for use with Jupyter) in a Windows environment here. Other environments require different instructions.

Similarly, Spark relies on a base language to work from. This can be Scala or Python. Python is already available as part of the Jupyter installation, so we will rely on Python as the basis. In other words, we will code a Python Notebook, where Python...
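To illustrate what using Python as the base language for Spark can look like, here is a minimal sketch of a word count written against the PySpark RDD API. It is an illustration under stated assumptions rather than the recipe's own script: the function name `count_words_with_spark` and the sample lines are hypothetical, and the code assumes `pyspark` is importable and a Java runtime is available. If `pyspark` cannot be imported, the function returns `None` instead of failing.

```python
def count_words_with_spark(lines):
    """Count word occurrences in `lines` using Spark (hypothetical helper).

    Returns a dict of word -> count, or None if PySpark is not installed.
    """
    try:
        from pyspark import SparkContext
    except ImportError:
        return None  # PySpark is not available in this Python environment

    # Reuse an existing context if the notebook already created one.
    sc = SparkContext.getOrCreate()

    counts = (
        sc.parallelize(lines)                  # distribute the input lines
        .flatMap(lambda line: line.split())    # split each line into words
        .map(lambda word: (word, 1))           # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)       # sum the counts per word
    )
    # Bring the (word, count) pairs back to the driver as a dict.
    return dict(counts.collectAsMap())


if __name__ == "__main__":
    sample = ["spark in jupyter", "python drives spark"]  # illustrative data
    print(count_words_with_spark(sample))
```

The notebook itself stays ordinary Python; Spark only enters through the `SparkContext` object, which hands the work to the Spark engine.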