SciPy Recipes

By: V Kishore Ayyadevara, Ruben Oliva Ramos

Overview of this book

With the SciPy Stack, you can process, manipulate, and visualize your data using the popular Python language. Using SciPy correctly can be tricky, and this book provides the techniques you need to perform different data science tasks with it. The book includes hands-on recipes for the different components of the SciPy Stack, such as NumPy, SciPy, matplotlib, and pandas. You will use these libraries to solve real-world problems in linear algebra, numerical analysis, data visualization, and much more. The recipes give you a practical understanding not only of how a particular feature in the SciPy Stack works, but also of its application to real-world problems. The recipes are independent of one another, so you can pick up any one of them and learn about a particular feature of SciPy without reading through the others.
Table of Contents (11 chapters)

Running SciPy in Jupyter

If your main goal in using SciPy is to do data exploration and analysis or scientific computations, Jupyter provides an ideal interactive environment. Using Jupyter, we can integrate computations, graphs, formatted text, and even more sophisticated media. Essentially, anything that can be inserted in a web page can be handled by Jupyter.

Getting ready

This recipe assumes that you have a working installation of IPython and Jupyter. If you followed one of the recipes in this chapter to set up Anaconda or a standalone installation of the SciPy stack, you have all you need.
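
If you are unsure whether your environment is complete, a quick check along the following lines may help. This is only a sketch: the list of package names is an assumption based on the imports used in this recipe, and missing_packages is a hypothetical helper, not part of Jupyter or SciPy. In particular, the notebook package may be absent in setups that provide the notebook interface in some other way.

```python
import importlib.util

# Packages this recipe appears to rely on. The list is an assumption
# based on the imports used later in the recipe; adjust it as needed.
REQUIRED = ["IPython", "notebook", "numpy", "scipy", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All packages used in this recipe were found.")
```

The check only looks up each package without importing it, so it runs quickly and does not depend on any of the libraries being installed.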

How to do it...

The following steps demonstrate how to start Jupyter and create a new notebook:

  1. Open a command window in the directory where you want your notebook files stored.
  2. Start Jupyter by running the following command in the command window:
     jupyter notebook
  3. After a few moments, the notebook will open in your web browser. The notebook starting page is known as the dashboard.
  4. To create a new notebook, click the New button at the top right and select Python 3 from the menu.

Code in a notebook is entered in an execution cell, which is surrounded by a green border when active. To get a taste of what working with the Jupyter notebook feels like, click on an execution cell in the notebook and enter the following code:

%matplotlib inline 
import numpy as np
import matplotlib.pyplot as plt

With the cursor still in the same execution cell, press Shift + Enter to run the cell. The first line is the magic command %matplotlib inline, which tells Jupyter to display plots in the notebook itself. The next two lines import NumPy and pyplot, the plotting interface provided by matplotlib.

After running the cell, the cursor automatically moves to the next cell. Enter the following code in this cell:

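from scipy.stats import norm, binom

# Each experiment: n tosses of a coin with probability p of heads
n, p = 100, 0.5
mean = n * p                      # expected number of heads
sdev = np.sqrt(n * p * (1 - p))   # standard deviation of the number of heads

# Repeat the experiment 1,000 times
sample = binom.rvs(n, p, size=1000)

# Normal approximation, evaluated within three standard deviations of the mean
xvalues = np.linspace(mean - 3*sdev, mean + 3*sdev, 200)
yvalues = norm.pdf(xvalues, loc=mean, scale=sdev)

# density=True replaces normed=True, which recent Matplotlib releases no longer accept
hist = plt.hist(sample, density=True,
                color='red', lw=3, ls='dotted', alpha=0.5)
plt.plot(xvalues, yvalues, color='blue', lw=2)
plt.title('Coin toss simulation, $n={}$, $p={:5.2f}$'.format(n, p))
plt.xlabel('Number of heads')
plt.ylabel('Frequency')
None  # suppress the text output of the last expression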

This code simulates 100 tosses of a fair coin and counts the number of heads. The simulation is repeated 1,000 times and the results are stored in the array sample. A histogram of the results is then plotted together with the normal approximation suggested by the central limit theorem, which here has mean np = 50 and standard deviation sqrt(np(1-p)) = 5. Pressing Shift + Enter to run the cell produces a plot showing both the histogram of the simulation and the theoretical normal curve for the number of heads.
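
To see how close the normal approximation is without relying on a random sample, the exact binomial probabilities can be compared with the normal density directly. The following is a minimal sketch that uses only the standard library. The helpers binom_pmf and normal_pdf are hypothetical names written for this illustration; they are not the scipy.stats functions used in the recipe.

```python
import math

n, p = 100, 0.5
mean = n * p                         # 50 heads expected
sdev = math.sqrt(n * p * (1 - p))    # standard deviation of the number of heads


def binom_pmf(k, n, p):
    """Exact probability of k heads in n tosses (illustrative helper)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)


def normal_pdf(x, mean, sdev):
    """Density of the normal distribution with the given mean and sdev."""
    z = (x - mean) / sdev
    return math.exp(-0.5 * z * z) / (sdev * math.sqrt(2 * math.pi))


# Largest absolute gap between the exact probabilities and the normal curve
max_gap = max(abs(binom_pmf(k, n, p) - normal_pdf(k, mean, sdev))
              for k in range(n + 1))

print("exact P(50 heads):     ", binom_pmf(50, n, p))
print("normal density at 50:  ", normal_pdf(50, mean, sdev))
print("largest gap over all k:", max_gap)
```

Because the bins of the exact distribution are one unit wide, the probability of k heads can be compared directly with the density at k. Changing n or p in the sketch shows how the quality of the approximation depends on the number of tosses.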