Plotting our data in a histogram as a probability distribution tells matplotlib to normalize the histogram so that its total area integrates to 1, scaling the bar heights accordingly. Rather than showing how many values fall into each bin, as in the previous recipe, each bar now shows a probability density: its height multiplied by the bin width gives the probability of finding a value in that bin.
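To make the scaling concrete, here is a minimal sketch that uses NumPy's `histogram` function, which performs the same normalization as matplotlib's `hist` when `density=True` is passed. The sample data is synthetic and stands in for a real column; the variable names are illustrative assumptions, not part of this recipe.

```python
import numpy as np

# Synthetic sample data (an assumption for illustration; not the accidents dataset).
rng = np.random.default_rng(seed=0)
values = rng.normal(loc=50, scale=10, size=1000)

# density=True rescales the bin heights so that the total area is 1.
# plt.hist(values, density=True) applies the same normalization
# (older matplotlib releases spelled this option normed=True).
heights, edges = np.histogram(values, bins=20, density=True)
widths = np.diff(edges)

# Height times width is the probability of a value landing in each bin.
bin_probabilities = heights * widths
total_area = bin_probabilities.sum()
print(round(total_area, 6))
```

Because the heights are densities, an individual bar can be taller than 1 when the bins are narrow; only the areas are guaranteed to sum to 1.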
To create a probability distribution for a single column in a Pandas DataFrame, begin by importing all the required libraries. To show the matplotlib plots in IPython Notebook, we will use an IPython magic function, which starts with %:

%matplotlib inline
import pandas as pd
import numpy as np
from pymongo import MongoClient
import matplotlib as mpl
import matplotlib.pyplot as plt

Next, connect to MongoDB, and run a query specifying the five fields to be retrieved from the MongoDB data:

client = MongoClient('localhost', 27017)
db = client.pythonbicookbook
collection = db.accidents
fields = {'Date':1, 'Police_Force':1, ...