In this recipe, you'll learn how to create a probability distribution histogram of two variables. This plot comes in handy when you are trying to see how much overlap there is between two variables in your data.
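To make the idea of "overlap" concrete before touching the real data, here is a minimal sketch that uses synthetic values rather than the accident records. The helper names (`density_histograms`, `overlap_area`, `plot_overlap`) and the sample arrays are illustrative assumptions, not part of the recipe. The sketch bins both variables on shared edges, normalizes each histogram so its area is 1, and measures how much area the two share.

```python
import numpy as np


def density_histograms(a, b, bins=10):
    """Return shared bin edges and per-bin probability densities for two samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Use one set of edges for both samples so the bars line up bin for bin.
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=bins)
    dens_a, _ = np.histogram(a, bins=edges, density=True)
    dens_b, _ = np.histogram(b, bins=edges, density=True)
    return edges, dens_a, dens_b


def overlap_area(edges, dens_a, dens_b):
    """Area common to both density histograms: 0 means disjoint, 1 means identical."""
    widths = np.diff(edges)
    return float(np.sum(np.minimum(dens_a, dens_b) * widths))


def plot_overlap(a, b, bins=10, labels=("A", "B")):
    """Draw both density histograms on one set of axes, semi-transparent."""
    import matplotlib.pyplot as plt  # imported here so the numeric helpers work without it

    edges, _, _ = density_histograms(a, b, bins)
    fig, ax = plt.subplots()
    ax.hist(a, bins=edges, density=True, alpha=0.5, label=labels[0])
    ax.hist(b, bins=edges, density=True, alpha=0.5, label=labels[1])
    ax.set_ylabel("Probability density")
    ax.legend()
    return fig


# Synthetic stand-ins for two count-like variables (made-up data, not the accident records).
rng = np.random.default_rng(0)
vehicles = rng.poisson(2, size=1000)
casualties = rng.poisson(1, size=1000)

edges, dens_v, dens_c = density_histograms(vehicles, casualties)
shared = overlap_area(edges, dens_v, dens_c)
print("Shared area between the two distributions:", round(shared, 3))
```

In a notebook, calling `plot_overlap(vehicles, casualties, labels=("Vehicles", "Casualties"))` would draw the two histograms on top of each other. The `alpha=0.5` transparency is what lets the overlapping region show through.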
To plot two sets of values in a probability distribution, begin by importing all the required libraries. To show the matplotlib plots in IPython Notebook, we will use an IPython magic function, which starts with %:

%matplotlib inline
import pandas as pd
import numpy as np
from pymongo import MongoClient
import matplotlib as mpl
import matplotlib.pyplot as plt
Next, connect to MongoDB and run a query specifying the five fields to be retrieved from the MongoDB data:
client = MongoClient('localhost', 27017)
db = client.pythonbicookbook
collection = db.accidents
fields = {'Date': 1,
          'Police_Force': 1,
          'Accident_Severity': 1,
          'Number_of_Vehicles': 1,
          'Number_of_Casualties': 1}
data = collection...