We now get to the substance of array programming with NumPy. We will perform manipulations and computations on ndarrays.
Let's first import NumPy, pandas, matplotlib, and seaborn:
In [1]: import numpy as np
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns
        %matplotlib inline
We load the NYC taxi dataset with pandas:
In [2]: data = pd.read_csv('../chapter2/data/nyc_data.csv',
                           parse_dates=['pickup_datetime',
                                        'dropoff_datetime'])
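As a minimal sketch of what parse_dates does, the following reads a tiny in-memory CSV instead of the real file. The two sample rows and the passenger_count column are hypothetical and are only there to make the example self-contained.

```python
import io

import pandas as pd

# Hypothetical two-row sample, not the real nyc_data.csv.
csv_text = (
    "pickup_datetime,dropoff_datetime,passenger_count\n"
    "2013-01-01 00:00:00,2013-01-01 00:05:00,1\n"
    "2013-01-01 00:05:00,2013-01-01 00:21:00,2\n"
)

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["pickup_datetime", "dropoff_datetime"],
)

# The listed columns are parsed as datetimes rather than plain strings,
# so subtracting them gives a trip duration directly.
duration = sample["dropoff_datetime"] - sample["pickup_datetime"]
print(sample.dtypes)
print(duration)
```

Without parse_dates, both columns would be loaded as strings and the subtraction would fail.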
We get the pickup and dropoff locations of the taxi rides as ndarrays, using the .values attribute of pandas DataFrames:
In [3]: pickup = data[['pickup_longitude', 'pickup_latitude']].values
        dropoff = data[['dropoff_longitude', 'dropoff_latitude']].values
        pickup
Out[3]: array([[-73.955925, 40.781887],
               [-74.005501, 40.745735],
               [-73.969955, 40.79977...
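To see what this conversion yields without loading the full dataset, here is a small sketch on a hypothetical two-row DataFrame (the name toy and the passenger_count column are assumptions made for illustration). It also shows DataFrame.to_numpy(), which the pandas documentation recommends over .values in recent versions; both give the same array here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the taxi DataFrame, with two rides only.
toy = pd.DataFrame({
    "pickup_longitude": [-73.955925, -74.005501],
    "pickup_latitude": [40.781887, 40.745735],
    "passenger_count": [1, 2],
})

# Selecting two columns and taking .values gives a 2D ndarray:
# one row per ride, one column per selected field.
coords = toy[["pickup_longitude", "pickup_latitude"]].values
print(type(coords), coords.shape, coords.dtype)

# to_numpy() is the accessor recommended by the pandas documentation.
same = toy[["pickup_longitude", "pickup_latitude"]].to_numpy()
print(np.array_equal(coords, same))
```

The resulting array has shape (number of rides, 2), with longitude in column 0 and latitude in column 1, following the order of the selected columns.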