The Data Science Workshop
Previously, we learned about the overall structure of a dataset and the kind of information it contains. Now, it is time to really dig into it and look at the values of each column.
First, we need to import the pandas package:
import pandas as pd
Then, we'll load the data into a pandas DataFrame:
file_url = 'https://github.com/PacktWorkshops/The-Data-Science-Workshop/blob/master/Chapter10/dataset/Online%20Retail.xlsx?raw=true'
df = pd.read_excel(file_url)
The pandas package provides several methods for displaying a snapshot of your dataset. The most popular ones are head(), tail(), and sample().
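To see how the three methods differ without downloading anything, here is a minimal sketch on a small made-up DataFrame. The toy variable and its values are hypothetical and only borrow two column names from the retail dataset. The random_state argument is used here so that sample() returns the same rows on every run.

```python
import pandas as pd

# Hypothetical toy data: ten rows that only mimic the retail dataset's columns
toy = pd.DataFrame({
    "InvoiceNo": [str(536365 + i) for i in range(10)],
    "Quantity": list(range(1, 11)),
})

first_rows = toy.head()    # first five rows by default
last_rows = toy.tail(3)    # last three rows
random_rows = toy.sample(n=4, random_state=0)  # four randomly chosen rows, reproducible

print(first_rows)
print(last_rows)
print(random_rows)
```

Both head() and tail() accept an optional number of rows, while sample() draws rows at random, which can surface values that the top or bottom of a sorted file would hide.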
The head() method will show the top rows of your dataset. By default, pandas will display the first five rows:
df.head()
You should get the following output:
Figure 10.5: Displaying the first five rows using the head() method
The output of the head() method shows that the InvoiceNo, StockCode, and...