3.4 ADDING AN INDEX FIELD
The data scientist may want to augment the data set with new variables that can enhance understanding. For example, not all data sets come equipped with an ID field, and the bank_marketing data sets are among those that do not. We can therefore add an index field to the data, which serves two purposes: (i) it acts as an ID field for data sets that lack one, and (ii) it preserves the original sort order of the records. In data science, we often repartition and re-sort the data, so an index field lets us recover the original sort order whenever we need it. How to add an index field using Python and R follows.
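To make the second purpose concrete, here is a minimal Python sketch of the idea. It assumes a small made-up data frame rather than the bank_marketing data, and the names toy, by_age, restored, and the column name "index" are hypothetical choices for illustration only.

```python
import pandas as pd

# Hypothetical toy data; this is not the bank_marketing data set.
toy = pd.DataFrame({
    "age": [41, 29, 35],
    "job": ["admin", "student", "technician"],
})

# Add an index field that records each record's original position (1, 2, 3).
toy["index"] = list(range(1, len(toy) + 1))

# Re-sort by another variable; the index field travels with each record.
by_age = toy.sort_values("age")

# Recover the original sort order by sorting on the index field.
restored = by_age.sort_values("index")

print(by_age)
print(restored)
```

After sorting by age, the records appear in a new order, but sorting on the index field returns them to the order in which they were first read.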
3.4.1 How to Add an Index Field Using Python
First, we need to load the required package, using the code discussed in the previous chapter.
import pandas as pd
Next, import the data set under the name bank_train by using the read_csv() command and specifying the file's location.
bank_train = pd.read_csv("C:/.../bank_marketing_training...