Formally, our problem is the following: applying the estimated models to new, unlabeled data in order to get a prediction of the response variable.
To do this, we are going to leverage the predict() function, which basically takes the following arguments:
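object, that is, the object resulting from the estimation activity
newdata, pointing to a data frame storing the new data on which to perform the prediction activity (note the spelling without an underscore: base R methods such as predict.lm() silently ignore an argument named new_data and return predictions for the training data instead)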
The function will return a vector storing the obtained new predictions.
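To make the call concrete, here is a minimal sketch of predict() at work. It assumes a plain linear model fitted on the built-in mtcars dataset, which is only a stand-in for the models we estimated earlier; the names fit and new_cars are made up for this illustration.

```r
# Stand-in model: a linear regression on the built-in mtcars data
# (an assumption for illustration, not one of the models estimated earlier)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# New, unlabeled observations: same predictor columns, no response column
new_cars <- data.frame(wt = c(2.5, 3.2), hp = c(110, 150))

# Check that every predictor used by the model is available in the new data
needed <- all.vars(delete.response(terms(fit)))
missing_cols <- setdiff(needed, names(new_cars))

# predict() returns one predicted value per row of newdata
predictions <- predict(fit, newdata = new_cars)
length(predictions)
```

The check on missing_cols is optional, but it is a cheap way to catch a mismatch between the columns the model expects and the ones the new data actually provides before asking for predictions.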
All good then, but on which data do you think we are going to apply our models? I have got here the customer list of the Middle East area, as of one year ago. We are going to apply our models to it. Let's assume that it is a .xlsx file, so first of all we have to import it, employing our well-known import() function:
me_customer_list <- import("middle_east_customer_list.xlsx")
Let's have a look at its attributes via the str() function, as you should be used to doing by now...