Besides implementing a loop function to perform k-fold cross-validation, you can use the tuning functions in the e1071 package (for example, tune.nnet, tune.randomForest, tune.rpart, tune.svm, and tune.knn) to obtain the minimum error value. In this recipe, we will illustrate how to use tune.svm to perform 10-fold cross-validation and obtain the optimum classification model.
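For comparison, the hand-written loop approach mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the recipe's own code: the helper name kfold_error and its arguments are hypothetical, and the built-in mtcars data with a logistic regression stand in for the churn dataset and the SVM so that the sketch runs with base R alone.

```r
# Hypothetical helper: estimate classification error with a hand-written k-fold loop.
# mtcars and glm are stand-ins (assumptions) for the recipe's churn data and SVM.
kfold_error <- function(data, k, fit_fn, predict_fn, label_col) {
  n <- nrow(data)
  # Randomly assign each row to one of k folds of (nearly) equal size
  fold_id <- sample(rep(seq_len(k), length.out = n))
  fold_errors <- numeric(k)
  for (i in seq_len(k)) {
    train <- data[fold_id != i, , drop = FALSE]
    test  <- data[fold_id == i, , drop = FALSE]
    model <- fit_fn(train)
    predicted <- predict_fn(model, test)
    # Misclassification rate on the held-out fold
    fold_errors[i] <- mean(predicted != test[[label_col]])
  }
  # Average the per-fold error rates
  mean(fold_errors)
}

set.seed(42)  # fold assignment is random, so fix the seed for repeatability
cv_error <- kfold_error(
  data = mtcars,
  k = 10,
  fit_fn = function(d) glm(am ~ wt, data = d, family = binomial),
  predict_fn = function(m, d) {
    as.integer(predict(m, newdata = d, type = "response") > 0.5)
  },
  label_col = "am"
)
print(cv_error)
```

The tuning functions in e1071 wrap this kind of loop, which is why the rest of the recipe needs only a single call.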
In this recipe, we continue to use the telecom churn dataset as the input data source to perform 10-fold cross-validation.
Perform the following steps to retrieve the minimum estimation error using cross-validation:
Apply tune.svm on the training dataset, trainset, with 10-fold cross-validation as the tuning control. (If you encounter an error message such as could not find function predict.func, clear the workspace, restart the R session, and reload the e1071 library.)
> tuned = tune.svm(churn~., data = trainset, gamma = 10^-2, cost = 10^2, tunecontrol=tune.control(cross=10))
Next, you can obtain the summary information of the model, tuned:
> summary(tuned)
Error estimation of 'svm' using 10-fold cross validation: 0.08164651
Then, you can access the performance details of the tuned model:
> tuned$performances
  gamma cost      error dispersion
1  0.01  100 0.08164651 0.02437228
Lastly, you can use the optimum model to generate a classification table:
> svmfit = tuned$best.model
> table(trainset[,c("churn")], predict(svmfit))
       yes   no
  yes  234  108
  no    13 1960
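To read the classification table as numbers, the following sketch rebuilds it by hand and derives a few summary measures. The variable names are illustrative assumptions. Because predict(svmfit) is called without new data, the table describes the training set, so its error rate is not directly comparable to the cross-validated estimate.

```r
# Rebuild the classification table from the recipe output
# (rows: actual churn, columns: predicted churn). Names are illustrative.
conf_mat <- matrix(c(234, 13, 108, 1960), nrow = 2,
                   dimnames = list(actual = c("yes", "no"),
                                   predicted = c("yes", "no")))

# Share of correctly classified rows: (234 + 1960) / 2315, about 0.9477
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Training error rate: 121 / 2315, about 0.0523
error_rate <- 1 - accuracy

# Share of actual churners that the model catches: 234 / 342
recall_yes <- conf_mat["yes", "yes"] / sum(conf_mat["yes", ])

# Share of predicted churners that really churned: 234 / 247
precision_yes <- conf_mat["yes", "yes"] / sum(conf_mat[, "yes"])

print(round(c(accuracy = accuracy, error = error_rate,
              recall = recall_yes, precision = precision_yes), 4))
```

The training error of about 0.052 is lower than the 10-fold estimate of 0.0816, which is the expected direction: a model usually looks better on the data it was fitted to than on held-out folds.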
The e1071 package provides miscellaneous functions to build and assess models, so you do not need to reinvent the wheel to evaluate a fitted model. In this recipe, we use the tune.svm function to tune the SVM model with the given formula, dataset, gamma, cost, and control settings. Within the tune.control options, we set cross=10, which performs 10-fold cross-validation during the tuning process. The tuning process returns the minimum estimation error, the performance details, and the best model found. We can therefore inspect the performance measures of the tuning and use the optimum model to generate a classification table.
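In the recipe, gamma and cost each receive a single value, so only one parameter combination is evaluated. The sketch below shows how vectors of candidate values could be passed instead, so that tune.svm cross-validates every combination and keeps the best one. It assumes the e1071 package is installed; the built-in iris data stands in for the churn training set, and the grid values are illustrative rather than recommended settings.

```r
library(e1071)

set.seed(123)  # fold assignment is random, so fix the seed for repeatability

# iris is a stand-in (assumption) for the recipe's trainset;
# 4 gamma values x 3 cost values gives 12 combinations to cross-validate.
tuned_grid <- tune.svm(Species ~ ., data = iris,
                       gamma = 10^(-3:0), cost = 10^(0:2),
                       tunecontrol = tune.control(cross = 10))

# gamma/cost pair with the lowest cross-validated error
print(tuned_grid$best.parameters)

# The lowest cross-validated error itself
print(tuned_grid$best.performance)

# One row per combination, with its error and dispersion
print(tuned_grid$performances)
```

As in the recipe, tuned_grid$best.model holds the SVM refitted with the winning parameters, so it can be passed to predict in the same way as svmfit.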