The variable importance plot lists the most significant variables in descending order of mean decrease in Gini. Variables at the top contribute more to the model than those at the bottom and have higher predictive power in classifying default and non-default customers.
Surprisingly, grid search in Python scikit-learn does not expose variable importance directly, so we take the best parameters found by grid search and plot the variable importance graph with a plain scikit-learn random forest. In R programming that provision does exist, hence the R code is more compact here:
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> rf_fit = RandomForestClassifier(n_estimators=1000, criterion="gini", max_depth=300, min_samples_split=3, min_samples_leaf=1)
>>> rf_fit.fit(x_train, y_train)
>>> importances = rf_fit.feature_importances_
>>> std = np.std([tree.feature_importances_ for tree in rf_fit.estimators_], axis=0)
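As a sketch of the rest of the workflow, the importances can be ranked in descending order of mean decrease in Gini, matching the ordering described above. The synthetic data, modest hyperparameters, and generic feature names below are assumptions for illustration only; in practice `x_train`, `y_train`, and the grid-search best parameters would be used:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data standing in for the credit-default set
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           random_state=42)

rf = RandomForestClassifier(n_estimators=100, criterion="gini", random_state=42)
rf.fit(X, y)

# Indices of features sorted by importance, largest mean decrease in Gini first
indices = np.argsort(rf.feature_importances_)[::-1]

# Print the ranking; a bar chart of these values gives the importance plot
for rank, i in enumerate(indices, start=1):
    print(f"{rank}. feature_{i}: {rf.feature_importances_[i]:.3f}")
```

Feeding `indices` to `plt.bar` (with the per-tree standard deviation as error bars) reproduces the usual scikit-learn importance plot.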