In this chapter, we learned a great deal about methodologies for selecting subsets of features in order to improve the performance of our machine learning pipelines, both in predictive power and in time complexity.
The dataset that we chose had a relatively low number of features. When selecting from a very large set of features (over a hundred), however, the methods in this chapter will likely become too cumbersome. We saw this when attempting to optimize a CountVectorizer
pipeline: not only would the time it takes to run a univariate test on every feature be astronomical, but we would also run a greater risk of encountering multicollinearity in our features by sheer coincidence.
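As a reminder of the pattern in question, here is a minimal sketch (not the exact pipeline from this chapter, and using toy data invented for illustration) of applying a univariate test to CountVectorizer features with scikit-learn's SelectKBest. Every token becomes a feature, so with real text corpora the number of features to test grows into the tens of thousands:

```python
# A small sketch of univariate feature selection on text features.
# The documents and labels below are hypothetical toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

docs = [
    "free money now", "limited offer free",
    "meeting at noon", "project status update",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam (toy labels)

pipe = Pipeline([
    ("vect", CountVectorizer()),         # one feature per unique token
    ("select", SelectKBest(chi2, k=5)),  # univariate chi-squared test per feature
    ("clf", LogisticRegression()),
])
pipe.fit(docs, labels)
print(pipe.predict(["free offer"]))
```

On a real corpus, the vectorizer step alone can produce tens of thousands of columns, and scoring each one individually (let alone grid-searching over k) is where the time cost described above comes from.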
In the next chapter, we will introduce purely mathematical transformations that we may apply to our data matrices in order to alleviate the trouble of working with vast quantities of features, or even a few highly uninterpretable features. We will begin to work with...