8.4 Using data selection via uncertainty to keep models fresh
We saw at the beginning of the chapter that we can use uncertainties to figure out whether data is part of the training data or not. We can expand on this idea in the context of an area of machine learning called active learning. The promise of active learning is that a model can learn more effectively from less data if we have a way to control the type of data it is trained on. Conceptually, this makes sense: if we train a model on data that is not of sufficient quality, the model will not perform well either. Active learning guides the learning process, and the data a model is trained on, by providing functions that acquire data from a pool of data that is not yet part of the training data. By iteratively selecting the right data from the pool, we can train a model that performs better than if we had chosen the data from the pool at random.
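To make the idea of acquiring data from a pool more concrete, here is a minimal sketch of a single selection step. It assumes that the model outputs class probabilities for every point in the pool and that we use predictive entropy as the uncertainty measure; other uncertainty measures could be swapped in. The function names (`predictive_entropy`, `acquire_most_uncertain`, `active_learning_step`) and the toy numbers are hypothetical and only serve as an illustration.

```python
import numpy as np


def predictive_entropy(probs):
    """Entropy of each row of class probabilities (higher = more uncertain)."""
    eps = 1e-12  # avoids log(0) for probabilities that are exactly zero
    return -np.sum(probs * np.log(probs + eps), axis=1)


def acquire_most_uncertain(probs, n_acquire):
    """Return positions of the n_acquire pool points with the highest entropy."""
    scores = predictive_entropy(probs)
    return np.argsort(scores)[::-1][:n_acquire]


def active_learning_step(train_idx, pool_idx, pool_probs, n_acquire):
    """Move the most uncertain pool points into the training set.

    Row i of pool_probs is assumed to be the prediction for pool_idx[i].
    """
    picked = acquire_most_uncertain(pool_probs, n_acquire)
    new_train_idx = np.concatenate([train_idx, pool_idx[picked]])
    new_pool_idx = np.delete(pool_idx, picked)
    return new_train_idx, new_pool_idx


# Toy example: made-up predictions for four pool points and three classes.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # close to uniform, very uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])
train_idx = np.array([10, 11])         # hypothetical ids of labeled data
pool_idx = np.array([20, 21, 22, 23])  # hypothetical ids of pool data

selected = acquire_most_uncertain(pool_probs, n_acquire=2)
new_train_idx, new_pool_idx = active_learning_step(
    train_idx, pool_idx, pool_probs, n_acquire=2
)
```

In this toy pool, the second and third points have the highest entropy, so they are the ones moved into the training set. In a full loop we would label the acquired points, retrain the model, recompute the pool predictions, and repeat. Replacing `acquire_most_uncertain` with a random draw from the pool gives the baseline that an acquisition function is meant to beat.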
Active learning can be used in many modern-day systems where there is a ton of unlabeled...