Summary
The assumptions in stream-based learning differ from those in batch-based learning, chief among them being upper bounds on operating memory and computation time. To scale to a potentially infinite stream of data, running statistics must be computed using sliding windows or sampling. We distinguish between learning from stationary data, where the generating data distribution is assumed to be constant, and learning from dynamic or evolving data, where concept drift must be accounted for. Concept drift is detected by monitoring changes in model performance or changes in the data distribution. Explicit and implicit adaptation methods then adjust the model to the concept change.
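As a minimal sketch of two of these ideas, the following Python code keeps a running mean over a bounded sliding window and uses it to monitor a stream of 0/1 prediction errors. The names `SlidingWindowMean` and `detect_drift`, the window size, and the threshold are illustrative assumptions, not taken from any library. The drift test is a deliberately simple heuristic (compare the recent error rate with the error rate of the first full window) and is not one of the established drift detectors.

```python
from collections import deque


class SlidingWindowMean:
    """Running mean over the last `size` values, using O(size) memory."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.values = deque()
        self.total = 0.0

    def add(self, x):
        # Add the new value, then evict the oldest one if the window overflows.
        self.values.append(x)
        self.total += x
        if len(self.values) > self.size:
            self.total -= self.values.popleft()

    def full(self):
        return len(self.values) == self.size

    def mean(self):
        return self.total / len(self.values) if self.values else 0.0


def detect_drift(errors, window=50, threshold=0.2):
    """Return the index at which drift is first flagged, or None.

    `errors` is an iterable of 0/1 prediction errors (1 = misclassified).
    Assumption: the first full window is treated as the reference error rate,
    and drift is flagged when the recent error rate exceeds it by more than
    `threshold`. This is a simple illustration, not a published detector.
    """
    recent = SlidingWindowMean(window)
    reference = None
    for i, e in enumerate(errors):
        recent.add(e)
        if not recent.full():
            continue  # not enough data yet for a stable estimate
        if reference is None:
            reference = recent.mean()
        elif recent.mean() - reference > threshold:
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream: the model is always right, then always wrong.
    stream = [0] * 100 + [1] * 100
    print("drift flagged at index:", detect_drift(stream))
```

Each update costs constant time and memory stays bounded by the window size, which is what allows the same code to run on an unbounded stream. A larger window gives a smoother error estimate but reacts more slowly to a change.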
Several supervised and unsupervised learning methods have been adapted for incremental online learning. Supervised methods include linear, non-linear, and ensemble techniques. The HoeffdingTree is introduced, which is particularly interesting largely because of its guarantees on...