Stemming is a transformation applied to a single word, but in practice it is usually performed right after tokenizing. Stemming after tokenizing is so common that natural.js offers a tokenizeAndStem convenience method that can be attached to the String class prototype.
Specifically, stemming reduces a word to its root form, for instance by transforming running to run. Stemming your text after tokenizing can significantly shrink the vocabulary of your dataset, and with it the dataset's entropy, because it essentially de-duplicates words with similar meanings but different tenses or inflections. Your algorithm will not need to learn the words run, runs, running, and runnings separately, as they will all be transformed into run.
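To make the de-duplication effect concrete, here is a minimal sketch of a toy suffix-stripping stemmer. It is not the Porter stemmer and it is not the natural.js API: the names toyStem and toyTokenizeAndStem are hypothetical helpers invented for this illustration, and the rules handle only a plural s and an -ing ending. It is enough to show how four surface forms can collapse into one vocabulary entry.

```javascript
// Toy stemmer for illustration only. This is NOT the Porter algorithm and
// NOT part of natural.js; both function names are hypothetical.
function toyStem(word) {
  let stem = word.toLowerCase();

  // Rule 1: strip a trailing plural "s", leaving "ss" endings alone.
  if (stem.length > 3 && stem.endsWith('s') && !stem.endsWith('ss')) {
    stem = stem.slice(0, -1);
  }

  // Rule 2: strip "ing" when enough of the word remains to be a stem.
  if (stem.length > 5 && stem.endsWith('ing')) {
    stem = stem.slice(0, -3);
    // Undo consonant doubling, e.g. "runn" -> "run" (but keep ll, ss, zz).
    if (/([^aeiouy])\1$/.test(stem) && !/(ll|ss|zz)$/.test(stem)) {
      stem = stem.slice(0, -1);
    }
  }

  return stem;
}

// Tokenize on anything that is not a letter, then stem each token.
function toyTokenizeAndStem(text) {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter(Boolean)
    .map(toyStem);
}

const text = 'run runs running runnings';
const rawVocabulary = new Set(text.split(' '));
const stemmedVocabulary = new Set(toyTokenizeAndStem(text));

console.log(rawVocabulary.size);      // 4 distinct surface forms
console.log(stemmedVocabulary.size);  // 1 distinct stem: "run"
```

A toy like this breaks quickly on real text (it would turn this into thi, for example), which is why production stemmers use a larger, staged rule set.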
The most popular stemming algorithm, the Porter stemmer, is a heuristic algorithm that defines a number of staged rules for the transformation. But, in essence...