Stemming is the process of reducing an English word to its base form.
Some examples are shown as follows:
[ running , ran ] => run
[ laughing , laugh , laughed ] => laugh
With this, our search gets a lot better. When someone searches for run, all documents are shown as a match, no matter whether they contain the word running or ran.
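To make the effect on search concrete, here is a minimal sketch in Python. It assumes a hypothetical hand-written stem table (STEMS) built only from the examples above, and a toy in-memory index; the function names stem, build_index, and search are made up for illustration and do not belong to any search engine's API. The point is only that the query and the documents are reduced to the same base form before they are compared.

```python
# Hypothetical stem table taken from the examples in the text; a real
# system would use a stemming algorithm or a full dictionary instead.
STEMS = {
    "running": "run", "ran": "run",
    "laughing": "laugh", "laughed": "laugh",
}

def stem(word):
    """Return the base form of a word, or the lowercased word if unknown."""
    word = word.lower()
    return STEMS.get(word, word)

def build_index(docs):
    """Map each stem to the set of ids of documents containing that stem."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(stem(token), set()).add(doc_id)
    return index

def search(index, query):
    """Stem the query the same way as the documents, then look it up."""
    return sorted(index.get(stem(query), set()))

docs = {
    1: "She was running late",
    2: "He ran home",
    3: "They laughed a lot",
}
index = build_index(docs)
print(search(index, "run"))  # prints [1, 2]
```

Because the same stem function is applied at index time and at query time, a search for run finds the documents that only contain running or ran.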
To achieve this, we have two approaches:
Let's see how the algorithmic approach can be implemented and what its pros and cons are.
Our first choice is the Snowball algorithm. It's a powerful algorithm that finds the stem of a given word using an algorithm...
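To give a feel for what an algorithmic stemmer does, here is a deliberately tiny suffix-stripping sketch in Python. It is not the Snowball algorithm, whose rule set is far larger; the function name toy_stem and its three suffix rules are assumptions made up for illustration. It shows both the appeal of the approach (no word list is needed) and one typical limitation (irregular forms such as ran are not covered by suffix rules).

```python
def toy_stem(word):
    """Strip a few common English suffixes; a toy, not the Snowball rules."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three letters remain afterwards.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # After "ing"/"ed", undo consonant doubling: "runn" -> "run".
            if (
                suffix in ("ing", "ed")
                and word[-1] == word[-2]
                and word[-1] not in "aeiouls"
            ):
                word = word[:-1]
            break
    return word

for w in ["running", "laughing", "laughed", "laugh", "ran"]:
    print(w, "=>", toy_stem(w))
```

With these rules, running becomes run and laughing and laughed both become laugh, but ran stays ran, because no suffix rule can turn it into run.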