Summary
In this chapter, we examined the performance difference between the skip-gram and CBOW algorithms. For the comparison, we used t-SNE, a popular two-dimensional visualization technique, which we also briefly introduced, touching on the fundamental intuition and mathematics behind the method.
Next, we introduced you to several extensions to the Word2vec algorithms that boost their performance, followed by several novel algorithms based on the skip-gram and CBOW algorithms. Structured skip-gram extends the skip-gram algorithm by preserving the position of the context word during optimization, allowing the algorithm to treat each input-output pair according to the distance between the two words. The same extension can be applied to the CBOW algorithm, which results in the continuous window algorithm.
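As a reminder of what "preserving the position of the context word" means in practice, the following is a minimal sketch of the data-generation step only. It is not the chapter's implementation, and the function names (skipgram_pairs, structured_skipgram_triples) are hypothetical. It assumes a fixed symmetric window and contrasts the plain (target, context) pairs of skip-gram with triples that also carry the signed offset of the context word, which a structured model could use to select a position-specific set of output weights.

```python
def skipgram_pairs(tokens, window):
    """Plain skip-gram: (target, context) pairs; the position is discarded."""
    pairs = []
    for i, target in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                pairs.append((target, tokens[j]))
    return pairs


def structured_skipgram_triples(tokens, window):
    """Structured variant (illustrative): keep the signed offset so that
    each relative position can be handled separately during training."""
    triples = []
    for i, target in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                triples.append((target, tokens[j], offset))
    return triples


tokens = ["the", "cat", "sat"]
pairs = skipgram_pairs(tokens, window=1)
triples = structured_skipgram_triples(tokens, window=1)

for target, context, offset in triples:
    print(f"{target!r} -> {context!r} at offset {offset:+d}")
```

In the plain version, ("cat", "the") and ("cat", "sat") are indistinguishable in kind. In the structured version they carry offsets -1 and +1, so the model can learn that a word tends to appear before or after the target rather than merely near it.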
Then we discussed GloVe, another word embedding learning technique. GloVe takes the current Word2vec algorithms a step further by incorporating global statistics into the optimization, thus increasing...