ELMo – Taking ambiguities out of word vectors
So far, we’ve looked at word embedding algorithms that assign a single, fixed representation to each word in the vocabulary. No matter what context a word appears in, its vector stays the same. Why is this a problem? Consider the following two phrases:
I went to the bank to deposit some money
and
I walked along the river bank
Clearly, the word “bank” is used in two totally different contexts. A vanilla word vector algorithm (e.g. skip-gram) can only learn one representation for “bank”, and depending on how often each sense appears in the training corpus, that vector will be muddled between the concept of a financial institution and the concept of the walkable edge of a river. It is therefore more sensible to produce embeddings for a word that preserve and leverage the context it appears in.
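To see the limitation concretely, here is a minimal sketch (with made-up toy vectors, purely for illustration) of what a static embedding amounts to: a plain lookup table, so “bank” maps to exactly the same vector in both sentences above.

```python
# Hypothetical toy vectors -- a static embedding is just a lookup table,
# so the context of a word has no effect on the vector it receives.
static_embeddings = {
    "bank": [0.21, -0.43, 0.77],   # one muddled vector covers both senses
    "money": [0.35, 0.12, -0.50],
    "river": [-0.60, 0.40, 0.10],
}

def embed(sentence):
    """Look up each known word; the surrounding words are ignored entirely."""
    return [static_embeddings[w] for w in sentence.split() if w in static_embeddings]

v_finance = embed("i went to the bank to deposit some money")[0]
v_river = embed("i walked along the river bank")[-1]
print(v_finance == v_river)  # True: both uses of "bank" get the same vector
```

A contextual model like ELMo replaces this fixed lookup with a function of the whole sentence, so the two occurrences of “bank” would receive different vectors.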