Abstract
Contextualized word embeddings, i.e., vector representations for words in context, are naturally seen as an extension of previous non-contextual distributional semantic models. In this work, we focus on BERT, a deep neural network that produces contextualized embeddings and has set the state of the art in several semantic tasks, and we study the semantic coherence of its embedding space. While showing a tendency towards coherence, BERT does not fully live up to the natural expectations for a semantic vector space. In particular, we find that the position of the sentence in which a word occurs, although it carries no meaning of its own, leaves a noticeable trace on the word embeddings and disturbs similarity relationships.
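The finding can be illustrated with a small probe. The sketch below is not the authors' experimental code; it assumes one reading of "sentence position", namely the segment slot (first vs. second sentence) in BERT's paired input, and it uses the Hugging Face `transformers` library with `bert-base-uncased`. The model name, example sentences, and helper function are illustrative assumptions. It embeds the same word in the same sentence twice, with the sentence placed in either segment, and compares the two contextualized vectors by cosine similarity.

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sent_a: str, sent_b: str, target: str) -> torch.Tensor:
    """Last-layer embedding of the first occurrence of `target`
    when (sent_a, sent_b) is fed to BERT as a sentence pair."""
    inputs = tokenizer(sent_a, sent_b, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]   # (seq_len, hidden_size)
    target_id = tokenizer.convert_tokens_to_ids(target)
    idx = (inputs["input_ids"][0] == target_id).nonzero()[0].item()
    return hidden[idx]

context = "It was a quiet afternoon."          # filler sentence, no target word
sentence = "The bank approved the loan."       # sentence containing the target word

# Same word, same sentence; only the sentence's position in the pair changes.
v_first = word_vector(sentence, context, "bank")
v_second = word_vector(context, sentence, "bank")
print(torch.cosine_similarity(v_first, v_second, dim=0).item())
```

If sentence position carried no trace, the printed cosine similarity would be close to 1; a noticeably lower value is the kind of effect the abstract describes.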
| Original language | English |
|---|---|
| Pages | 350-361 |
| Number of pages | 12 |
| DOIs | |
| Publication status | Published - 2020 |
| Event | Proceedings of the Society for Computation in Linguistics (SCiL) 2020, New Orleans, Louisiana |
| Duration | 2 Jan 2020 → 5 Jan 2020 |
Conference
| Conference | Proceedings of the Society for Computation in Linguistics (SCiL) 2020 |
|---|---|
| Period | 2/01/20 → 5/01/20 |