31 - Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling - a podcast by Allen Institute for Artificial Intelligence

from 2017-10-06T15:53:28


ICLR 2017 paper by Hakan Inan, Khashayar Khosravi, Richard Socher, presented by Waleed.

The paper presents techniques for training better language models.
It introduces a modified loss function for language modeling, in which producing a word similar to the target word is penalized less than producing a word very different from the target (I've seen this idea in other places, e.g., image classification, but not in language modeling). The authors also give theoretical and empirical justification for tying the input and output embeddings.
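The two ideas can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the vocabulary size, embedding dimension, and temperature are arbitrary toy values, and the hidden state stands in for whatever the language model produces at a timestep. Tying means the same embedding matrix is used both to look up input words and to project the hidden state onto the vocabulary; the augmented loss replaces the one-hot target with a soft distribution derived from embedding similarity to the target word.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
vocab, dim = 10, 4            # toy sizes, for illustration only

# Tied weights: one matrix E is both the input embedding table
# and (via E @ h) the output projection.
E = rng.normal(size=(vocab, dim))

h = rng.normal(size=dim)      # hidden state at some timestep (placeholder)
probs = softmax(E @ h)        # model's output distribution, using tied E

# Augmented target: instead of a one-hot vector on the target word,
# spread probability mass over words whose embeddings are close to it.
target = 3
temperature = 0.5             # assumption: illustrative value
y_aug = softmax(E @ E[target] / temperature)

# Cross-entropy against the soft target, so predicting a word whose
# embedding is near the target's is penalized less than a distant one.
loss = -np.sum(y_aug * np.log(probs))
```

With a one-hot target this reduces to the usual cross-entropy loss; the augmented target is what lets near-synonyms of the correct word escape full penalty.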

https://www.semanticscholar.org/paper/Tying-Word-Vectors-and-Word-Classifiers-A-Loss-Fra-Inan-Khosravi/424aef7340ee618132cc3314669400e23ad910ba
