Clippers Tuesday: Denis Newman-Griffis on concept embeddings

Word representations are a key technology in the NLP toolbox, but extending their success into representations of phrases and knowledge base entities has proven challenging. In this talk, I will present a method for jointly learning embeddings of words, phrases, and entities from unannotated text, using only a list of mappings between entities and surface forms. I compare these embeddings against prior methods that have relied on explicitly annotated text or the rich structure of knowledge graphs, and show that our learned embeddings better capture similarity and relatedness judgments and some relational domain knowledge.

I will also discuss experiments on augmenting the embedding model to learn soft entity disambiguation from contexts, and using member words to augment the learning of phrases. These additions harm model performance on some evaluations, and I will show some preliminary analysis of why the specific modeling approach for these ideas may not be the right one. I hope to brainstorm ideas on how to better model joint phrase-word learning and contextual disambiguation, as part of ongoing work.