Speaker: John Wieting
Date: Tuesday 07/28/2020, 2pm-3pm
Zoom link: check e-mails from AI or CL lists or e-mail email@example.com
Title: Learning and Applications of Paraphrastic Representations for Natural Language
Abstract: Representation learning has had a tremendous impact in machine learning and natural language processing (NLP), especially in recent years. Learned representations provide useful features for downstream tasks, allowing models to incorporate knowledge from billions of tokens of text. The result is better performance and generalization on many important problems. This talk focuses on learning paraphrastic representations for units of language ranging from sub-words to full sentences, with sentences as the central case. Our primary goal is to learn models that encode arbitrary word sequences into vectors such that sequences with similar semantics are near each other in the learned vector space, and such that this property transfers across domains.
We first present several simple but effective models for learning word and sentence representations from noisy paraphrases automatically extracted from bilingual corpora. These models outperform contemporary models on a variety of semantic evaluations. We then propose techniques that enable deep networks to learn effective semantic representations, addressing a limitation of our prior work. We also automatically construct a large paraphrase corpus that improves the performance of all our studied models, especially those with deep architectures, and that has found use in a variety of generation tasks such as paraphrase generation and style transfer.
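As a concrete illustration of the simple averaging approach, the following is a minimal sketch (toy vectors and names, not the speaker's actual implementation): a sentence embedding is the mean of its word vectors, and semantic similarity is the cosine between embeddings, so paraphrases land near each other in the space.

```python
# Minimal sketch of an averaging sentence-embedding model.
# The word vectors below are toy 2-d values; in practice they are
# learned from millions of automatically extracted paraphrase pairs.
import math

word_vecs = {
    "a": [0.1, 0.3],
    "cat": [0.9, 0.2],
    "feline": [0.85, 0.25],
    "sat": [0.2, 0.8],
}

def embed(sentence):
    """Sentence embedding = mean of the vectors of in-vocabulary tokens."""
    vecs = [word_vecs[w] for w in sentence.split() if w in word_vecs]
    return [sum(dims) / len(vecs) for dims in zip(*vecs)]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Paraphrases ("cat" vs. "feline") score close to 1.
print(cosine(embed("a cat sat"), embed("a feline sat")))
```

Despite its simplicity, this kind of model is the "simple but effective" baseline referred to above; the talk's contribution lies in how the vectors are trained, not in the averaging itself.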
We next propose models for multilingual paraphrastic sentence representations. Again, we first propose a simple and effective approach that outperforms more complicated methods on cross-lingual sentence similarity and bitext mining. We then propose a generative model that concentrates semantic information into a single interlingua representation and pushes information responsible for linguistic variation into separate language-specific representations. We show that this model improves on prior work on both monolingual and cross-lingual tasks and successfully disentangles these two sources of information.
Finally, we apply our representations to fine-tuning neural machine translation systems with minimum risk training. The conventional approach uses BLEU (Papineni et al., 2002), since it is the standard evaluation metric. However, we found that using an embedding model to measure similarity makes the range of possible scores continuous and, as a result, introduces fine-grained distinctions between similar translations. The result is better performance on both human evaluations and BLEU, along with faster convergence during training.
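The minimum risk objective can be sketched numerically as follows (toy numbers, not the system from the talk): the loss is the expected cost of sampled translations under the model's renormalized distribution, and replacing a discrete metric like BLEU with 1 - (embedding similarity) makes that cost vary smoothly across near-miss translations.

```python
# Simplified sketch of minimum risk training with a continuous cost.
import math

def expected_risk(log_probs, costs, alpha=1.0):
    """Expected cost of sampled hypotheses under the renormalized,
    temperature-scaled model distribution (minimum risk training)."""
    weights = [math.exp(alpha * lp) for lp in log_probs]
    z = sum(weights)
    return sum(w / z * c for w, c in zip(weights, costs))

# Three sampled hypotheses for one source sentence (hypothetical values).
log_probs = [-1.0, -1.5, -3.0]
# cost = 1 - similarity(hypothesis, reference); embedding similarity
# varies continuously, whereas sentence-level BLEU is often flat or
# zero for close-but-not-exact translations.
costs = [0.05, 0.10, 0.60]

print(expected_risk(log_probs, costs))
```

Because the cost is differentiable in the similarity score rather than piecewise-flat, gradients distinguish between translations that a sentence-level BLEU would score identically, which is consistent with the faster convergence reported above.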
Bio: John Wieting is a PhD candidate in the Language Technologies Institute at Carnegie Mellon University, advised by Graham Neubig and Taylor Berg-Kirkpatrick. He previously worked with Kevin Gimpel at the Toyota Technological Institute at Chicago and completed his MS under the guidance of Dan Roth at the University of Illinois Urbana-Champaign. His research focuses on representation learning and its applications in natural language processing. He is also interested in language generation, particularly paraphrasing and related tasks.