Clippers 8/31: Xintong Li on Self-Training for Compositional Neural NLG in Task-Oriented Dialogue

Xintong Li will present his work with Symon Jory Stevens-Guille, Aleksandre Maskharashvili and me on self-training for compositional neural NLG, including material from our upcoming INLG-21 paper along with some additional background.

Here’s the abstract for our INLG paper:

Neural approaches to natural language generation in task-oriented dialogue have typically required large amounts of annotated training data to achieve satisfactory performance, especially when generating from compositional inputs. To address this issue, we show that self-training enhanced with constrained decoding yields large gains in data efficiency on a conversational weather dataset that employs compositional meaning representations. In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than an ordinary supervised baseline; moreover, by leveraging pretrained models, data efficiency can be increased further to fifty times. We confirm the main automatic results with human evaluations and show that they extend to an enhanced, compositional version of the E2E dataset. The end result is an approach that makes it possible to achieve acceptable performance on compositional NLG tasks using hundreds rather than tens of thousands of training samples.
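To give a feel for the overall loop the abstract describes, here is a minimal, self-contained sketch of self-training with a constraint-based filter on pseudo-labels. All names and the toy template "model" are hypothetical illustrations for exposition only, not the paper's actual sequence-to-sequence implementation; the key idea shown is that generated outputs are kept as pseudo-training pairs only when they remain faithful to the input meaning representation.

```python
# Toy sketch of self-training for NLG with a faithfulness constraint.
# All function names and the template "model" are illustrative only,
# not the paper's seq2seq code.

def train(pairs):
    """'Train' a toy model: learn one surface template per slot."""
    model = {}
    for mr, _text in pairs:
        for slot in mr:
            # e.g. slot "city" -> template "city is {city}"
            model.setdefault(slot, f"{slot} is {{{slot}}}")
    return model

def generate(model, mr):
    """Vanilla decoding: realize each known slot with its template."""
    parts = [model[slot].format(**{slot: value})
             for slot, value in mr.items() if slot in model]
    return ", ".join(parts)

def satisfies_constraints(mr, text):
    """Constraint check: every slot value must be realized in the text."""
    return all(str(value) in text for value in mr.values())

def self_train(labeled, unlabeled_mrs, rounds=2):
    """Iteratively pseudo-label unlabeled MRs, keeping only outputs
    that pass the constraint check, then retrain on the union."""
    model = train(labeled)
    for _ in range(rounds):
        pseudo = []
        for mr in unlabeled_mrs:
            text = generate(model, mr)
            if satisfies_constraints(mr, text):  # filter unfaithful outputs
                pseudo.append((mr, text))
        model = train(list(labeled) + pseudo)
    return model
```

In the real setting the model is a neural sequence-to-sequence generator and the constraints are enforced during decoding rather than as a post-hoc string check, but the loop structure — generate, filter for faithfulness, retrain — is the same.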