Clippers 1/24: Sandro Maskharashvili on Discourse Relations in NLG

Discourse Relations: Their Role and Use in Natural Language Generation

Abstract
Speakers make extensive use of discourse connectives (e.g., but, and, so, although) when communicating information-rich messages. Discourse connectives express abstract relations, called discourse relations, between the pieces of information they connect, which makes it easier to understand the message the speaker wants to communicate. Traditional computational linguistic (CL) approaches to natural language processing rely heavily on modeling discourse relations, in both natural language generation (NLG) and parsing tasks. The recent emergence of neural network-based approaches to natural language modeling has led to remarkable advances in many CL tasks, including NLG. Nevertheless, when it comes to discourse-level phenomena, particularly the coherent use of discourse connectives, improvements are less obvious. First, I will present results of my doctoral research on the design of symbolic, grammatical approaches to discourse, which are in line with traditional CL approaches to discourse but overcome some important obstacles that previous approaches faced. Then, I will review studies we have been systematically carrying out to establish whether neural network-based approaches can be extended or revised to overcome the issues they face. Based on our results, I will argue that reinstating the central, ubiquitous status of discourse relations, by explicitly encoding them in natural language meaning representations, significantly enhances the correct and coherent generation of discourse connectives with neural network-based approaches. Finally, I will discuss ample possibilities for exploring synergies between traditional, grammatical approaches and state-of-the-art neural network-based ones to overcome critical issues such as data limitations for low-resource languages and the interpretability of the performance of neural network-based models of language.