Recent work in natural language generation has seen an increase in the use of end-to-end neural network models. We report on ongoing work exploring how well these models can generate discourse that is coherent while still preserving the content of the input. We exemplify this work with results on the generation of discourses by the widely used model BART, which we fine-tune on texts reconstructed from the Penn Discourse Treebank. These texts are structured by explicit and implicit discourse connectives, e.g. ‘but’, ‘while’, ‘however’. We show that encoding in the input the discourse relation to be expressed by the connective, e.g. ‘Contingency Cause Result’, improves how well the model expresses the intended discourse relation, including whether the connective is implicit or explicit. Metrics inspired by psycholinguistic results are discussed.
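As a rough illustration of the kind of input encoding described above, a relation label and an implicit/explicit flag can be prepended to the two arguments of a connective before fine-tuning a seq2seq model. This is a hypothetical sketch: the marker tokens (`<rel>`, `<type>`, `<arg1>`, `<arg2>`) and the exact string format are assumptions, not the paper's actual scheme.

```python
# Hypothetical input-encoding sketch for fine-tuning a seq2seq model
# (e.g. BART) to express a target discourse relation. The marker tokens
# and layout below are invented for illustration.

def build_input(arg1: str, arg2: str, relation: str, explicit: bool) -> str:
    """Prepend the target discourse relation and an implicit/explicit
    flag to the two arguments of the connective."""
    conn_type = "Explicit" if explicit else "Implicit"
    return f"<rel> {relation} <type> {conn_type} <arg1> {arg1} <arg2> {arg2}"

example = build_input(
    "the company cut prices",
    "sales rose sharply",
    "Contingency Cause Result",
    explicit=True,
)
```

At generation time, the same prefix would steer the model toward producing a connective (or none, for implicit relations) that realizes the requested relation.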
Human sentence processing appears to require assembling the meanings of words into precise interpretations, a process that can be described in terms of semantic composition operations such as extraction and argument attachment. Using a set of broad-coverage psycholinguistic corpora with annotations from a generalized categorial grammar (Nguyen et al., 2012), we test the extent to which such composition operations influence self-paced reading times, eye-tracking measures, and fMRI BOLD signal. We find evidence for effects from several operations such as argument attachment and extraction; the latter effect is confirmed in a separate test on held-out data. Our results suggest that composition operations may play an explicit role in the construction of meaning over the course of sentence processing.
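The linking hypothesis here is that per-word counts of composition operations act as predictors of latency measures. A minimal sketch, not the authors' pipeline, with invented toy data: a one-predictor least-squares fit of self-paced reading times on operation counts, where a positive slope would indicate a processing cost.

```python
# Toy illustration (invented data, not from the paper): does a per-word
# count of a composition operation (e.g. argument attachment) predict
# self-paced reading times? A one-predictor ordinary-least-squares fit.

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# per-word operation counts and reading times in ms (invented)
ops = [0, 1, 0, 2, 1, 0, 3]
rts = [310, 335, 305, 360, 340, 300, 390]
slope = ols_slope(ops, rts)  # positive slope = operations slow reading
```

In practice such effects are estimated with mixed-effects regressions controlling for word length, frequency, and surprisal, rather than a raw bivariate fit.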
Computational Models of Sentence Processing and Syntactic Acquisition
This talk will provide a survey of our recent work on models of sentence processing and syntactic acquisition. First, this talk introduces an incremental left-corner parser that incorporates information about common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules as a computational-level model of sentence processing. Experimental results show that surprisal estimates from the proposed processing model deliver fits to self-paced reading and eye-tracking data that are comparable to, and often better than, those from pre-trained neural language models, suggesting that the strong linguistic generalizations made by the proposed model may help predict humanlike processing costs that manifest in latency-based measures. Subsequently, this talk presents a neural PCFG induction model that allows a clean manipulation of the influence of subword information in grammar induction. Experiments on child-directed speech demonstrate first that the incorporation of subword information results in more accurate grammars with categories that word-based induction models have difficulty finding, and second that this effect is amplified in morphologically richer languages that rely on functional affixes to express grammatical relations.
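The surprisal linking function underlying the comparison above is standard: the processing cost predicted for a word is its negative log probability given the preceding context, surprisal(w_i) = -log2 P(w_i | w_1..w_{i-1}). A minimal sketch; the probabilities here are invented placeholders for what the incremental parser or a neural language model would supply.

```python
# Sketch of the surprisal linking function used to evaluate processing
# models against latency data. A real model (the incremental left-corner
# parser or a neural LM) would supply the conditional probabilities.

import math

def surprisal(p_word_given_context: float) -> float:
    """Surprisal in bits of a word given its preceding context."""
    return -math.log2(p_word_given_context)

# A less predictable word carries more surprisal and hence, under
# surprisal theory, a longer predicted reading time.
low, high = surprisal(0.5), surprisal(0.25)  # 1.0 and 2.0 bits
```

Model comparison then asks which model's per-word surprisals best fit the human reading-time measures, e.g. via regression likelihood.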