Clippers 2/8: Byung-Doh Oh on Computational Models of Sentence Processing and Syntactic Acquisition

Computational Models of Sentence Processing and Syntactic Acquisition

This talk will survey our recent work on models of sentence processing and syntactic acquisition. First, it introduces a computational-level model of sentence processing: an incremental left-corner parser that incorporates common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules. Experimental results show that surprisal estimates from this processing model fit self-paced reading and eye-tracking data as well as, and often better than, estimates from pre-trained neural language models, suggesting that the strong linguistic generalizations the model makes may help it predict humanlike processing costs that manifest in latency-based measures. Second, the talk presents a neural PCFG induction model in which the influence of subword information on grammar induction can be cleanly manipulated. Experiments on child-directed speech demonstrate, first, that incorporating subword information yields more accurate grammars, including categories that word-based induction models have difficulty finding, and second, that this effect is amplified in morphologically richer languages that rely on functional affixes to express grammatical relations.
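
The comparisons in the first part rest on surprisal, i.e. the negative log probability of a word given its preceding context: surprisal(w_t) = -log2 P(w_t | w_1 ... w_t-1). As a point of reference only (this is not the talk's left-corner parser, whose probabilities come from the grammar-based model), the following minimal sketch shows how per-token surprisal can be read off an off-the-shelf pretrained language model; the choice of GPT-2 and the helper function name are illustrative assumptions.

```python
# Hedged sketch: per-token surprisal from a pretrained causal LM,
# the kind of baseline estimate the talk compares against.
# Requires the Hugging Face `transformers` library; GPT-2 is illustrative.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def token_surprisals(sentence: str):
    """Return (token, surprisal-in-bits) pairs for each token after the first."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids      # (1, T)
    with torch.no_grad():
        logits = model(ids).logits                                # (1, T, V)
    # Log probability of each token given its left context.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)         # (T-1, V)
    targets = ids[0, 1:]                                          # (T-1,)
    nll_nats = -log_probs[torch.arange(targets.size(0)), targets]
    surprisal_bits = nll_nats / math.log(2)                       # nats -> bits
    tokens = tokenizer.convert_ids_to_tokens(targets.tolist())
    return list(zip(tokens, surprisal_bits.tolist()))

for tok, s in token_surprisals("The horse raced past the barn fell."):
    print(f"{tok:>10} {s:6.2f}")
```

In reading-time studies, such per-token surprisals are typically aggregated to the word level and entered as predictors in regression models of latency measures.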
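For the second part, the key design question is where subword information enters the grammar. Under the (illustrative) assumption that a neural PCFG parameterizes its terminal rules with embeddings, a sketch like the one below shows how the influence of subword information could be manipulated cleanly; the class name, character-LSTM composition, and dimensions are assumptions, not the talk's actual architecture.

```python
# Hedged sketch: contrasting word-based vs. subword-based terminal
# parameterization in a neural PCFG. All names and sizes are illustrative.
import torch
import torch.nn as nn

class TerminalScorer(nn.Module):
    """Emission model P(word | preterminal) for a neural PCFG.

    Toggling `use_subwords` switches between an atomic word-level lookup
    and a word representation composed from characters (subword info)."""

    def __init__(self, n_preterminals, vocab_size, n_chars, dim=64,
                 use_subwords=True):
        super().__init__()
        self.use_subwords = use_subwords
        self.cat_emb = nn.Embedding(n_preterminals, dim)  # preterminal categories
        if use_subwords:
            # Compose each word from its characters, exposing morphology.
            self.char_emb = nn.Embedding(n_chars, dim)
            self.char_lstm = nn.LSTM(dim, dim, batch_first=True)
        else:
            # Atomic lookup: the model cannot see inside the word.
            self.word_emb = nn.Embedding(vocab_size, dim)

    def word_reprs(self, word_ids, char_ids):
        # word_ids: (V,); char_ids: (V, max_word_len), zero-padded.
        # (Padding handling is simplified here for brevity.)
        if self.use_subwords:
            _, (h, _) = self.char_lstm(self.char_emb(char_ids))
            return h[-1]                                  # (V, dim)
        return self.word_emb(word_ids)                    # (V, dim)

    def emission_log_probs(self, word_ids, char_ids):
        """log P(word | preterminal) over the vocabulary: (n_preterminals, V)."""
        w = self.word_reprs(word_ids, char_ids)
        scores = self.cat_emb.weight @ w.T
        return torch.log_softmax(scores, dim=-1)
```

Flipping `use_subwords` while holding everything else fixed is one way to realize the clean manipulation the abstract describes: the word-level variant has no access to the functional affixes that, in morphologically rich languages, signal grammatical relations.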