Modeling incremental sentence processing with relational graph convolutional networks
We present an incremental model of sentence processing in which syntactic and semantic information influence each other in an interactive manner. To this end, a PCFG-based left-corner parser (van Schijndel et al. 2013) has previously been extended to incorporate the semantic dependency predicate context (i.e. pair; Levy & Goldberg, 2014) associated with each node in the parse tree. In order to further improve the accuracy and generalizability of this model, dense representations of semantic predicate contexts and syntactic categories are learned and utilized as features for making parsing decisions. More specifically, a relational graph convolutional network (RGCN; Schlichtkrull et al. 2018) is trained to learn representations for predicates, as well as role functions for cuing the representation associated with each of its arguments. In addition, syntactic category embeddings are learned jointly with the parsing sub-models to minimize cross-entropy loss. Ultimately, the goal of the model is to provide a measure of predictability that is sensitive to semantic context, which in turn will serve as a baseline for testing claims about the nature of human sentence processing.
Abstract: In this talk, I will present our paper accepted in AAAI 2020. We conduct a data-driven study focusing on analyzing and predicting sentence deletion — a prevalent but understudied phenomenon in document level Text Simplification on a large English text simplification corpus. We inspect various discourse-level factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected. We reveal that professional editors utilize different strategies to meet the readability standards of elementary and middle schools. To predict whether a sentence will be deleted during simplification to a certain level, we harness automatically aligned data to train a classification model. We find that discourse-level factors contribute to the challenging task of predicting sentence deletion for simplification.
Bio: Yang Zhong is a first-year Ph.D. student in the Department of Computer Science and Engineering, advised by Prof. Wei Xu. His research mainly focuses on the stylistic variation of language, as well as in the field of document level text simplification.
Linguistic Marker Discovery with BERT
Detecting politeness in text is a task that has attracted attention in recent years due to its role in identifying abusive language. Previous work have either used feature-based models or deep neural networks for this task. Due to the lack of context, feature-based models perform significantly worse compared to modern deep-learning models. We leverage pretrained Bert representations to provide clustering of words based on their context. We show how we are able to obtain interpretable contextualized features that can help reduce the gap in performance between feature-based models and deep learning approaches.