Clippers 9/28: Byung-Doh Oh on unsupervised grammar induction

Byung-Doh Oh will be presenting his work on unsupervised grammar induction, followed by some attempts to extend the project.

Character-based PCFG Induction for Modeling the Syntactic Acquisition of Morphologically Rich Languages

Unsupervised PCFG induction models, which build syntactic structures from raw text, can be used to evaluate the extent to which syntactic knowledge can be acquired from distributional information alone. However, many state-of-the-art PCFG induction models are word-based, meaning that they cannot directly inspect functional affixes, which may provide crucial information for syntactic acquisition in child learners. This work first introduces a neural PCFG induction model that allows a clean ablation of the influence of subword information in grammar induction. Experiments on child-directed speech demonstrate first that the incorporation of subword information results in more accurate grammars with categories that word-based induction models have difficulty finding, and second that this effect is amplified in morphologically richer languages that rely on functional affixes to express grammatical relations. A subsequent evaluation on multilingual treebanks shows that the model with subword information achieves state-of-the-art results on many languages, further supporting a distributional model of syntactic acquisition.
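
To make the “clean ablation” concrete, here is a minimal sketch of how a character-level encoder could stand in for a word-embedding lookup in the terminal-emission component of a neural PCFG. The class and parameter names are mine, not the paper’s, and the actual model is more involved.

import torch
import torch.nn as nn

class CharTerminalScorer(nn.Module):
    """Scores how well each word fits each preterminal category from its
    character sequence rather than from an opaque word ID (illustrative only)."""
    def __init__(self, n_chars, n_preterminals, char_dim=64, hidden_dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.encoder = nn.LSTM(char_dim, hidden_dim, batch_first=True)
        self.cat_emb = nn.Embedding(n_preterminals, hidden_dim)

    def forward(self, char_ids):
        # char_ids: (batch, max_word_len) integer character indices
        _, (h, _) = self.encoder(self.char_emb(char_ids))
        word_vec = h[-1]                             # (batch, hidden_dim)
        # unnormalized emission scores for every preterminal category
        return word_vec @ self.cat_emb.weight.T      # (batch, n_preterminals)

The word-based ablation would simply swap the character LSTM for a word embedding table, keeping everything else fixed.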

Clippers 9/14: Willy Cheung leads discussion of Yu and Ettinger’s (EMNLP-20) paper

Willy Cheung will lead a discussion of Yu and Ettinger’s (EMNLP-20) paper to help prepare for Allyson Ettinger’s upcoming invited talk on September 24:

https://aclanthology.org/2020.emnlp-main.397/

Assessing Phrasal Representation and Composition in Transformers
Lang Yu, Allyson Ettinger

Abstract

Deep transformer models have pushed performance on NLP tasks to new limits, suggesting sophisticated treatment of complex linguistic inputs, such as phrases. However, we have limited understanding of how these models handle representation of phrases, and whether this reflects sophisticated composition of phrase meaning like that done by humans. In this paper, we present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers. We use tests leveraging human judgments of phrase similarity and meaning shift, and compare results before and after control of word overlap, to tease apart lexical effects versus composition effects. We find that phrase representation in these models relies heavily on word content, with little evidence of nuanced composition. We also identify variations in phrase representation quality across models, layers, and representation types, and make corresponding recommendations for usage of representations from these models.
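
As a rough illustration of the kind of analysis involved (not the authors’ code), one can compare a transformer’s phrase representations against human similarity ratings; the phrase pairs and ratings below are made up for the example.

import torch
from scipy.stats import spearmanr
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def phrase_vec(phrase, layer=12):
    inputs = tok(phrase, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]   # (1, seq_len, dim)
    return hidden[0, 1:-1].mean(dim=0)                  # mean over tokens, dropping [CLS]/[SEP]

def cos(a, b):
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()

# hypothetical items: (phrase 1, phrase 2, human similarity rating)
pairs = [("heavy rain", "strong downpour", 4.6),
         ("heavy rain", "light jacket", 1.2)]
model_sims = [cos(phrase_vec(p1), phrase_vec(p2)) for p1, p2, _ in pairs]
human_sims = [r for _, _, r in pairs]
print(spearmanr(model_sims, human_sims))

Controlling for word overlap, as the paper does, means building phrase pairs where lexical overlap cannot explain high similarity, so that any remaining correlation reflects composition.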

Clippers 9/7: Lingbo Mo and Ash Lewis on interactive semantic parsing for KBQA

Ash Lewis and Lingbo Mo will present their work with Huan Sun and Mike White titled “Transparent Dialogue for Step-by-Step Semantic Parse Correction”. Here’s the abstract:

Existing studies on semantic parsing focus primarily on mapping a natural-language utterance to a corresponding logical form in a one-shot setting. However, because natural language can contain a great deal of ambiguity and variability, this is a difficult challenge. In this work, we investigate an interactive semantic parsing framework, which shows the user how a complex question is answered step-by-step and enables them to make corrections through natural-language feedback to each step in order to increase the clarity and accuracy of parses. We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework, and construct INSPIRED, a transparent dialogue dataset with complex questions, predicted logical forms, and step-by-step, natural-language feedback. Our experiments show that the interactive framework with human feedback can significantly improve overall parse accuracy. Furthermore, we develop a pipeline for dialogue simulation to apply the framework to various other state-of-the-art models for KBQA and substantially improve their performance as well, which sheds light on the generalizability of this framework to other parsers without further annotation effort.
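
For intuition, the step-by-step correction loop can be sketched roughly as follows; parser, feedback_model, and user are hypothetical interfaces standing in for the system components, not the actual INSPIRED code.

def interactive_parse(question, parser, feedback_model, user):
    """Show the predicted logical form one step at a time and let
    natural-language feedback revise individual steps (sketch)."""
    steps = parser.predict_steps(question)            # list of sub-logical-forms
    for i, step in enumerate(steps):
        user.show(f"Step {i + 1}: {step.explanation}")
        feedback = user.ask("Does this look right? (or describe the fix)")
        if feedback.strip().lower() not in {"yes", "y", ""}:
            # map the correction onto a revised sub-parse and re-predict
            # the remaining steps conditioned on the fix
            steps[i] = feedback_model.revise(step, feedback)
            steps[i + 1:] = parser.continue_from(question, steps[:i + 1])
    return parser.execute(steps)                      # final answer over the KB

The dialogue-simulation pipeline mentioned in the abstract essentially replaces the human user object with a simulated one.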

Clippers 8/31: Xintong Li on Self-Training for Compositional Neural NLG in Task-Oriented Dialogue

Xintong Li will present his work with Symon Jory Stevens-Guille, Aleksandre Maskharashvili and me on self-training for compositional neural NLG, including material from our upcoming INLG-21 paper along with some additional background.

Here’s the abstract for our INLG paper:

Neural approaches to natural language generation in task-oriented dialogue have typically required large amounts of annotated training data to achieve satisfactory performance, especially when generating from compositional inputs. To address this issue, we show that self-training enhanced with constrained decoding yields large gains in data efficiency on a conversational weather dataset that employs compositional meaning representations. In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than an ordinary supervised baseline; moreover, by leveraging pretrained models, data efficiency can be increased further, to fifty times less data. We confirm the main automatic results with human evaluations and show that they extend to an enhanced, compositional version of the E2E dataset. The end result is an approach that makes it possible to achieve acceptable performance on compositional NLG tasks using hundreds rather than tens of thousands of training samples.
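
A bare-bones version of the self-training loop might look like the sketch below; covers_all_slots is a crude stand-in for constrained decoding (which in the paper operates during decoding itself), and all names are hypothetical.

def self_train(model, labeled_pairs, unlabeled_mrs, rounds=3):
    """Iteratively pseudo-label unlabeled meaning representations,
    keeping only outputs that pass a coverage check (sketch)."""
    data = list(labeled_pairs)                  # (meaning representation, text) pairs
    for _ in range(rounds):
        model.fit(data)
        for mr in unlabeled_mrs:
            text = model.generate(mr)           # vanilla decoding
            if covers_all_slots(mr, text):      # stand-in for constrained decoding
                data.append((mr, text))
    return model

def covers_all_slots(mr, text):
    # hypothetical check: every attribute value in the MR is realized in the output
    return all(value.lower() in text.lower() for value in mr.slot_values())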

Cory Shain: working memory load

fMRI evidence of working memory load in naturalistic language processing
Abstract: Working memory plays a critical role in prominent theories of human incremental language processing. Because the complete parse cannot be recognized from a partial string, working memory is thought to be used to store and update parse fragments. Although constructed-stimulus experiments have produced evidence for this hypothesis, these findings have failed to generalize to naturalistic settings. In addition, the language-specificity of any such memory systems is unknown. In this study, we explore a rich set of theory-driven memory costs as predictors of human brain responses (fMRI) to naturalistic story listening, using participant-specific functional localization to identify a language-responsive network and a “multiple demand” network thought to support domain-general working memory. Results show effects of memory costs as postulated by dependency locality theory, but only in the language network. We argue that working memory is indeed involved in core language comprehension processes, but that the memory resources used are housed in the language system.
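
For readers unfamiliar with dependency locality theory (DLT), here is a much-simplified worked example of the kind of word-by-word memory cost that can serve as a regressor against brain responses; the cost function below only approximates Gibson’s integration cost, and the sentence annotation is hand-made for illustration.

def dlt_integration_cost(heads, is_referent):
    """Simplified DLT integration cost: for each word, count the new discourse
    referents (here, nouns and verbs) intervening between it and each of its
    left dependents, plus one if the word itself introduces a referent."""
    costs = []
    for i in range(len(heads)):
        cost = 1 if is_referent[i] else 0
        for j in range(i):                     # left dependents of word i
            if heads[j] == i:
                cost += sum(1 for k in range(j + 1, i) if is_referent[k])
        costs.append(cost)
    return costs

# toy example: "the reporter who the senator attacked admitted the error"
heads = [1, 6, 5, 4, 5, 1, -1, 8, 6]           # 0-based head indices, -1 = root
is_referent = [False, True, False, False, True, True, True, False, True]
print(dlt_integration_cost(heads, is_referent))  # peaks at "attacked" and "admitted"

Predictors of this general sort can then be regressed against BOLD responses in the functionally localized networks.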

3/23: Pranav on stylometry for darknet migrant identification

Title: Stylometry with Structure and Multitask Learning: Implications for Darknet Forum Migrant Analysis

Abstract: Vendors’ trustworthiness on darknet markets is associated with an anonymous identity. Both buyers and vendors, especially influential ones, tend to migrate to new markets when a previously used market shuts down. A better understanding of the signaling strategies used by darknet market vendors to establish trustworthiness in their products requires linking users’ identities as they migrate across darknet forums. We develop a stylometry-based multitask learning approach for natural language and interaction modeling, using graph embeddings to construct low-dimensional representations of short episodes of user activity for authorship attribution. We provide a comprehensive evaluation of our methods across four different darknet forums, demonstrating their efficacy over the state-of-the-art, with a lift of up to 2.5x on Mean Retrieval Rank and 2x on Recall@10.
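
For reference, the two retrieval metrics can be computed as in the sketch below, given embedding vectors for user episodes; the embedding model itself (the stylometry and graph part) is assumed and not shown.

import numpy as np

def retrieval_metrics(query_vecs, query_ids, index_vecs, index_ids, k=10):
    """Rank candidate accounts by cosine similarity and report Mean Retrieval
    Rank and Recall@k for linking the same author across forums (sketch).
    Assumes every query author has at least one account in the index."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = q @ d.T                                      # (n_queries, n_candidates)
    ranks = []
    for i, qid in enumerate(query_ids):
        order = np.argsort(-sims[i])                    # best match first
        gold = [r for r, j in enumerate(order) if index_ids[j] == qid]
        ranks.append(gold[0] + 1)                       # 1-based rank of the true author
    ranks = np.array(ranks)
    return {"mean_retrieval_rank": float(ranks.mean()),
            f"recall_at_{k}": float((ranks <= k).mean())}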

3/2: Willy leads discussion on the Arrau corpus

Annotating a broad range of anaphoric phenomena, in a variety of genres: the ARRAU Corpus

Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Francesca Delogu, Kepa J. Rodriguez, Massimo Poesio

This paper presents the second release of ARRAU, a multi-genre corpus of anaphoric information created over ten years to provide data for the next generation of coreference/anaphora resolution systems that combine different types of linguistic and world knowledge with advanced discourse modeling and support rich linguistic annotations. The distinguishing features of ARRAU include: treating all NPs as markables, including non-referring NPs, and annotating their (non-)referentiality status; distinguishing between several categories of non-referentiality and annotating non-anaphoric mentions; thorough annotation of markable boundaries (minimal/maximal spans, discontinuous markables); annotating a variety of mention attributes, ranging from morphosyntactic parameters to semantic category; annotating the genericity status of mentions; annotating a wide range of anaphoric relations, including bridging relations and discourse deixis; and, finally, annotating anaphoric ambiguity. The current version of the dataset contains 350K tokens and is publicly available from LDC. In this paper, we discuss in detail all the distinguishing features of the corpus, so far only partially presented in a number of conference and workshop papers, and we discuss the development between the first release of ARRAU in 2008 and this second one.
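
Purely as a mental model of what one record in such a corpus carries, a markable annotated along the dimensions listed above might be represented roughly like this; the field names are illustrative and do not correspond to ARRAU’s actual annotation format.

from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Markable:
    """Rough stand-in for one ARRAU-style markable (fields are illustrative)."""
    min_span: Tuple[int, int]                   # minimal token span
    max_span: Tuple[int, int]                   # maximal token span
    discontinuous_spans: List[Tuple[int, int]] = field(default_factory=list)
    referential: bool = True                    # non-referring NPs are annotated too
    nonref_category: Optional[str] = None       # kind of non-referentiality
    semantic_category: Optional[str] = None     # morphosyntactic/semantic attributes
    generic: Optional[bool] = None              # genericity status
    antecedents: List[int] = field(default_factory=list)  # markable ids; several = ambiguity
    relation: Optional[str] = None              # identity, bridging, discourse deixis, ...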

2/16: Ahmad Aljanaideh leads discussion of “Context in informational bias detection”

Context in Informational Bias Detection

Esther van den Berg, Katja Markert

Informational bias is bias conveyed through sentences or clauses that provide tangential, speculative or background information that can sway readers’ opinions towards entities. By nature, informational bias is context-dependent, but previous work on informational bias detection has not explored the role of context beyond the sentence. In this paper, we explore four kinds of context for informational bias in English news articles: neighboring sentences, the full article, articles on the same event from other news publishers, and articles from the same domain (but potentially different events). We find that integrating event context improves classification performance over a very strong baseline. In addition, we perform the first error analysis of models on this task. We find that the best-performing context-inclusive model outperforms the baseline on longer sentences, and sentences from politically centrist articles.
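
As a rough sketch of what “integrating event context” can mean in practice (not necessarily the paper’s architecture), the target sentence can be paired with context text in a single transformer classifier input:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def build_input(target_sentence, context_sentences, max_length=512):
    """Pack the target sentence plus context into one classifier input (sketch);
    the separator-joined context is a simplification of context-inclusive models."""
    context = " ".join(context_sentences)
    return tokenizer(target_sentence, context,
                     truncation="only_second", max_length=max_length,
                     return_tensors="pt")

# hypothetical usage: context drawn from other outlets' coverage of the same event
enc = build_input("The senator, who has long courted controversy, spoke briefly.",
                  ["Coverage of the same event from another publisher goes here."])

The four context types from the abstract differ only in where context_sentences come from: neighboring sentences, the full article, same-event articles, or same-domain articles.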

2/9: Sara Court leads discussion on Moeller et al.’s “Improving Low-Resource Morphological Learning with Intermediate Forms from Finite State Transducers”

https://journals.colorado.edu/index.php/computel/article/view/427

Neural encoder-decoder models are usually applied to morphology learning as an end-to-end process, without considering the underlying phonological representations that linguists posit as abstract forms before morphophonological rules are applied. Finite State Transducers for morphology, on the other hand, are developed to contain these underlying forms as an intermediate representation. This paper shows that training a bidirectional two-step encoder-decoder model of Arapaho verbs to learn two separate mappings, one between tags and abstract morphemes and one between morphemes and surface allomorphs, improves results when training data is limited to 10,000 to 30,000 examples of inflected word forms.
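
At inference time, the two-step setup amounts to composing two learned mappings, as in the sketch below; both model objects are hypothetical seq2seq networks trained on the respective halves of the FST-derived data, and the example call is schematic rather than real Arapaho.

def inflect(lemma, tags, tag2morpheme_model, morpheme2surface_model):
    """Two-step inflection (sketch): first map lemma plus tags to an underlying
    morpheme sequence (the FST-style intermediate form), then map that
    sequence to its surface allomorphs."""
    underlying = tag2morpheme_model.generate(f"{lemma} {' '.join(tags)}")
    surface = morpheme2surface_model.generate(underlying)
    return underlying, surface

# schematic usage:
# underlying, surface = inflect("walk", ["V", "PST"], step1_model, step2_model)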