Neural Methodius Revisited: Do Discourse Relations Help with Pre-Trained Models Too?
Aleksandre Maskharashvili, Symon Stevens-Guille, Xintong Li, Michael White
Recent developments in natural language generation (NLG) have bolstered arguments in favor of re-introducing explicit coding of discourse relations in the input to neural models. In the Methodius corpus, a meaning representation (MR) is hierarchically structured and includes discourse relations. Meanwhile, pre-trained language models have been shown to implicitly encode rich linguistic knowledge, which provides an excellent resource for NLG. Synthesizing these lines of research, we conduct extensive experiments on the benefits of using pre-trained models and discourse relation information in MRs, focusing on the improvement of discourse coherence and correctness. We redesign the Methodius corpus; we also construct another Methodius corpus in which MRs are not hierarchically structured but flat. We report experiments on different versions of the corpora, which probe when, where, and how pre-trained models benefit from MRs with discourse relation information in them. We conclude that discourse relations significantly improve NLG when data is limited.
Neural inflection and rating with analogical candidates (joint w/Andrea Sims)
Abstract: Recent research on computational inflection prediction leads to a frustrating quandary. On the one hand, neural sequence-to-sequence models (Kann and Schuetze, 2016) provide steadily improving state-of-the-art performance in predicting the inflectional forms of real words, outperforming a variety of non-neural models proposed in previous work (Nicolai et al., 2016). On the other, a series of experiments reveals their inadequacy in predicting the acceptability ratings of “wug” nonce words (Corkery et al., 2019). Like other neural models, these systems sometimes learn brittle generalizations which differ from human cognition and fail badly on out-of-sample data (Dankers et al., 2021). We present a neural system which aims to obtain the best of both worlds: state-of-the-art inflection prediction performance, and the ability to rate a wide variety of plausible forms for a given input in a human-like way. We show that, unlike many pre-neural models, the system is capable of generalizing across classes of related inflectional changes, leading to new testable hypotheses about the mental representation of inflectional paradigms.
Do languages differ in semantic transparency of derived words? Using word vectors to explore English and Russian
This study explores whether the semantic relationship of derived words to their bases is similarly sensitive to word frequency in English and Russian. High-frequency derived words are thought to be memorized by speakers, rather than being parsed into constituents. As a result, such words may become semantically opaque, implying that frequent words have lower average transparency. We investigated whether distributional differences of English and Russian derivational suffixes translate into differences in semantic transparency, using cosine similarity of word vectors. Our results show a positive correlation between derived word frequency and semantic transparency, contrary to expectations. This may reflect suffix-specific effects.
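As a minimal sketch of the transparency measure mentioned above, the cosine similarity between a base and its derived word can be computed as follows. The vectors and the `transparency` helper are invented for illustration only; the study's actual embeddings, word lists, and preprocessing are not specified here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors, made up for this example (not real embeddings).
toy_vectors = {
    "teach": [0.9, 0.1, 0.2],
    "teacher": [0.8, 0.2, 0.3],
    "busy": [0.7, 0.1, 0.1],
    "business": [0.1, 0.9, 0.4],
}


def transparency(base, derived, vectors):
    """Hypothetical helper: base-derivative similarity as a transparency proxy."""
    return cosine_similarity(vectors[base], vectors[derived])


transparent = transparency("teach", "teacher", toy_vectors)
opaque = transparency("busy", "business", toy_vectors)
print(f"teach/teacher: {transparent:.3f}  busy/business: {opaque:.3f}")
```

Under this proxy, a higher score is read as a more transparent base-derivative relationship; per-word scores of this kind could then be correlated with derived-word frequency.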
With the recent explosion of, and hype around, deep learning, linguists within the NLP community have used carefully constructed linguistic examples for targeted assessment of models' linguistic capabilities, to see what models really know and where they fall short. In the spirit of these studies, my project investigates neural network behavior on a linguistic phenomenon that has not received much attention: cataphora (i.e., when a referring expression such as a pronoun precedes its antecedent). I investigate the behavior of two models on cataphora: WebNLG (a model trained for NLG as described in Li et al. 2020, based on the pretrained T5 model of Raffel et al. 2019) and the Joshi model (a fine-tuned model for coreference resolution described in Joshi et al. 2019, based on the pretrained BERT model of Devlin et al. 2019). The general idea is to test whether these models can distinguish acceptable from unacceptable examples involving cataphora. Factors I will investigate include (1) preposed (i.e., fronted) vs. postposed clauses; (2) cataphora across subordination vs. coordination of clauses; and (3) a special case of pragmatic subordination with contrastive “but”.
Ash Lewis and Lingbo Mo will present an update on their work, beginning with a paper called Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction. Since they last presented, they have conducted further experiments and begun planning for a “real user” study. They will also share their thoughts on potential future work for feedback. An abstract of the paper can be found below.
Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction
Existing studies on semantic parsing focus on mapping a natural-language utterance to a logical form (LF) in one turn. However, because natural language may contain ambiguity and variability, this is a difficult challenge. In this work, we investigate an interactive semantic parsing framework that explains the predicted LF step by step in natural language and enables the user to make corrections through natural-language feedback for individual steps. We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework, aiming to increase the transparency of the parsing process and help the user trust the final answer. We construct INSPIRED, a crowdsourced dialogue dataset derived from the ComplexWebQuestions dataset. Our experiments show that this framework has the potential to greatly improve overall parse accuracy. Furthermore, we develop a pipeline for dialogue simulation to evaluate our framework w.r.t. a variety of state-of-the-art KBQA models without further crowdsourcing effort. The results demonstrate that our framework promises to be effective across such models.
Enriching Linguistic Analyses by Modelling Neutral and Controversial Items
Typically, linguistic analyses are performed over datasets composed of text items where each item is assigned a category that represents a phenomenon. This category is obtained by combining multiple human annotations. Items considered for analyses are often those which exhibit a clear polarizing phenomenon (e.g. either polite or impolite). However, language can sometimes exhibit none of those phenomena (neither polite nor impolite) or a combination of phenomena (e.g. polite and impolite). This is evident in NLU datasets as they contain a significant number of items on which annotators disagreed, or agreed that they do not exhibit any phenomenon. The goal is to discover linguistic patterns associated with those items. This helps in further enriching linguistic analyses by providing insight into how language could be interpreted by different listeners.
SYSML: StYlometry with Structure and Multitask Learning: Implications for Darknet Forum Migrant Analysis
Darknet market forums are frequently used to exchange illegal goods and services between parties who use encryption to conceal their identities. These markets are hosted on the Tor network, which guarantees additional anonymization against IP and location tracking, making it challenging to link malicious users who operate multiple accounts (sybils). Additionally, users migrate to new forums when one is closed, making it difficult to link users across multiple forums. We develop a novel stylometry-based multitask learning approach for natural language and interaction modeling, using graph embeddings to construct low-dimensional representations of short episodes of user activity for authorship attribution. We provide a comprehensive evaluation of our methods across four different darknet forums, demonstrating their efficacy over the state of the art, with a lift of up to 2.5X on Mean Retrieval Rank and 2X on Recall@10.
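To make the Recall@10 figure concrete, here is a minimal sketch of Recall@k as it is commonly defined for retrieval-style attribution: the fraction of queries whose true author appears among the top k ranked candidates. This definition, the function name, and the toy account IDs are assumptions for illustration; the abstract does not spell out the exact evaluation protocol.

```python
def recall_at_k(rankings, truths, k=10):
    """Fraction of queries whose true ID appears in the top-k candidates.

    rankings: one ranked candidate list per query, best match first.
    truths:   the correct candidate ID for each query.
    """
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not truths:
        raise ValueError("at least one query is required")
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


# Toy example with invented account IDs: two queries, three candidates each.
toy_rankings = [["u7", "u2", "u9"], ["u4", "u1", "u3"]]
toy_truths = ["u2", "u3"]

print(recall_at_k(toy_rankings, toy_truths, k=2))  # only the first query hits
print(recall_at_k(toy_rankings, toy_truths, k=3))  # both queries hit
```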
Modeling Plural Inflection Class Structure in Maltese
Theoretical and typological research in morphology defines an inflectional paradigm as the collection of related word forms associated with a given lexeme. When multiple lexemes share the same paradigm, they in turn define an inflection class. Recent work in morphology uses information theory to quantify the complexity of a language’s inflectional system in terms of interpredictability across word forms and paradigms. These studies provide precise synchronic descriptions of inflectional structure, but are unable to account for how or why these systems emerge in language-specific ways. I’ll be presenting ongoing research for my QP1 that addresses this question by modeling the relative influence of three factors (phonological form, semantic meaning, and etymological origin) on the organization of plural inflection classes in Maltese.
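One common way to express interpredictability in information-theoretic terms is conditional entropy: how uncertain one property (say, plural class) remains once another (say, a property of the singular) is known. The sketch below assumes that formulation; the category labels and counts are toy values invented for illustration, not Maltese data, and the talk's actual model may differ.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """H(Y | X) in bits, estimated from a list of (x, y) observations."""
    if not pairs:
        raise ValueError("at least one observation is required")
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (x, y), count in joint.items():
        p_xy = count / total                # joint probability p(x, y)
        p_y_given_x = count / marginal_x[x] # conditional probability p(y | x)
        entropy -= p_xy * math.log2(p_y_given_x)
    return entropy


# Toy observations: (hypothetical singular shape, hypothetical plural class).
predictable = [("shapeA", "class1")] * 4 + [("shapeB", "class2")] * 4
unpredictable = [("shapeA", "class1"), ("shapeA", "class2")] * 4

print(conditional_entropy(predictable))    # shape fully determines class
print(conditional_entropy(unpredictable))  # shape tells us nothing about class
```

Lower values mean the conditioning factor is a better predictor of plural class, which is one way the relative influence of different factors could be compared.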
Are syntactic categories real?
People can express novel, precise, complex ideas (plans with sophisticated contingencies, predictive models of interrelated uncertain events, and more), which seems to suggest a formal, compositional semantics in which sentences are divided into categories with associated semantic functions. But state-of-the-art NLP systems, transformers like BERT and GPT-3, don’t work like that. This talk will review evidence about syntactic categories from sentence processing experiments and grammar induction simulations conducted over the past few years in the OSU computational cognitive modeling lab, and hazard some guesses about the cognitive status of syntactic categories.
Title: Semi-Supervised Heterogeneous Feature Learning in a Large-Scale Conversational AI System
Abstract: This paper aims to improve an important downstream component of a large-scale industrial conversational AI system. The component is called the Skill Routing Component (SRC) and is responsible for a variety of tasks. As the last component before executing user requests, SRC utilizes many textual and symbolic features obtained from heterogeneous upstream components like automatic speech recognition (ASR) and natural language understanding (NLU), which necessitates an efficient way to utilize these features. To achieve this, we propose a unified transformer model which, in contrast to traditional methods, encodes the heterogeneous features into a shared latent space. In addition, there is an inherent connection between SRC tasks and upstream NLU tasks. We utilize noisy NLU data to pre-train the unified SRC model via specifically curated objectives and fine-tune it separately on the different SRC tasks. Our method shows an average improvement of 1.8% on four SRC tasks over the state-of-the-art baseline.