Evaluating state-of-the-art models of speaker commitment
When a speaker, Mary, utters “John did not discover that Bill lied”, we take Mary to be committed to Bill having lied, whereas in “John didn’t say that Bill lied”, we do not take that she is. Extracting such inferences arising from speaker commitment (aka event factuality) is crucial for information extraction and question answering. In this talk, we evaluate the state-of-the-art models for speaker commitment and natural language inference on the CommitmentBank, an English dataset of naturally occurring discourses, annotated with speaker commitment towards the content of the complement (“lied” in the example) of clause-embedding verbs (“discover”, “say”) under four entailment-canceling environment (negation, conditional, question, and modal). The CommitmentBank thus focuses on specific linguistic constructions and can be viewed as containing “adversarial” examples for speaker commitment models. We perform a detailed error analysis of the models’ outputs by breaking down items into classes according to various linguistic features. We show that these models can achieve good performance on certain classes of items, but fail to generalize to the diverse linguistic constructions that are present in natural language, highlighting directions for improvement.
Prediction is All You Need: A Large-Scale Study of the Effects of Word Frequency and Predictability in Naturalistic Reading
A number of psycholinguistic studies have factorially manipulated words’ contextual predictabilities and corpus frequencies and shown separable effects of each on measures of human sentence processing, a pattern which has been used to support distinct processing effects of prediction on the one hand and strength of memory representation on the other. This paper examines the generalizability of this finding to more realistic conditions of sentence processing by studying effects of frequency and predictability in three large-scale naturalistic reading corpora. Results show significant effects of word frequency and predictability in isolation but no effect of frequency over and above predictability, and thus do not provide evidence of distinct effects. The non-replication of separable effects in a naturalistic setting raises doubts about the existence of such a distinction in everyday sentence comprehension. Instead, these results are consistent with previous claims that apparent effects of frequency are underlyingly effects of predictability.
Improving classification of speech transcripts
Off-the-shelf speech recognition systems can yield useful results and accelerate application development, but general-purpose systems applied to specialized domains can introduce acoustically small–but semantically catastrophic–errors. Furthermore, sufficient audio data may not be available to develop custom acoustic models for niche tasks. To address these problems, we propose a concept to improve performance in text classification tasks that use speech transcripts as input, without any in-domain audio data. Our method augments available typewritten text training data with inferred phonetic information so that the classifier will learn semantically important acoustic regularities, making it more robust to transcription errors from the general purpose ASR. We successfully pilot our method in a speech-based virtual patient used for medical training, recovering up to 62% of errors incurred by feeding a small test set of speech transcripts to a classification model trained on typescript.