I’ll be presenting next Tuesday on incremental coreference as it relates to linguistic and psycholinguistic accuracy. Specifically, I’ll first discuss some human reading time results from coreference-based predictors, and reasons to think humans are processing coreference in an online way. The second part will cover ongoing work to add coreference prediction to an existing incremental left-corner parser, and give a sketch of linguistic and future psycholinguistic evaluation using such a parser.
Depth-bounding a grammar has been a popular technique for applying cognitively motivated restrictions to grammar induction algorithms in order to limit the search space of possible grammars. In this talk I will introduce two Bayesian models for depth-bounded induction of probabilistic context-free grammars (PCFGs) from raw text. Both first depth-bound a normal PCFG and then sample trees using the depth-bounded PCFG, but with different sampling algorithms. Several analyses show that depth-bounding is indeed effective in limiting the search space of the inducer. Results are also presented for successful unbounded PCFG induction with minimal constraints, which has usually been thought to be very difficult. Parsing results on three different languages show that our models produce parse trees better than or competitive with state-of-the-art constituency grammar induction models in terms of parsing accuracy.
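To make the notion of depth concrete, here is a minimal sketch (my own illustration, not the inducer described in the talk) of the quantity a depth bound caps: the degree of center embedding of a binary tree, which corresponds to the number of memory elements a left-corner parser needs. Trees are nested tuples with string leaves; the function name and encoding are assumptions for this example.

```python
def ce_depth(tree, has_left=False, has_right=False):
    """Degree of center embedding of a binary tree (nested tuples,
    string leaves). A constituent adds a level of depth when material
    from the surrounding context remains on both sides of it; purely
    left- or right-branching structures add no depth."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    if has_left and has_right:
        # this constituent is center-embedded: one more memory level,
        # and its children start a fresh embedding context
        return 1 + max(ce_depth(left, False, True),
                       ce_depth(right, True, False))
    # otherwise propagate what lies to each side of the children
    return max(ce_depth(left, has_left, True),
               ce_depth(right, True, has_right))

# right-branching: no center embedding
print(ce_depth(('a', ('b', ('c', 'd')))))        # 0
# one level of center embedding: (b c) has material on both sides
print(ce_depth(('a', (('b', 'c'), 'd'))))        # 1
```

A depth-bounded inducer would only consider trees with `ce_depth` at or below a small constant (depths above 3 or 4 are vanishingly rare in attested human language).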
This talk proposes deconvolutional time series regression (DTSR), a general-purpose regression technique for modeling sequential data in which effects can reasonably be assumed to be temporally diffuse, and applies it to discover temporal structure in three existing psycholinguistic datasets. DTSR borrows from digital signal processing by recasting time series modeling as temporal deconvolution. It thus learns latent impulse response functions (IRFs) that mediate the temporal relationship between two signals: the independent variable(s) on the one hand and the dependent variable on the other. Synthetic experiments show that DTSR successfully recovers the true latent IRFs, and psycholinguistic experiments demonstrate (1) important patterns of temporal diffusion that have not previously been quantified in psycholinguistic reading time experiments, (2) the ability to provide evidence for the absence of temporal diffusion, and (3) comparable (or in some cases substantially improved) prediction quality relative to more heavily parameterized statistical models. DTSR can thus be used to detect the existence of temporal diffusion and, when it exists, to determine data-driven impulse response functions to control for it. This suggests that DTSR can be an important component of any analysis pipeline for time series.
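The core idea can be sketched in a few lines: events convolved with a parametric IRF predict the response, and fitting the IRF parameters to data amounts to deconvolution. The sketch below is a simplification of my own, assuming an exponential IRF and a grid search; the actual DTSR model fits richer parametric IRF families by gradient-based optimization.

```python
import numpy as np

def convolve_irf(event_times, event_vals, probe_times, decay):
    """Predicted response at probe_times: each event contributes
    val * exp(-decay * dt) for dt >= 0 (an exponential IRF)."""
    dt = probe_times[:, None] - event_times[None, :]
    kernel = np.where(dt >= 0, np.exp(-decay * dt), 0.0)
    return kernel @ event_vals

# synthetic data generated with a known IRF decay rate of 2.0
rng = np.random.default_rng(0)
t_events = np.sort(rng.uniform(0, 10, 40))   # word-onset-like event times
x = rng.normal(size=40)                      # per-event predictor values
t_probe = np.linspace(0, 10, 200)            # response measurement times
y = convolve_irf(t_events, x, t_probe, decay=2.0)

# "deconvolution": recover the latent IRF parameter from the data
grid = np.linspace(0.5, 4.0, 36)
errs = [np.sum((convolve_irf(t_events, x, t_probe, d) - y) ** 2)
        for d in grid]
best = float(grid[int(np.argmin(errs))])     # recovers ~2.0
```

On this noise-free synthetic data the true decay rate is recovered exactly, mirroring (in miniature) the synthetic-recovery experiments mentioned above.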
Evaluation Order Effects in Dynamic Continuized CCG:
From Negative Polarity Items to Balanced Punctuation
Combinatory Categorial Grammar’s (CCG; Steedman, 2000) flexible treatment of word order and constituency enables it to employ a compact lexicon, an important factor in its successful application to a range of NLP problems. However, its word order flexibility can be problematic for linguistic phenomena where linear order plays a key role. In this talk, I’ll show that the enhanced control over evaluation order afforded by Continuized CCG (Barker & Shan, 2014) makes it possible to formulate improved analyses of negative polarity items and balanced punctuation, and discuss their implementation as a refinement to a prototype parser for Dynamic Continuized CCG (White et al., 2017).
Hypertagging, or supertagging for realization, is the process of assigning CCG tags to predicates. Previous work has shown that it significantly increases realization speed and quality by reducing the search space of the realizer. This project seeks to improve on the current OpenCCG hypertagger, which uses a two-stage maximum entropy algorithm and reaches a dev accuracy of 95.1%. In this talk, I will present the results of various experiments using an LSTM hypertagger with different logical form linearization schemes. The performance with a pre-order linearization scheme falls slightly below that of the current OpenCCG hypertagger, but the oracle linearization suggests that with a more English-like linearization, hypertagging with an LSTM is a promising way forward.
Word representations are a key technology in the NLP toolbox, but extending their success into representations of phrases and knowledge base entities has proven challenging. In this talk, I will present a method for jointly learning embeddings of words, phrases, and entities from unannotated text, using only a list of mappings between entities and surface forms. I compare these against prior methods that have relied on explicitly annotated text or the rich structure of knowledge graphs, and show that our learned embeddings better capture similarity and relatedness judgments and some relational domain knowledge.
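One plausible way to realize such a shared vocabulary is to preprocess the corpus so that surface forms of entities become single tokens before embedding training; the sketch below is my own illustration of that step (function name and greedy longest-match strategy are assumptions, not necessarily the method from the talk), assuming the mapping list is a dict from lowercased surface forms to entity identifiers.

```python
def entity_tokenize(tokens, surface_to_entity, max_len=4):
    """Greedy longest-match replacement: multi-word surface forms
    become single entity tokens, so words, phrases, and entities can
    share one embedding vocabulary downstream."""
    out, i = [], 0
    while i < len(tokens):
        # try the longest span starting at i first
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n]).lower()
            if span in surface_to_entity:
                out.append(surface_to_entity[span])
                i += n
                break
        else:
            out.append(tokens[i])  # no entity match: keep the word
            i += 1
    return out

mapping = {"new york": "ENT:NYC", "new york times": "ENT:NYT"}
entity_tokenize("The New York Times reported".split(), mapping)
# -> ['The', 'ENT:NYT', 'reported']
```

The resulting token stream could then be fed to any word-embedding trainer, giving words and entities vectors in the same space.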
I will also discuss experiments on augmenting the embedding model to learn soft entity disambiguation from contexts, and using member words to augment the learning of phrases. These additions harm model performance on some evaluations, and I will show some preliminary analysis of why the specific modeling approach for these ideas may not be the right one. I hope to brainstorm ideas on how to better model joint phrase-word learning and contextual disambiguation, as part of ongoing work.
Virtual patients are an effective, cost-efficient tool for training medical professionals to interview patients in a standardized environment. Technological limitations have thus far restricted these tools to typewritten interactions; however, as speech recognition systems have improved, full-scale deployment of a spoken dialogue system for this purpose has edged into the range of feasibility. To build the best such system possible, we propose to take advantage of work done to improve the functioning of virtual patients in the typewritten domain. Specifically, our approach is to noisily map spoken utterances into text using off-the-shelf speech recognition, whereupon the text can be used to train existing question classification architectures. We expect that phoneme-based CNNs may mitigate recognition errors in the same way that character-based CNNs mitigate, e.g., spelling errors in the typewritten domain. In this talk I will present the architecture of the system being developed to collect speech data, the experimental design, and some baseline results.
Automatic paraphrasing with lexical substitution
Generating automatic paraphrases via lexical substitution is a difficult task, but it can usefully supplement data in domain-specific machine learning tasks. The Virtual Patient Project is a prime example of this problem: we have limited domain-specific training data but need to accurately identify a user’s intended question, an example of which we may have seen only once. In this talk, I will present the progress Amad Hussein, Michael White, and I have made in automatically generating paraphrases using unsupervised lexical substitution with WordNet, word embeddings, and the Paraphrase Database. Although our oracle accuracy in automatically classifying question types is currently only moderately above our baseline, the gains are modest but significant and give an estimate of what can be accomplished with human filtering. We propose future work in this direction that utilizes machine translation and phrase-level substitution.
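The candidate-generation step can be sketched very simply: swap individual words for substitutes drawn from a lexicon. In the sketch below (my own illustration) a toy dictionary stands in for WordNet, PPDB, or embedding nearest neighbors; a real pipeline would then score and filter these candidates, e.g. with the question classifier or by human review.

```python
def substitutions(tokens, lexicon):
    """Yield paraphrase candidates by swapping one word at a time for a
    substitute from the lexicon (toy stand-in for WordNet / PPDB /
    embedding neighbors)."""
    for i, tok in enumerate(tokens):
        for alt in lexicon.get(tok.lower(), []):
            yield tokens[:i] + [alt] + tokens[i + 1:]

# hypothetical lexicon for illustration only
lexicon = {"hurt": ["ache"], "stomach": ["belly", "tummy"]}
candidates = list(substitutions("does your stomach hurt".split(), lexicon))
# three single-word-swap paraphrases of the input question
```

Phrase-level substitution (proposed above as future work) would extend the same loop from single tokens to multi-word spans.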
For the task of speech enhancement, local learning objectives are agnostic to phonetic structures helpful for speech recognition. We propose to add a global criterion to speech enhancement that allows the model to learn these high-level abstractions. We first train a spectral classifier on clean speech to predict senone labels. Then, the spectral classifier is joined with our speech enhancer as a noisy speech recognizer. This model is taught to mimic the output of the spectral classifier alone on clean speech. This mimic loss is combined with the traditional local criterion to train the speech enhancer to produce de-noised speech. Feeding the de-noised speech to an off-the-shelf Kaldi training recipe for the CHiME-2 corpus shows significant improvements in Word Error Rate (WER).
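The combined objective can be written as a weighted sum of the local fidelity term and the mimic term. The sketch below is a minimal numpy illustration of that combination; the weighting parameter `alpha` and the use of mean squared error for the mimic term are assumptions for this example, not details confirmed by the talk.

```python
import numpy as np

def enhancer_loss(denoised, clean, student_logits, teacher_logits,
                  alpha=0.5):
    """Total loss for the speech enhancer:
    - local term: MSE between denoised and clean spectral features
    - mimic term: MSE between the frozen classifier's senone logits on
      denoised speech (student) vs. clean speech (teacher)
    alpha weights the mimic term (assumed hyperparameter)."""
    local = np.mean((denoised - clean) ** 2)
    mimic = np.mean((student_logits - teacher_logits) ** 2)
    return local + alpha * mimic
```

During training, gradients of this scalar flow only into the enhancer; the spectral classifier stays fixed, serving purely as a source of high-level phonetic supervision.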
Saccadic models for referring expression generation
Referring expression generation (REG) is the task of describing an object in a scene so that an observer can pick it out. We have many experimental results showing that REG is constrained by the sequential nature of human vision (that is, the human eye cannot take in the whole image at once, but must look from place to place, or saccade, to see more parts of the image clearly). Yet current neural network models for computer vision begin precisely by analyzing the entire image at once; thus, they cannot be used directly as models of the human REG algorithm. A recent model for computer vision (Mnih et al., 2014) has a limited field of vision and makes saccades around the image; I propose to adapt this model to the REG task and use it as a psycholinguistic model of human processing. I will present some background literature, a pilot model architecture, and results on some contrived tasks with synthetic data. I will discuss possible ways forward for the model and hope to get some interesting feedback from the group.
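The key architectural constraint, a limited field of view, can be illustrated with the glimpse-extraction step such models start from. The sketch below is my own minimal version (single-channel image, one resolution, zero padding); the actual recurrent attention model uses multi-resolution glimpses and learns where to saccade next.

```python
import numpy as np

def glimpse(image, center, size):
    """Extract a size x size patch centered at `center` from a 2-D
    image, zero-padding at the borders. `size` is assumed odd so the
    patch centers exactly on the requested pixel."""
    pad = size // 2
    padded = np.pad(image, pad)          # constant (zero) padding
    r, c = center
    # pixel (r, c) of the original sits at (r + pad, c + pad) in padded
    return padded[r:r + size, c:c + size]

img = np.arange(16, dtype=float).reshape(4, 4)
patch = glimpse(img, (2, 2), 3)          # 3x3 window around pixel (2, 2)
```

A saccadic REG model would condition each generation step on a sequence of such glimpses rather than on a whole-image encoding, bringing the model's information bottleneck closer to the human one.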