Clippers Tuesday: Evan Jaffe on Incremental Coreference

I’ll be presenting next Tuesday on incremental coreference as it relates to linguistic and psycholinguistic accuracy. Specifically, I’ll first discuss human reading-time results from coreference-based predictors, and reasons to think humans process coreference online. The second part will cover ongoing work to add coreference prediction to an existing incremental left-corner parser, and will sketch linguistic and future psycholinguistic evaluations using such a parser.
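
To make the notion of online coreference concrete, here is a minimal, purely illustrative Python sketch of incremental mention linking: each new mention is scored against the mentions seen so far, and an antecedent is chosen immediately rather than after the sentence is complete. The mention detector and antecedent scorer are hypothetical stand-ins, not the model under development, which integrates coreference into a left-corner parser.

```python
# Toy sketch of incremental (left-to-right) coreference linking.
# The mention test and scorer below are hypothetical placeholders.

def score_antecedent(mention, antecedent):
    """Toy scorer: reward exact string-matching antecedents."""
    return 1.0 if antecedent.lower() == mention.lower() else 0.0

def incremental_coref(tokens, is_mention):
    """Link each new mention to its best prior mention, word by word."""
    mentions, links = [], {}
    for i, tok in enumerate(tokens):
        if not is_mention(tok):
            continue
        if mentions:
            # Decide online: score all antecedent candidates seen so far.
            best = max(mentions, key=lambda m: score_antecedent(tok, m[1]))
            if score_antecedent(tok, best[1]) > 0:
                links[i] = best[0]
        mentions.append((i, tok))
    return links

tokens = "Mary saw John before Mary left".split()
print(incremental_coref(tokens, lambda t: t[0].isupper()))
# {4: 0} -- the second "Mary" links back to the first
```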

Clippers Tuesday: Lifeng Jin on Bayesian Grammar Induction

Depth-bounding a grammar is a popular technique for applying cognitively motivated restrictions to grammar induction, limiting the search space of possible grammars. In this talk I will introduce two Bayesian depth-bounded grammar induction models that learn probabilistic context-free grammars (PCFGs) from raw text. Both models first depth-bound a normal PCFG and then sample trees using the depth-bounded PCFG, but with different sampling algorithms. Several analyses show that depth-bounding is indeed effective in limiting the inducer’s search space. Results are also presented for successful unbounded PCFG induction with minimal constraints, which has usually been thought to be very difficult. Parsing results on three different languages show that our models produce parse trees better than or competitive with those of state-of-the-art constituency grammar induction models in terms of parsing accuracy.
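
As a rough illustration of what depth-bounding does, here is a toy Python sketch that samples trees from a small PCFG and rejects derivations exceeding a depth bound. This simplifies in two labeled ways: the models in the talk bound left-corner memory depth inside a Bayesian sampler, whereas this sketch uses plain tree depth and rejection sampling; and the grammar and probabilities are invented for illustration.

```python
# Toy sketch of depth-bounded tree sampling from a PCFG.
# Plain tree depth and rejection sampling stand in for the
# left-corner memory bound used by the actual models.

import random

# Toy PCFG in Chomsky normal form: nonterminal -> [(rhs, prob), ...]
PCFG = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("D", "N"), 0.7), (("NP", "PP"), 0.3)],
    "VP": [(("V", "NP"), 0.8), (("VP", "PP"), 0.2)],
    "PP": [(("P", "NP"), 1.0)],
    "D": [(("the",), 1.0)], "N": [(("dog",), 1.0)],
    "V": [(("saw",), 1.0)], "P": [(("with",), 1.0)],
}

def sample_tree(sym):
    if sym not in PCFG:                       # terminal symbol
        return sym
    rhs = random.choices(*zip(*PCFG[sym]))[0]  # pick an expansion by prob
    return (sym,) + tuple(sample_tree(c) for c in rhs)

def depth(tree):
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(c) for c in tree[1:])

def sample_bounded(sym, bound, tries=1000):
    """Rejection-sample a tree whose depth stays within the bound."""
    for _ in range(tries):
        t = sample_tree(sym)
        if depth(t) <= bound:
            return t
    return None

print(sample_bounded("S", bound=6))
```

Restricting the sampler this way discards the long tail of deeply embedded analyses, which is exactly the region of the search space that depth-bounding is meant to prune away.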

Clippers Tuesday: Cory Shain on Deconvolutional Time Series Regression

This talk proposes deconvolutional time series regression (DTSR), a general-purpose regression technique for modeling sequential data in which effects can reasonably be assumed to be temporally diffuse, and applies it to discover temporal structure in three existing psycholinguistic datasets. DTSR borrows from digital signal processing by recasting time series modeling as temporal deconvolution. It thus learns latent impulse response functions (IRFs) that mediate the temporal relationship between two signals: the independent variable(s) on the one hand and the dependent variable on the other. Synthetic experiments show that DTSR successfully recovers true latent IRFs, and psycholinguistic experiments demonstrate (1) important patterns of temporal diffusion that have not previously been quantified in psycholinguistic reading-time experiments, (2) the ability to provide evidence for the absence of temporal diffusion, and (3) comparable (or in some cases substantially improved) prediction quality relative to more heavily parameterized statistical models. DTSR can thus be used to detect temporal diffusion and, where it exists, to estimate data-driven impulse response functions that control for it. This suggests that DTSR can be an important component of any analysis pipeline for time series.
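
The core deconvolutional idea can be sketched in a few lines of NumPy: model the response at each time point as the sum of past predictor impulses weighted by a parametric IRF, then recover the IRF parameters from data. The exponential IRF and grid-search fit below are simplifications for illustration; the actual model supports richer parametric IRF kernels and fits them by gradient-based optimization.

```python
# Minimal sketch of the deconvolutional idea behind DTSR: the response
# is past predictor impulses convolved with a parametric IRF, and the
# IRF parameters are recovered from synthetic data by grid search.

import numpy as np

rng = np.random.default_rng(0)

# Irregularly spaced impulses (e.g., word onsets with predictor values).
times = np.sort(rng.uniform(0, 50, 200))
x = rng.normal(size=200)

def convolve(beta, tau, t_query):
    """Predicted response: past impulses weighted by an exponential IRF."""
    out = np.zeros_like(t_query)
    for i, t in enumerate(t_query):
        d = t - times
        mask = d >= 0                      # only past impulses contribute
        out[i] = np.sum(beta * x[mask] * np.exp(-d[mask] / tau))
    return out

# Synthetic ground truth: beta=2.0, tau=1.5, plus observation noise.
t_obs = np.sort(rng.uniform(0, 50, 150))
y = convolve(2.0, 1.5, t_obs) + rng.normal(scale=0.1, size=150)

# Deconvolution: search IRF parameters that minimize squared error.
betas, taus = np.linspace(0.5, 4, 15), np.linspace(0.5, 4, 15)
err = [(np.mean((y - convolve(b, k, t_obs)) ** 2), b, k)
       for b in betas for k in taus]
best = min(err)
print(f"recovered beta={best[1]:.2f}, tau={best[2]:.2f}")  # ~2.0, ~1.5
```

Note that the observation times need not align with the impulse times, which is what separates this setup from ordinary discrete-time convolution and makes it natural for reading-time data.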