Clippers Tuesday: Lifeng Jin on Bayesian Grammar Induction

Depth-bounding a grammar has been a popular technique for applying cognitively motivated restrictions to grammar induction algorithms in order to limit the search space of possible grammars. In this talk I will introduce two Bayesian depth-bounded grammar induction models for inducing probabilistic context-free grammars (PCFGs) from raw text. Both models first depth-bound a normal PCFG and then sample trees from the depth-bounded PCFG, but they differ in their sampling algorithms. Several analyses show that depth-bounding is indeed effective in limiting the search space of the inducer. Results are also presented for successful unbounded PCFG induction with minimal constraints, a task usually thought to be very difficult. Parsing results on three different languages show that our models produce parse trees competitive with or better than those of state-of-the-art constituency grammar induction models in terms of parsing accuracy.
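
To give a flavor of the depth-bounding idea, here is a minimal toy sketch in Python. Everything in it is an illustrative assumption: the grammar, rule probabilities, and function names are made up, and it caps raw tree depth with simple rejection sampling, whereas the models in the talk bound memory depth under a left-corner style analysis and induce the grammar itself with Bayesian sampling rather than fixing it in advance.

```python
import random

# Toy PCFG in Chomsky normal form: nonterminal -> list of (rhs, prob).
# rhs is a (B, C) pair of nonterminals or a one-element (word,) terminal.
# Grammar and probabilities are illustrative only, not from the talk.
PCFG = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("D", "N"), 0.7), (("dogs",), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("bark",), 0.4)],
    "D":  [(("the",), 1.0)],
    "N":  [(("dog",), 0.5), (("cat",), 0.5)],
    "V":  [(("sees",), 1.0)],
}

def sample_tree(sym, depth, max_depth):
    """Sample a tree top-down, failing once the depth bound is exceeded.

    Note: this caps raw tree depth for simplicity; the talk's models
    instead bound memory depth (center embedding), a more permissive
    and more cognitively motivated constraint.
    """
    if depth > max_depth:
        return None  # derivation would exceed the bound; reject it
    rules, probs = zip(*PCFG[sym])
    rhs = random.choices(rules, weights=probs, k=1)[0]
    if len(rhs) == 1:  # terminal rule: attach the word and stop
        return (sym, rhs[0])
    children = [sample_tree(child, depth + 1, max_depth) for child in rhs]
    if any(c is None for c in children):
        return None  # a subtree hit the bound, so the whole tree fails
    return (sym, *children)

def sample_bounded(max_depth=4, tries=100):
    """Rejection-sample until a tree within the depth bound is found."""
    for _ in range(tries):
        tree = sample_tree("S", 0, max_depth)
        if tree is not None:
            return tree
    return None

print(sample_bounded())
```

The key design point this toy illustrates is that bounding depth shrinks the space of derivations the inducer must consider: every sampled tree is guaranteed to fit within a fixed memory budget, which is the restriction the talk's two models impose (in a more principled way) before their respective tree-sampling algorithms are run.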