High frequency marker categories in grammar induction
High frequency marker words have been shown crucial in first language acquisition where they provide reliable clues for speech segmentation and grammatical categorization of words. Recent work in model selection of grammar induction has also hinted at a similar role played by high frequency marker words in distributionally inducing grammars. In this work, we first expand the notion of high frequency marker words to high frequency marker categories to include languages where grammatical relations between words are expressed by morphology, not word order. Through analysis of data from previous work and experiments with novel induction models, this work shows that high frequency marker categories are the main drive of accurate grammar induction.