Acquiring language from speech by learning to remember and predict
Classical accounts of child language learning invoke memory limits as a pressure to discover sparse, language-like representations of speech, while more recent proposals stress the importance of prediction for language learning. In this talk, I will describe a broad-coverage unsupervised neural network model designed to test memory and prediction as sources of signal by which children might acquire language directly from the perceptual stream. The model embodies several likely properties of real-time human cognition: it is strictly incremental, it encodes speech into hierarchically organized labeled segments, it allows interactive top-down and bottom-up information flow, it attempts to model its own sequence of latent representations, and its objective function recruits only local signals that are plausibly supported by human working memory capacity. Results show that much phonemic structure is learnable from unlabeled speech on the basis of these local signals. In addition, remembering the past and predicting the future each contribute independently to the linguistic content of the acquired representations.