Clippers 4/7: Cory Shain on Unsupervised Models of Human Language Acquisition

Title: An unsupervised discrete-state sequence model of human language acquisition from speech

Abstract: I will present a progress report on an ongoing attempt to apply discrete-state multi-scale recurrent neural networks as models of child language acquisition from speech. The model is inspired by prior arguments that abstract linguistic representations (e.g. phonemes and words) constrain the acoustic form of natural language utterances, and thus that attempting to efficiently store and anticipate auditory signals may emergently guide child learners to discover underlying linguistic structure. In this study, the artificial learner is a recurrent neural network arranged in interacting layers. Information exchange between adjacent layers is governed by binary detector neurons. When the detector neuron fires between two layers, those layers exchange their current analyses of the input signal in the form of discrete binary codes. Thus, in line with much existing linguistic theory, the model exploits both bottom-up and top-down signals to produce a representation of the input signal that is segmental, discrete, and featural. The learner adapts this behavior in service of four simultaneous unsupervised objectives: reconstructing the past, predicting the future, reconstructing the segment given a label, and reconstructing the label given a segment. Each layer treats the layer below as data, and thus learning is partially driven by attempting to model the learner’s own mental state, in line with influential hypotheses from cognitive neuroscience. The model solves a novel task (unsupervised joint segmentation and labeling of phonemes and words from speech), and it is therefore difficult to establish an overall state of the art performance threshold. However, results for the subtask of unsupervised word segmentation currently lag well behind the state of the art.