This Tuesday, Kasia Hitczenko will be visiting from the University of Maryland:
Using prosody to learn sound categories
Infants must learn the sound categories of their language, but this is difficult because variability in speech causes categories to overlap and obscures where the correct category boundaries lie. This work investigates whether incorporating knowledge of systematic sources of this variability can improve sound category learning. I present two models that incorporate one such source of variability, namely prosody, into two existing models of sound category learning, and present preliminary results on simulated data from one of these models.
This Tuesday, David King will be talking about his ongoing work on morphological reinflection:
In a recent shared task, neural machine translation systems performed well at reinflecting a variety of languages (e.g. German, Hungarian, and Turkish), but not Russian. I will present preliminary attempts to analyze where the top-performing neural machine translation model still fails with Russian. Since these shortcomings are primarily related to a word’s semantics and sound change (i.e. phonological alternation), I hope to overcome these challenges using Russian word vectors and an additional character-level language model.
This Tuesday, Adam Stiff will be talking about his efforts to take a dynamical systems-based approach to speech recognition (yes, via spiking networks):
Speech can be viewed as a dynamical system (i.e. a continuous function from a state space onto itself, with state changing continuously through time), and in very broad terms, this perspective should be fairly uncontroversial (indeed, it is often the basis for models of speech production). It is, however, extremely impractical to apply, due to the huge number of nonlinear variables involved and the apparent lack of a framework for learning them. Thus, the tools developed by mathematicians to understand nonlinear dynamical systems have not been widely used in attempts at automated speech recognition. I’ll argue that the brain does employ such techniques, and that adapting them could produce benefits in terms of energy efficiency, scalability, and robustness to catastrophic forgetting in the face of ongoing learning. Furthermore, observation of “fast” (sub-millisecond) dynamics may theoretically offer some benefits for recognition accuracy and act as a bottom-up factor in learning phone segmentation. I also hope to present some results from an ongoing phone classification experiment, to identify constraints that should be respected by a successful implementation of some of these ideas.