Evaluating the quality of children’s utterances in adult–child dialogue remains challenging. Common metrics such as Mean Length of Utterance, lexical diversity, and readability scores are heavily influenced by utterance length and do not capture how meaningfully a child’s response contributes to the conversation. This work, currently under review at EACL, draws on child developmental theory to propose a new approach that first classifies the type of the preceding adult utterance and then evaluates the child’s response along two dimensions: Expansion (contextual elaboration and reasoning depth) and Independence (how much the child advances the discourse on their own). These dimensions reflect core aspects of language development, reveal clear age-related trends, improve age prediction over traditional baselines, and show strong alignment with human judgments.
In this talk, I will present the method, the findings, and their implications for studying children’s conversational development. I will also discuss future research directions in multimodal speech and language modeling, including prosody-based state estimation and adaptive generation models that incorporate prosodic cues.