Tandem algorithm for voiced speech segregation (2010)

 

Sound Demo for the Tandem Algorithm for Speech Segregation

This page demonstrates the tandem algorithm proposed by G. Hu and D.L. Wang. For details of this algorithm see:

Hu G. and Wang D.L. (2010): A tandem algorithm for pitch estimation and voiced speech segregation. IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, pp. 2067-2079.

For the second set of demos, which uses a naturalistic utterance (containing both voiced and unvoiced speech), an unvoiced speech segregation algorithm is used in addition. This unvoiced segregation algorithm is described in:

Hu G. and Wang D.L. (2008): Segregation of unvoiced speech from nonspeech interference. Journal of the Acoustical Society of America, vol. 124, pp. 1306-1319.

 

Voiced speech segregation (for comparison with earlier systems click here)

 

Noise (mixture ID)         Mixture    Segregated target
Pure Tone (v3n0)           Audio      Audio
White Noise (v3n1)         Audio      Audio
Noise Burst (v3n2)         Audio      Audio
Cocktail Party (v3n3)      Audio      Audio
Rock Music (v3n4)          Audio      Audio
Siren (v3n5)               Audio      Audio
Trill Telephone (v3n6)     Audio      Audio
Female Speech (v3n7)       Audio      Audio
Male Speech (v3n8)         Audio      Audio
Female Speech (v3n9)       Audio      Audio

Naturalistic speech segregation (SNR = 0 dB)

Noise                         Mixture    Segregated speech
White Noise                   Audio      Audio
Rock Music                    Audio      Audio
Electric Fan                  Audio      Audio
Alarm Clock                   Audio      Audio
Bird Chirp with Water Flow    Audio      Audio
Wind Noise                    Audio      Audio
Rain                          Audio      Audio
Cocktail Party                Audio      Audio
Playground                    Audio      Audio
Crowd Noise                   Audio      Audio
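
The naturalistic mixtures above are listed at SNR = 0 dB, meaning the speech and the noise carry equal energy. This page does not say how the mixtures were generated, so the following is only a minimal sketch of one common way to mix two equal-length signals at a target SNR. The function names (mix_at_snr, measure_snr_db) and the toy signals are hypothetical and are not taken from the authors' code.

```python
import math


def measure_snr_db(speech, noise):
    """Return the speech-to-noise energy ratio in dB."""
    speech_energy = sum(s * s for s in speech)
    noise_energy = sum(n * n for n in noise)
    return 10 * math.log10(speech_energy / noise_energy)


def mix_at_snr(speech, noise, target_snr_db=0.0):
    """Scale `noise` so the mixture has the requested SNR, then add it to `speech`.

    Returns (mixture, scaled_noise). Both inputs are sequences of samples
    with the same length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_energy = sum(s * s for s in speech)
    noise_energy = sum(n * n for n in noise)
    if speech_energy == 0 or noise_energy == 0:
        raise ValueError("both signals must have nonzero energy")
    # SNR_dB = 10 * log10(E_speech / (gain^2 * E_noise)); solve for gain.
    gain = math.sqrt(speech_energy / (noise_energy * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


if __name__ == "__main__":
    # Toy stand-ins for real recordings: a 200 Hz tone and a deterministic
    # pseudo-noise sequence, both 0.1 s long at an assumed 16 kHz sample rate.
    speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(1600)]
    noise = [((i * 7919) % 101 - 50) / 50 for i in range(1600)]
    mixture, scaled_noise = mix_at_snr(speech, noise, target_snr_db=0.0)
    print("measured SNR (dB):", round(measure_snr_db(speech, scaled_noise), 6))
```

With a target of 0 dB, the gain simply equalizes the two energies, so the mixture is as much noise as speech.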