Sound Demo for the Tandem Algorithm for Speech Segregation
This page demonstrates the tandem algorithm proposed by G. Hu and D.L. Wang. For details of this algorithm see:
Hu G. and Wang D.L. (2010): A tandem algorithm for pitch estimation and voiced speech segregation. IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, pp. 2067-2079.
For the second set of demos, which uses a naturalistic utterance (including both voiced and unvoiced speech), an unvoiced speech segregation algorithm is additionally used. This unvoiced segregation algorithm is described in:
Hu G. and Wang D.L. (2008): Segregation of unvoiced speech from nonspeech interference. Journal of the Acoustical Society of America, vol. 124, pp. 1306-1319.
Voiced speech segregation (for comparison with earlier systems click here)
Noise (mixture ID) | Mixture | Segregated target |
Pure Tone (v3n0) | Audio | Audio |
White Noise (v3n1) | Audio | Audio |
Noise Burst (v3n2) | Audio | Audio |
Cocktail Party (v3n3) | Audio | Audio |
Rock Music (v3n4) | Audio | Audio |
Siren (v3n5) | Audio | Audio |
Trill Telephone (v3n6) | Audio | Audio |
Female Speech (v3n7) | Audio | Audio |
Male Speech (v3n8) | Audio | Audio |
Female Speech (v3n9) | Audio | Audio |
Naturalistic speech segregation (SNR = 0 dB)
Noise | Mixture | Segregated speech |
White Noise | Audio | Audio |
Rock Music | Audio | Audio |
Electric Fan | Audio | Audio |
Alarm Clock | Audio | Audio |
Bird Chirp with Water Flow | Audio | Audio |
Wind Noise | Audio | Audio |
Rain | Audio | Audio |
Cocktail Party | Audio | Audio |
Playground | Audio | Audio |
Crowd Noise | Audio | Audio |
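
The naturalistic mixtures above are listed at SNR = 0 dB, meaning the speech and the interference have equal average power. This page does not say how the mixtures were prepared, so the following is only a minimal sketch of the standard definition, SNR = 10 log10(P_speech / P_noise), and not the authors' procedure. The function names (mean_power, snr_db, mix_at_snr) and the synthetic signals are hypothetical and chosen for illustration.

```python
import math
import random


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def snr_db(target, noise):
    """SNR in dB: 10 * log10(target power / noise power)."""
    return 10.0 * math.log10(mean_power(target) / mean_power(noise))


def mix_at_snr(target, noise, desired_snr_db=0.0):
    """Scale the noise to reach desired_snr_db against the target, then add them.

    Returns the mixture and the scaled noise (both as lists of samples).
    """
    if len(target) != len(noise):
        raise ValueError("target and noise must have the same length")
    # The gain g satisfies P_target / (g^2 * P_noise) = 10^(SNR / 10).
    ratio = 10.0 ** (desired_snr_db / 10.0)
    gain = math.sqrt(mean_power(target) / (mean_power(noise) * ratio))
    scaled_noise = [gain * n for n in noise]
    mixture = [t + n for t, n in zip(target, scaled_noise)]
    return mixture, scaled_noise


# Synthetic stand-ins (assumptions, not the demo audio):
# a 200 Hz tone plays the role of speech, Gaussian noise the interference.
random.seed(0)
fs = 16000  # assumed sampling rate in Hz, one second of signal
target = [0.5 * math.sin(2 * math.pi * 200 * i / fs) for i in range(fs)]
noise = [random.gauss(0.0, 1.0) for _ in range(fs)]

mixture, scaled_noise = mix_at_snr(target, noise, desired_snr_db=0.0)
print("achieved SNR (dB):", abs(round(snr_db(target, scaled_noise), 6)))
```

Scaling the noise rather than the target keeps the speech level fixed across noise types, which makes mixtures easier to compare by ear. Scaling the target instead would give the same SNR.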