Reading (due Tuesday February 7 by noon, via email)
Read this chapter on pitch perception:
de Cheveigné, A. 2005. Pitch perception models. In Pitch: Neural Coding and Perception,
ed. C. J. Plack, A. J. Oxenham, R. R. Fay and A. N. Popper, 169–233. New York, NY: Springer.
Then write up a short summary (~1 paragraph for each point) on:
- the definition of pitch
- place theory (with a brief synopsis of its history and its relationship to signal processing methods)
- time theory (with a brief synopsis of its history and its relationship to signal processing methods)
Focus on the first nine sections of the chapter.
Coding (due Thursday February 9 by noon, via email)
Write a function that takes a wave file and produces the following figure (using avm.wav):
The log amplitude spectrogram plot should be made using mySpecgram.m with an N value of 1/10th of the sampling rate and the default win and hop values. The displayed image should be 10 * the log of the absolute STFT output of the function.
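As a rough illustration of the scaling and axes described above, the following sketch uses a synthetic complex matrix in place of the real STFT. Everything other than the N value and the 10 * log(abs(...)) scaling is an assumption made for the example: the sampling rate, the hop size, the number of frames, and the idea that the STFT has one row per non-negative frequency bin are hypothetical and may not match what mySpecgram.m actually returns.

```matlab
% Hypothetical values; in the assignment sr comes from the wave file.
sr = 44100;
N = round(sr / 10);            % N is 1/10th of the sampling rate
hop = N / 2;                   % assumed hop, NOT necessarily mySpecgram's default
numFrames = 20;                % assumed number of analysis frames
numBins = floor(N / 2) + 1;    % assumed: rows cover 0 Hz up to sr/2

% Synthetic stand-in for the complex STFT (magnitude falls off as 1/bin).
[binIdx, frameIdx] = ndgrid(1:numBins, 1:numFrames);
S = exp(1i * binIdx .* frameIdx / 10) ./ binIdx;

% Displayed image: 10 * the log of the absolute STFT values.
logS = 10 * log(abs(S));

% Axis vectors under the assumptions above.
f = (0:numBins - 1) * sr / N;      % frequency in Hz for each row
t = (0:numFrames - 1) * hop / sr;  % start time in seconds for each column
```

With these vectors, a call such as imagesc(t, f, logS) followed by axis xy would place time on the x axis and frequency on the y axis, with low frequencies at the bottom.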
The MFCC plot should be made using myMFCC.m. Use the STFT representation returned by mySpecgram as the input to this function. The MFCCs should be normalized using the following code:
mfccvalnorm = (mfccval - repmat(mean(mfccval,2), 1, size(mfccval,2))) ./ repmat(std(mfccval,[],2), 1, size(mfccval, 2));
Only coefficients 2–13 should be plotted and the displayed image should be 10 * the log of the absolute MFCC values.
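To make the normalization and plotting steps concrete, here is a minimal sketch that runs them on made-up data. The matrix mfccval below is a synthetic stand-in (13 coefficients by 200 frames) and does not come from myMFCC.m; the variable names rowMean, rowStd, and mfccplot are introduced only for this example.

```matlab
% Synthetic stand-in for the MFCC matrix: rows are coefficients, columns are frames.
numFrames = 200;
k = (1:13)';
n = 1:numFrames;
mfccval = repmat(k, 1, numFrames) .* (1 + sin(k * n / 7));

% Normalize each coefficient (row) to zero mean and unit standard deviation.
rowMean = mean(mfccval, 2);
rowStd = std(mfccval, [], 2);
mfccvalnorm = (mfccval - repmat(rowMean, 1, numFrames)) ./ repmat(rowStd, 1, numFrames);

% Keep coefficients 2-13 and apply the 10 * log(abs(...)) display scaling.
mfccplot = 10 * log(abs(mfccvalnorm(2:13, :)));
```

After this step every row of mfccvalnorm has a mean of 0 and a standard deviation of 1, and mfccplot has 12 rows, one per plotted coefficient. A call such as imagesc(n, 2:13, mfccplot) would then label the y axis with the coefficient numbers.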
The cochleagram plot should be made using LyonPassiveEar.m from the Auditory Toolbox with a decimation factor of 100 and the default values for the optional arguments. You can calculate the appropriate values for the y axis by calling: [filters, cochfreq] = DesignLyonFilters(sr,8,8/32), where sr is the sampling rate.
The chromagram plot should be made using the output from pitch_to_chroma.m in the Chroma Toolbox. Prior to calling pitch_to_chroma, you will need to run the following functions: wav_to_audio, estimateTuning, and audio_to_pitch_via_FB. The winLenSTMSP for audio_to_pitch_via_FB should be 4410 and the shiftFB should be the value returned by estimateTuning. When running pitch_to_chroma, you will need to specify a value of 0 for applyLogCompr, and the inputFeatureRate should be the value returned in sideinfo.pitch.featureRate from wav_to_audio.
Pay special attention to replicating the x and y axes of each plot.