How to assess frequency stability of sung notes while keeping frequency precision when a vibrato (frequency modulation) may be present

Dear community,
I have different recordings of a sequence of tones with increasing frequency. This tone sequence is repeated and sung at different base speeds (about 60 bpm up to 240 bpm). I'm interested not only in capturing the fundamental frequency precisely, but also in the tones' stability.
To get a precise estimate of the fundamental frequency, I analyze each tone with as long a time window as possible, since the FFT bin width is the reciprocal of the time window length.
Concerning the stability of the sung tones I see two problems. First, since the tones are of different lengths, the time windows and FFT bins have different lengths/widths, and as a consequence the bandwidth of e.g. the fundamental frequency peak is not comparable between tones.
The other problem is the possible presence of a vibrato, which is a frequency modulation (a periodic sinusoidal jitter in frequency) that leads to a broader peak in the frequency domain, but whose influence I want to remove. That is, I am only interested in non-periodic changes in frequency within each sung tone. My only idea for removing the influence of the vibrato is to analyze the fundamental frequency with a really small window, look at the resulting sequence of extracted fundamental frequencies, and remove the vibrato from it with a lowpass (the vibrato frequency can be e.g. 8 Hz). But by analyzing with such small windows I lose the precision of the overall fundamental frequency.
I hope you got my basic idea and problem, and I would be happy to hear some ideas to solve my dilemma.
EDIT: I have now added a sample file in which many tones contain a vibrato.

Accepted Answer

William Rose on 30 Dec 2021
@Jonas, I put my notebook computer on a piano and recorded while I played the notes CDEFGA. File attached. I used this file as an example.
The code reads the file and generates a spectrogram. The value twin=0.1 sets the time-frequency resolution. This value is the window width in seconds used for the spectrogram. The window is moved across the signal in steps equal to half of twin, and the spectrum is computed at each step. The frequency resolution of the spectrum is 1/twin (10 Hz for twin=0.1). You can get more time resolution by reducing twin, but the spectral resolution gets worse, and vice versa if you increase twin.
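The window/step scheme described above can be sketched in base MATLAB without toolbox functions. This is not the attached script: the sample rate `fs`, the synthetic 440 Hz test tone and the Hann window are assumptions chosen only for illustration.

```matlab
% Sketch only: a base-MATLAB short-time spectrum (no toolbox functions).
fs   = 8000;                 % sample rate in Hz (assumed for this demo)
twin = 0.1;                  % window width in seconds, as in the answer
t    = (0:fs-1)'/fs;         % one second of synthetic signal
x    = sin(2*pi*440*t);      % stand-in for a recorded note (A440)

nwin = round(twin*fs);       % samples per window
hop  = floor(nwin/2);        % step of half a window
nfrm = floor((numel(x) - nwin)/hop) + 1;      % number of complete windows
w    = 0.5 - 0.5*cos(2*pi*(0:nwin-1)'/nwin);  % Hann window

S = zeros(floor(nwin/2)+1, nfrm);             % one-sided magnitude spectra
for k = 1:nfrm
    seg = x((k-1)*hop + (1:nwin)) .* w;       % windowed slice k
    X   = abs(fft(seg));
    S(:,k) = X(1:floor(nwin/2)+1);
end

f  = (0:floor(nwin/2))' * fs/nwin;   % frequency axis in Hz
df = f(2) - f(1);                    % bin spacing, equals 1/twin
tc = ((0:nfrm-1)*hop + nwin/2)/fs;   % window centre times in seconds
```

With these values `df` is 10 Hz. Halving `twin` roughly doubles the number of time slices but also doubles `df`, which is the trade-off described above.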
The script also generates a plot of volume versus time.
The user observes this plot and selects a volume threshold for frequency estimation. I chose 25 dB in this case - see dashed line on plot. Frequencies will be estimated only at times when volume>=threshold. The user also specifies the max frequency to consider. In this case, my highest note was A440, so I chose fmax=500 Hz. The script finds the frequency (<=fmax) with maximum intensity at every time point where volume exceeds threshold. The results are plotted on a second spectrogram plot.
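The threshold-and-peak step can be sketched as follows. This is an illustration under assumptions, not the attached script: the volume is taken here as the frame's mean-square power in dB relative to an arbitrary reference `ref`, and the test signal is synthetic (a nearly silent second followed by a 440 Hz tone with one overtone).

```matlab
% Sketch only: pick the strongest frequency <= fmax in each loud-enough frame.
fs = 8000; twin = 0.1;              % assumed sample rate; window as in the answer
thresh_dB = 25;                     % volume threshold, as chosen in the answer
fmax = 500;                         % highest frequency to consider, in Hz
ref  = 1e-6;                        % assumed reference power for the dB scale

t = (0:2*fs-1)'/fs;                             % two seconds
x = sin(2*pi*440*t) + 0.5*sin(2*pi*880*t);      % synthetic note with one overtone
x(1:fs) = 1e-3 * x(1:fs);                       % first second is nearly silent

nwin = round(twin*fs);  hop = floor(nwin/2);
nfrm = floor((numel(x) - nwin)/hop) + 1;
w    = 0.5 - 0.5*cos(2*pi*(0:nwin-1)'/nwin);    % Hann window
f    = (0:floor(nwin/2))' * fs/nwin;            % one-sided frequency axis
inband = find(f <= fmax);                       % bins allowed for the search

vol_dB = zeros(1, nfrm);
fpk    = nan(1, nfrm);                          % NaN marks frames below threshold
for k = 1:nfrm
    seg = x((k-1)*hop + (1:nwin));
    vol_dB(k) = 10*log10(mean(seg.^2)/ref);     % frame volume in dB re ref
    if vol_dB(k) >= thresh_dB
        X = abs(fft(seg .* w));
        [~, i] = max(X(inband));                % strongest bin at or below fmax
        fpk(k) = f(inband(i));
    end
end
voiced = ~isnan(fpk);                           % frames where a frequency was estimated
```

Frames below the threshold stay `NaN`, so plotting `fpk` against time leaves gaps between notes instead of showing noise-driven estimates.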
The results are also plotted on a 2D plot of frequency versus time.
You can improve the frequency resolution by fitting a parabola to the spectrum at each maximum point plus the point on either side in the spectrum, within every time slice. The peak of the parabola will usually be at a frequency that is between the discrete frequencies of the FFT. This enhances your frequency resolution. Here is a routine which demonstrates how to do this.
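Independently of the attached routine, the parabola fit can be sketched as below. The test tone, sample rate and Hann window are assumptions, and fitting to the log magnitude (rather than the linear magnitude) is a choice made for this sketch, not something stated above.

```matlab
% Sketch only: refine an FFT peak by fitting a parabola through three points.
fs = 8000; nwin = 800;                     % assumed: 0.1 s window, 10 Hz bin spacing
f0 = 443;                                  % test tone deliberately between two bins
n  = (0:nwin-1)';
w  = 0.5 - 0.5*cos(2*pi*n/nwin);           % Hann window
x  = sin(2*pi*f0*n/fs) .* w;

X  = abs(fft(x));
X  = X(1:nwin/2+1);                        % one-sided magnitude spectrum
df = fs/nwin;                              % bin spacing in Hz

[~, i] = max(X(2:end-1));                  % skip the edge bins so both neighbours exist
k = i + 1;                                 % index of the peak bin in X
f_coarse = (k-1)*df;                       % plain FFT estimate (nearest bin)

% Parabola through the log-magnitudes of the peak and its two neighbours.
a = log(X(k-1));  b = log(X(k));  c = log(X(k+1));
delta  = 0.5*(a - c)/(a - 2*b + c);        % vertex offset in bins, between -0.5 and 0.5
f_fine = (k - 1 + delta)*df;               % interpolated frequency estimate
```

The coarse estimate lands on the nearest bin (440 Hz here), while `f_fine` moves toward the true 443 Hz. The interpolation is approximate, so a small residual bias remains.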
You can also decide how you want to analyze the frequency data for each note, to assess "quality" or other features of each note. You can try different values for twin.


More Answers (1)

William Rose on 28 Dec 2021
Edited: William Rose on 28 Dec 2021
Are you willing to post an example file of such a recording?
Do you have an external reference signal, like a trigger, for when the pitch is (supposed to) change? If so, consider using it to partition the signal.
240 bpm is fast! You'd be lucky to get 1/8 of a second on each note at that speed, plus 1/8 second in between notes to change. And as you know, if the recording is only 1/8 of a second long, then the frequency resolution of the simple FFT is 8 Hz.
I would experiment with the short-time Fourier transform - stft(). See the Matlab help here. I would choose a time resolution in the range (0.5 to 1)/(singer's beats per second) to get started.
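The arithmetic behind these numbers can be written out as a short sketch. The tempo steps, one note per beat, and the half-sung/half-transition split are assumptions used for illustration.

```matlab
% Sketch only: note length and plain-FFT resolution versus tempo.
bpm     = [60 120 180 240];      % tempi in the range mentioned in the question (assumed steps)
bps     = bpm/60;                % beats per second
beat_s  = 1 ./ bps;              % seconds per beat
note_s  = beat_s / 2;            % assumed: half of each beat is sung, half is the transition
df_Hz   = 1 ./ note_s;           % FFT bin spacing for a window as long as the note

% Suggested starting range for the STFT time resolution: (0.5 to 1)/(beats per second).
twin_lo = 0.5 ./ bps;            % seconds
twin_hi = 1.0 ./ bps;            % seconds
```

At 240 bpm this gives a 0.125 s note, an 8 Hz bin spacing, and a starting window between 0.125 s and 0.25 s.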
I am skeptical that low passing to eliminate vibrato will help, but give it a try.
Also, check out MUSIC (MUltiple SIgnal Classification) on the Matlab file exchange.
Google "frequency estimation from short samples" and you will get a lot of journal articles that look promising.
  1 Comment
Jonas on 29 Dec 2021
Hi William,
I think my description sounded a bit more complicated than it is: the tone sequence consists of only 6 tones with increasing pitch, and the sequence repeats several times at 60 bpm up to 240 bpm, in about 60 bpm increments (meaning that there are 4 sequences).
As you pointed out, the frequency resolution for 1/8 s is 8 Hz. If there were a vibrato with a strength of a semitone, the vibrato frequency ratio would be the 12th root of 2 (about 6% of the pitch frequency, assuming equal temperament). If we want to see the change in pitch, that 6% has to reach the 8 Hz resolution, so the pitch should not be lower than about 133 Hz (8 Hz / 0.06).
The separation of the tones is not the problem; I am just searching for a comparable measure of pitch stability while removing the periodic influence of the vibrato.
The low-pass filtering was meant to be applied to the sequence of extracted pitches of a single tone with vibrato. E.g. if the extracted pitches of a single 'clean' tone with vibrato were [200 208 216 208 200 192 184 192 200] Hz, a filter would deliver a constant 200 Hz line. An 'unclean' tone's extracted pitch sequence could be [170 190 200 201 200 199 200] Hz, where the filter should retain the unclean starting behavior. A combination could be [170 190 200 208 216 208 200 192 184 192 200] Hz, where the filter should remove the vibrato but retain the uncleanness at the beginning.
I am also willing to share a recording, but unfortunately I don't have one available at the moment. I can make one available in the new year.
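The filtering idea in the pitch-sequence examples above can be tried with a minimal sketch: a moving average that spans exactly one vibrato period. The period `P = 8` samples is read off the example sequences and is assumed known and constant here; for a real recording it would have to be estimated first.

```matlab
% Sketch only: remove a periodic vibrato from an extracted pitch track
% with a moving average spanning exactly one vibrato period.
clean = [200 208 216 208 200 192 184 192 200];            % Hz, vibrato only
mixed = [170 190 200 208 216 208 200 192 184 192 200];    % Hz, unclean onset plus vibrato

P = 8;                          % vibrato period in pitch samples (assumed known)
h = ones(1, P)/P;               % moving-average kernel, one vibrato period long

clean_lp = conv(clean, h, 'valid');   % -> [200 200]
mixed_lp = conv(mixed, h, 'valid');   % -> [198 199.75 200 200]
```

The vibrato averages out to the 200 Hz line, while the low onset still shows up in the first filtered values of `mixed_lp`, although smoothed. The spread of the filtered track could then serve as one candidate stability measure that is comparable across tones.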
