Gammatone filter bank
gammatoneFilterBank decomposes a signal by passing it through a bank of
gammatone filters equally spaced on the ERB scale. Gammatone filter banks were designed to
model the human auditory system.
To model the human auditory system:
Create the gammatoneFilterBank object and set its properties.
Call the object with arguments, as if it were a function.
To learn more about how System objects work, see What Are System Objects?
gammaFiltBank = gammatoneFilterBank creates a gammatone filter bank. The object filters data independently across each input channel.
gammaFiltBank = gammatoneFilterBank(range) sets the FrequencyRange property to range.
gammaFiltBank = gammatoneFilterBank(range,numFilts) sets the NumFilters property to numFilts.
gammaFiltBank = gammatoneFilterBank(range,numFilts,fs) sets the SampleRate property to fs.
gammaFiltBank = gammatoneFilterBank(___,Name,Value) sets each property Name to the specified Value. Unspecified properties have default values.
For example, gammaFiltBank = gammatoneFilterBank([62.5,12e3],'SampleRate',24e3) creates a gammatone filter bank, gammaFiltBank, with bandpass filters placed between 62.5 Hz and 12 kHz. gammaFiltBank operates at a sample rate of 24 kHz.
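As a brief illustration of the creation syntaxes, the following sketch configures the same filter bank once with positional arguments and once with name-value pairs. It assumes Audio Toolbox is installed; the variable names fbPositional and fbNameValue and the value of 48 filters are illustrative only.

```matlab
% Sketch: two equivalent ways to configure the filter bank.
% Assumes Audio Toolbox is available; variable names are illustrative.
range    = [62.5 12e3];  % frequency range in Hz
numFilts = 48;           % number of filters
fs       = 24e3;         % sample rate in Hz

% Positional syntax: range, numFilts, fs
fbPositional = gammatoneFilterBank(range,numFilts,fs);

% Name-value syntax setting the same three properties
fbNameValue = gammatoneFilterBank('FrequencyRange',range, ...
    'NumFilters',numFilts, ...
    'SampleRate',fs);

% Both objects should report identical property values
sameConfig = isequal(fbPositional.FrequencyRange,fbNameValue.FrequencyRange) && ...
    fbPositional.NumFilters == fbNameValue.NumFilters && ...
    fbPositional.SampleRate == fbNameValue.SampleRate;
```

The name-value form is usually easier to read when only one or two properties differ from the defaults.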
Unless otherwise indicated, properties are nontunable, which means you cannot change their
values after calling the object. Objects lock when you call them, and the
release function unlocks them.
If a property is tunable, you can change its value at any time.
For more information on changing property values, see System Design in MATLAB Using System Objects.
FrequencyRange — Frequency range of filter bank (Hz)
[50 8000] (default) | two-element row vector of monotonically increasing values
Frequency range of the filter bank in Hz, specified as a two-element row vector of monotonically increasing values.
NumFilters — Number of filters
32 (default) | positive integer scalar
Number of filters in the filter bank, specified as a positive integer scalar.
SampleRate — Input sample rate (Hz)
16000 (default) | positive scalar
Input sample rate in Hz, specified as a positive scalar.
audioIn — Audio input to filter bank
scalar | vector | matrix
Audio input to the filter bank, specified as a scalar, vector, or matrix. If specified as a matrix, the columns are treated as independent audio channels.
audioOut — Audio output from filter bank
scalar | vector | matrix | 3-D array
Audio output from the filter bank, returned as a scalar, vector, matrix, or 3-D array. The shape of audioOut depends on the shape of audioIn and NumFilters. If audioIn is an M-by-N matrix, then audioOut is returned as an M-by-NumFilters-by-N array. If N is 1, then audioOut is returned as a matrix.
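The following sketch checks the output dimensions for a mono and a stereo input. It assumes Audio Toolbox is installed and that the filter dimension precedes the channel dimension in the output; the variable names are illustrative.

```matlab
% Sketch: output dimensions for mono and stereo input.
% Assumes Audio Toolbox is available.
gammaFiltBank = gammatoneFilterBank;   % default: 32 filters

mono = randn(1024,1);                  % M-by-1 input
monoOut = gammaFiltBank(mono);
sizeMono = size(monoOut);              % expected M-by-NumFilters

release(gammaFiltBank)                 % unlock before changing the channel count
stereo = randn(1024,2);                % M-by-N input with N = 2
stereoOut = gammaFiltBank(stereo);
sizeStereo = size(stereoOut);          % expected M-by-NumFilters-by-N
```

To extract the filtered bands of a single input channel from the 3-D output, index the third dimension, for example squeeze(stereoOut(:,:,2)).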
To use an object function, specify the
System object™ as the first input argument. For
example, to release system resources of a System object named obj, use this syntax: release(obj)
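A minimal sketch of the lock-and-release cycle is shown below. It assumes Audio Toolbox is installed and uses the generic System object functions isLocked and release; the choice of 16 filters is arbitrary.

```matlab
% Sketch: locking and releasing a System object.
% Assumes Audio Toolbox is available.
gammaFiltBank = gammatoneFilterBank;
y = gammaFiltBank(randn(256,1));       % the first call locks the object
lockedAfterCall = isLocked(gammaFiltBank);

release(gammaFiltBank)                 % unlock so nontunable properties can change
lockedAfterRelease = isLocked(gammaFiltBank);

gammaFiltBank.NumFilters = 16;         % allowed because the object is unlocked
y = gammaFiltBank(randn(256,1));
numBands = size(y,2);                  % one output column per filter
```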
Apply Gammatone Filter Bank
Create a default gammatone filter bank for a 16 kHz sample rate.
fs = 16e3;
gammaFiltBank = gammatoneFilterBank('SampleRate',fs)

gammaFiltBank = 
  gammatoneFilterBank with properties:

    FrequencyRange: [50 8000]
        NumFilters: 32
        SampleRate: 16000
Use fvtool to visualize the response of the filter bank.
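The visualization step can be as short as the following sketch. It assumes Audio Toolbox is installed and that fvtool accepts the filter bank object as its first argument, following the object-function convention; the call opens a figure window.

```matlab
% Sketch: open the filter visualization tool on the filter bank.
% Assumes Audio Toolbox is available; opens a figure window.
fs = 16e3;
gammaFiltBank = gammatoneFilterBank('SampleRate',fs);
fvtool(gammaFiltBank)
```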
Process white Gaussian noise through the filter bank. Use a spectrum analyzer to view the spectrum of the filter outputs.
sa = dsp.SpectrumAnalyzer('SampleRate',16e3,...
    'PlotAsTwoSidedSpectrum',false,...
    'FrequencyScale','log',...
    'SpectralAverages',100);
for i = 1:5000
    x = randn(256,1);
    y = gammaFiltBank(x);
    sa(y);
end
Analysis and Synthesis
This example illustrates a nonoptimal but simple approach to analysis and synthesis using gammatoneFilterBank.
Read in an audio file and listen to its contents.
[audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');
sound(audioIn,fs)
Create a default gammatoneFilterBank. The default frequency range of the filter bank is 50 to 8000 Hz. Frequencies outside of this range are attenuated in the reconstructed signal.
gammaFiltBank = gammatoneFilterBank('SampleRate',fs)

gammaFiltBank = 
  gammatoneFilterBank with properties:

    FrequencyRange: [50 8000]
        NumFilters: 32
        SampleRate: 44100
Pass the audio signal through the gammatone filter bank. The output is 32 channels, where the number of channels is set by the NumFilters property of the gammatoneFilterBank.
audioOut = gammaFiltBank(audioIn);
[N,numChannels] = size(audioOut)
N = 685056
numChannels = 32
To reconstruct the original signal, sum the channels. Listen to the result.
reconstructedSignal = sum(audioOut,2); sound(reconstructedSignal,fs)
The gammatone filter bank introduces a different group delay in each output channel, which results in poor reconstruction. To compensate for the group delay, remove the beginning delay from the individual channels and zero-pad the ends of the channels. Use info to get the group delays. Listen to the group delay-compensated reconstruction.
infoStruct = info(gammaFiltBank);
groupDelay = round(infoStruct.GroupDelays); % round for simplicity
audioPadded = [audioOut;zeros(max(groupDelay),gammaFiltBank.NumFilters)];
for i = 1:gammaFiltBank.NumFilters
    audioOut(:,i) = audioPadded(groupDelay(i)+1:N+groupDelay(i),i);
end
reconstructedSignal = sum(audioOut,2);
sound(reconstructedSignal,fs)
Create Gammatone Spectrogram
Read in an audio signal and convert it to mono for easy visualization.
[audio,fs] = audioread('WaveGuideLoopOne-24-96-stereo-10secs.aif'); audio = mean(audio,2);
Create a gammatoneFilterBank with 64 filters that span the range 62.5 to 20,000 Hz. Pass the audio signal through the filter bank.
gammaFiltBank = gammatoneFilterBank('SampleRate',fs, ...
    'NumFilters',64, ...
    'FrequencyRange',[62.5,20e3]);
audioOut = gammaFiltBank(audio);
Calculate the energy-per-band using 50 ms windows with 25 ms overlap. Use dsp.AsyncBuffer to divide the squared signals into overlapped windows and then to log the mean energy of each window for each channel.
samplesPerFrame = round(0.05*fs);
samplesOverlap = round(0.025*fs);
buff = dsp.AsyncBuffer(numel(audio));
write(buff,audioOut.^2);
sink = dsp.AsyncBuffer(numel(audio));
while buff.NumUnreadSamples > 0
    currentFrame = read(buff,samplesPerFrame,samplesOverlap);
    write(sink,mean(currentFrame,1));
end
Convert the energy values to dB. Plot the energy-per-band over time.
gammatoneSpec = read(sink);
D = 20*log10(gammatoneSpec');
timeVector = ((samplesPerFrame-samplesOverlap)/fs)*(0:size(D,2)-1);
cf = getCenterFrequencies(gammaFiltBank)./1e3;
surf(timeVector,cf,D,'EdgeColor','none')
axis([timeVector(1) timeVector(end) cf(1) cf(end)])
view([0 90])
caxis([-150,-60])
colorbar
xlabel('Time (s)')
ylabel('Frequency (kHz)')
A gammatone filter bank is often used as the front end of a cochlea simulation, which
transforms complex sounds into a multichannel activity pattern like that observed in the
auditory nerve. The
gammatoneFilterBank follows the algorithm described in [1]. The algorithm is an
implementation of an idea proposed in [2]. The design of the
gammatone filter bank can be described in two parts: the filter shape (gammatone) and the
frequency scale. The equivalent rectangular bandwidth (ERB) scale defines the relative spacing
and bandwidth of the gammatone filters. The derivation of the ERB scale also provides an
estimate of the auditory filter response which closely resembles the gammatone filter.
The ERB scale was determined using the notched-noise masking method. This method involves a listening test wherein notched noise is centered on a tone. The power of the tone is tuned, and the audible threshold (the power required for the tone to be heard) is recorded. The experiment is repeated for different notch widths and center frequencies.
The notched-noise method assumes the audible threshold corresponds to a constant signal-to-masker ratio at the output of the theoretical auditory filter. That is, the ratio of the power of the tone at center frequency fc to the noise power that passes through the auditory filter is constant. Therefore, the audible threshold as a function of the notch bandwidth, 2Δf, is linearly related to the noise power passed through the filter as a function of 2Δf.
The derivative of the function relating Δf to the noise passed through the filter estimates the auditory filter shape. Because Δf has an inverse relationship with the noise power passed through the filter, the derivative of the function must be multiplied by –1. The resulting auditory filter shape is usually approximated as a roex filter.
The equivalent rectangular bandwidth of the auditory filter is defined as the width of a rectangular filter required to pass the same noise power as the auditory filter.
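Written as a formula, this definition can be sketched as follows. The symbol H(f), denoting the frequency response of the auditory filter, is introduced here for illustration and is assumed to peak at the center frequency f_c.

```latex
\mathrm{ERB} = \frac{1}{|H(f_c)|^{2}} \int_{0}^{\infty} |H(f)|^{2}\, df
```

The right side is the total power passed by the filter for white noise of unit density, divided by the peak power gain, which is the width of a unit-height rectangle of equal area.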
[4] defines ERB as a function of center frequency for young listeners with normal hearing and a moderate noise level:
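The commonly cited form of this relationship from Glasberg and Moore (1990), with both ERB and the center frequency f_c in Hz, is:

```latex
\mathrm{ERB}(f_c) = 24.7\,(0.00437\, f_c + 1)
```

For instance, at f_c = 1000 Hz this gives 24.7 × 5.37 ≈ 132.6 Hz.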
The ERB scale (ERBs) is an extension of the relationship between ERB and center frequency, derived by integrating the reciprocal of the ERB function:
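Taking ERB(f) = 24.7(0.00437 f + 1) Hz as in Glasberg and Moore (1990), the resulting scale is commonly written as follows, with f in Hz:

```latex
\mathrm{ERBs}(f) = \int_{0}^{f} \frac{du}{24.7\,(0.00437\,u + 1)}
\;\approx\; 21.4\,\log_{10}(0.00437\, f + 1)
```

Evaluating the integral exactly gives a coefficient of about 21.3 in front of the base-10 logarithm; the published formula states it as 21.4.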
To design a gammatone filter bank, [2] suggests distributing the
center frequencies of the filters in proportion to their bandwidth. To accomplish this,
gammatoneFilterBank defines the center frequencies as linearly spaced on
the ERB scale, covering the specified frequency range with the desired number of filters.
You can specify the frequency range and desired number of filters using the FrequencyRange and NumFilters properties.
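The spacing rule can be sketched in base MATLAB as shown below. This is an illustration under the Glasberg and Moore formulas, not the object's internal code, and its values may differ from those returned by getCenterFrequencies. The helper names hz2erbs and erbs2hz are hypothetical.

```matlab
% Sketch: center frequencies linearly spaced on the ERB scale.
% Base MATLAB only; hz2erbs and erbs2hz are hypothetical helper names.
frequencyRange = [50 8000];   % Hz, same as the default FrequencyRange
numFilters     = 32;          % same as the default NumFilters

hz2erbs = @(f) 21.4*log10(0.00437*f + 1);      % Hz -> ERB scale
erbs2hz = @(e) (10.^(e/21.4) - 1)/0.00437;     % ERB scale -> Hz

% Equal steps on the ERB scale between the range limits
erbsPoints = linspace(hz2erbs(frequencyRange(1)), ...
    hz2erbs(frequencyRange(2)),numFilters);
centerFreqs = erbs2hz(erbsPoints);

% Bandwidth of each filter from the ERB function (Hz)
erbWidths = 24.7*(0.00437*centerFreqs + 1);
```

Because the steps are equal on the ERB scale, the spacing between neighboring center frequencies in Hz grows with frequency, in step with the filter bandwidths.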
The gammatone filter was introduced in [3]. The continuous impulse response is:
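The standard form of the gammatone impulse response in the literature, written with the symbols defined in the list that follows, is:

```latex
g(t) = a\, t^{\,n-1}\, e^{-2\pi b t}\, \cos(2\pi f_c t + \phi), \qquad t \ge 0
```

where: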
a –– amplitude factor
t –– time in seconds
n –– filter order (set to four to model human hearing)
fc –– center frequency
b –– bandwidth, set to 1.019*ERB
ϕ –– phase factor
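To make the roles of these parameters concrete, the following base MATLAB sketch samples the standard gammatone impulse response for one filter. It assumes b = 1.019 times the ERB at fc, with the ERB taken from the Glasberg and Moore formula; the amplitude, phase, center frequency, and evaluation rate are arbitrary illustrative choices.

```matlab
% Sketch: sample the continuous gammatone impulse response.
% Base MATLAB only; a, phi, fc, and fs are illustrative values.
fs  = 16e3;                        % rate used to sample g(t), Hz
t   = (0:round(0.05*fs)-1).'/fs;   % 50 ms time vector, seconds
fc  = 1000;                        % center frequency, Hz
n   = 4;                           % filter order
a   = 1;                           % amplitude factor (arbitrary here)
phi = 0;                           % phase factor (arbitrary here)

erb = 24.7*(0.00437*fc + 1);       % ERB at fc in Hz (assumed formula)
b   = 1.019*erb;                   % bandwidth parameter, Hz

% g(t) = a*t^(n-1)*exp(-2*pi*b*t)*cos(2*pi*fc*t + phi)
envelope = a*t.^(n-1).*exp(-2*pi*b*t);
g = envelope.*cos(2*pi*fc*t + phi);

% The gamma-shaped envelope peaks at t = (n-1)/(2*pi*b)
[~,peakIdx] = max(envelope);
peakTime = t(peakIdx);
```

Plotting g against t shows a tone at fc under a gamma-distribution envelope, which is the origin of the name gammatone.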
The gammatone filter is similar to the roex filter derived from the notched-noise experiment.
gammatoneFilterBank implements the digital filter
as a cascade of four second-order sections, as described in [1].
[1] Slaney, Malcolm. "An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank." Apple Computer Technical Report 35, 1993.
[2] Patterson, R. D., K. Robinson, J. Holdsworth, D. McKeown, C. Zhang, and M. Allerhand. "Complex Sounds and Auditory Images." Auditory Physiology and Perception. 1992, pp. 429–446.
[3] Aertsen, A. M. H. J., and P. I. M. Johannesma. "Spectro-temporal Receptive Fields of Auditory Neurons in the Grassfrog." Biological Cybernetics. Vol. 38, Issue 4, 1980, pp. 223–234.
[4] Glasberg, Brian R., and Brian C. J. Moore. "Derivation of Auditory Filter Shapes from Notched-Noise Data." Hearing Research. Vol. 47, Issue 1-2, 1990, pp. 103–138.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
See System Objects in MATLAB Code Generation (MATLAB Coder).