# Area between baseline and data peak

Lizan on 22 Sep 2014
Commented: Image Analyst on 28 Jul 2017

Hi,

I am trying to determine the area under a peak in a curve (see image). The baseline shown in the image is not zero, and I would like to subtract it from the area under the peak. The data comes from a picture and has been summed to obtain this profile. I want to determine the intensity by summing the area under the peak.

1. Is there a way to determine the baseline (y-value) automatically in code?

2. How do I determine the area minus the baseline? I currently use sum to determine the area under the peak curve.

Star Strider on 22 Sep 2014
Edited: Star Strider on 22 Sep 2014
Here is how I would do it:
```matlab
% Create Data
x = 0:480;
y = 1E4*(0.5+rand(size(x)));
y(400:450) = 1.3E+5*exp(-(x(400:450)-425).^2/50)+y(400:450);
% Generate Statistics & Identify Peak
ysts = [mean(y) median(y); mean(y)-1.96*std(y)/sqrt(length(y)) mean(y)+1.96*std(y)/sqrt(length(y))];
[ypk, pki] = max(y);
pkiv = find(y >= ysts(2,2)); % Find Peak Indices
Iy = cumtrapz(x, y-ysts(1,1)); % Integrate Entire Record
Iydb = [ones(pkiv(1),1) x(1:pkiv(1))']\Iy(1:pkiv(1))';
Iyd = Iy' - [ones(size(x))' x']*Iydb; % Detrend Using Baseline
Ipk = Iy(max(pkiv)) - Iy(min(pkiv)); % Find Peak Area
% Plot Data & Detrended Integral & Selected Statistics
figure(1)
plot(x, y)
hold on
plot(x, Iyd, '-g')
hold off
legend('Data', 'Detrended Integral', 'Location', 'NW')
text(100, 1E+6, sprintf('Peak Area = %23.15E', Ipk))
```
The idea is reasonably straightforward:
1. Identify the peak as the y-values greater than the upper limit of the 95% confidence interval for the mean of all the data;
2. Integrate the entire record using cumtrapz;
3. Do a linear regression on all the integrated data up to the first index of the peak;
4. Use that regression to detrend the integrated data;
5. Use the identified indices of the peak to calculate the approximate integral of the peak.
This approach may not generalise well if most of the baseline information is not at x-values less than the peak, because it depends on that information to detrend the integral.
The plot produces: (image not shown)
Image Analyst on 28 Jul 2017
When you write \Iy you are doing matrix division. Did you really want element-by-element division? If so, use the element-wise operator (.\ or ./) instead of the matrix one.
Also, ones() is returning a column vector, so make sure x is a row vector, since you transpose x into a column vector in order to stitch it to the right of ones.
Also be aware of automatic expansion of your Iy vector, since you're dividing a pkiv(1)-by-2 matrix by a vector.
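To make the distinction in this comment concrete: with a tall matrix on the left, `\` returns the least-squares solution, while `./` and `.\` divide element by element. Here is a small sketch with made-up numbers; the values and the variable names `A`, `b`, and `r` are assumptions for illustration only and are not taken from the thread.

```matlab
% Made-up data lying exactly on the line Iy = 3 + 2*x (illustrative values)
x  = (0:4)';             % column vector
Iy = 3 + 2*x;            % column vector of the same length

A = [ones(size(x)) x];   % design matrix: a column of ones next to x
b = A \ Iy;              % left matrix division: least-squares solution of A*b = Iy
                         % b(1) is the intercept, b(2) is the slope

r = Iy ./ (x + 1);       % element-by-element division: result has the size of Iy
```

In this sketch `b` is a 2-by-1 coefficient vector, whereas `r` has one entry per data point, which is why the two operators are not interchangeable in a regression step.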

### More Answers (2)

Andrew Reibold on 22 Sep 2014
Edited: Andrew Reibold on 22 Sep 2014
The area under a curve is the definition of an integral. You will want to take the integral and subtract the baseline * baseline_width to find the area.
I'm not sure if it's really scientific to determine the baseline just from raw data. Maybe this entire unique data set has some factor that shifts every point in some direction, which makes you perceive the baseline somewhere that it isn't. I'm sure you could take the mean of non-peaked data or something, but normally the baseline is determined from the original function, historical data, or whatever you are stepping off from.
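The subtraction described in this answer could look like the following sketch. It assumes the baseline level is already known (a hypothetical constant `baseline`) and that the peak limits `iL` and `iR` are picked by hand; the synthetic data and all of these names are assumptions, not part of the original post.

```matlab
% Synthetic profile (assumed): constant baseline plus a Gaussian peak at x = 425
x = 0:480;
baseline = 1E4;                               % assumed known baseline level
y = baseline + 1.3E5*exp(-(x-425).^2/50);

% Hand-picked peak limits (assumed): indices for x = 395 and x = 455
iL = 396;
iR = 456;

totalArea    = trapz(x(iL:iR), y(iL:iR));     % integral of the raw data over the peak
baselineArea = baseline*(x(iR) - x(iL));      % baseline * baseline_width
peakArea     = totalArea - baselineArea;      % area between the baseline and the peak
```

With unit spacing in `x`, `sum(y(iL:iR) - baseline)` gives nearly the same value, since the two results differ only by half of the end-point values above the baseline.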
Lizan on 21 Oct 2014
Some data examples:

Image Analyst on 22 Sep 2014
You can find the peak (at least for this plot) very easily with max(). Then you can just "fall down" the peak on each side (with a for loop) and find the place (index) where the data is no longer decreasing. Then you can just fit a line between those two valley locations, say with polyfit() or whatever, to get the baseline.
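A minimal sketch of this "fall down the peak" idea follows. It assumes a single dominant peak and uses synthetic data with a deterministic ripple standing in for noise; it uses while loops rather than the for loop mentioned above, and the names `signal`, `iL`, `iR`, and `peakArea` are illustrative assumptions.

```matlab
% Synthetic profile (assumed): baseline + small ripple + Gaussian peak
x = 0:480;
signal = 1E4 + 500*sin(0.7*x) + 1.3E5*exp(-(x-425).^2/50);

[~, pki] = max(signal);                  % index of the peak maximum

% Walk left from the peak while the data keeps decreasing
iL = pki;
while iL > 1 && signal(iL-1) < signal(iL)
    iL = iL - 1;
end

% Walk right from the peak while the data keeps decreasing
iR = pki;
while iR < numel(signal) && signal(iR+1) < signal(iR)
    iR = iR + 1;
end

% Straight-line baseline through the two valley points
p = polyfit(x([iL iR]), signal([iL iR]), 1);

% Area between the data and that baseline
idx = iL:iR;
peakArea = trapz(x(idx), signal(idx) - polyval(p, x(idx)));
```

On noisy data the walk can stop at the first small bump on the flank of the peak, so smoothing the signal first, or requiring several non-decreasing samples in a row before stopping, may be needed.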
##### 2 Comments
Image Analyst on 20 Oct 2014
signal(i) will be greater than or equal to signal(i-1).