Partial Least Squares regression - confidence interval of the predicted variable (response)

Hello all,
I would like to obtain confidence intervals for the predicted response of a PLS (Partial Least Squares) regression. Can someone help me with that? Here is my attempt:
% https://nl.mathworks.com/help/stats/partial-least-squares-regression-and-principal-components-regression.html
load spectra
X = NIR; % independent variables
y = octane; % dependent variables
PLS_comp = 3; % number of PLS components
[XL,yl,XS,YS,beta,PCTVAR,mse,stats] = plsregress(X,y,PLS_comp); % PLS regression
yfit = [ones(size(X,1),1) X]*beta; % Model fit
residuals = y - yfit; % Ordinary residuals vector
alpha_stat = 0.05; % Significance level
dgf = length(y) - PLS_comp - 1; % Degrees of freedom
RMSE_model = sqrt(sum(residuals.^2)/dgf); % Degrees-of-freedom-corrected root-mean-squared error (standard deviation estimator)
t_Student = tinv((1-alpha_stat/2),dgf); % Critical value of Student's t distribution
delta = t_Student*RMSE_model*sqrt(1+stats.T2); % Interval half-width (error bar length)
figure()
set(gcf,'color','white','position',[100 100 500 500])
errorbar(y,yfit,delta,'o')
hold on; grid minor;
hline = refline([1 0]);
hline.Color = 'k';
hline.LineStyle = ':';
xlabel('Measured')
ylabel('Predicted')
My questions are:
  • Is there a better (or simpler) way to do this, perhaps with a standard MATLAB function? In case someone is wondering about the degrees of freedom, I tried to follow the guidelines of this paper: 10.1016/j.chemolab.2009.11.003
  • Is this approach correct? The confidence intervals look too large to be correct.
  • The T2 statistic in the stats struct is only available for the training data. How do I obtain it for a new spectrum (if the same approach is used)? As written, my approach cannot give prediction intervals for new samples.
Kind regards,
Gustavo

Accepted Answer

Torsten on 20 Jan 2023
I did not look into your code in detail, but I think you could use the output structure "gof" from MATLAB's "fit" together with "confint" to compare with your statistical parameters.
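
Regarding the third question, here is a minimal sketch of one way to get interval half-widths for spectra outside the training set. It is not a validated method and it rests on several assumptions: the PLS scores are treated as fixed regressors, so the usual ordinary-least-squares prediction-interval formula is borrowed; the 50/10 split of the spectra data set is hypothetical and only for illustration; and the names idxTrain, idxNew, Tnew and hNew are made up here. It relies on the documented properties that XS is orthonormal and that stats.W maps centered predictors to scores (XS = X0*W).

```matlab
% Sketch: approximate prediction intervals for new spectra from a PLS model.
% Assumption: scores are treated as fixed regressors (OLS-type formula).
load spectra                          % NIR (predictors) and octane (response)
idxTrain = 1:50;                      % hypothetical split, illustration only
idxNew   = 51:60;
Xtrain = NIR(idxTrain,:);
ytrain = octane(idxTrain);
Xnew   = NIR(idxNew,:);

ncomp = 3;                            % number of PLS components
[~,~,XS,~,beta,~,~,stats] = plsregress(Xtrain,ytrain,ncomp);
n = size(Xtrain,1);

% Residual standard deviation estimated on the training set
resid = ytrain - [ones(n,1) Xtrain]*beta;
dof   = n - ncomp - 1;
s     = sqrt(sum(resid.^2)/dof);

% Scores of the new samples: center with the TRAINING mean, then apply W
Tnew = (Xnew - mean(Xtrain,1)) * stats.W;

% Leverage of each new sample; XS'*XS is the identity, so no inverse is needed
hNew = 1/n + sum(Tnew.^2,2);

% Predictions and interval half-widths
alpha = 0.05;
tcrit = tinv(1 - alpha/2, dof);
ynew  = [ones(size(Xnew,1),1) Xnew]*beta;
delta = tcrit * s * sqrt(1 + hNew);   % use sqrt(hNew) for a CI on the mean response
```

For the training samples the same leverage would be 1/n + sum(XS.^2,2), which lies between 1/n and 1. T2 is scaled by the score variances and is typically much larger than that, so using sqrt(1+stats.T2) in place of sqrt(1+leverage) would be consistent with the intervals in the question looking too wide.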

Release

R2022b
