How to define a custom equation in fitlm function for linear regression?

Spirit on 22 Nov 2017
Commented: the cyclist on 8 Jun 2023
I'd like to define a custom equation for linear regression, for example y = a*log(x1) + b*x2^2 + c*x3 + k. This is a linear regression problem, but how do I set it up with the fitlm function?
Thanks, Shriram

Answers (3)

the cyclist on 22 Nov 2017
Edited: the cyclist on 22 Nov 2017
% Set the random number seed for reproducibility
rng default
% Make up some pretend data
N = 100;
x1 = rand(N,1);
x2 = rand(N,1);
x3 = rand(N,1);
a = 2;
b = 3;
c = 5;
k = 7;
noise = 0.2*randn(N,1);
y = a*log(x1) + b*x2.^2 + c*x3 + k + noise;
% Put the variables into a table, naming them appropriately
tbl = table(log(x1),x2.^2,x3,y,'VariableNames',{'log_x1','x2_sqr','x3','y'});
% Specify and carry out the fit
mdl = fitlm(tbl,'y ~ 1 + log_x1 + x2_sqr + x3')
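A minimal sketch of an equivalent route, assuming the same pretend data recipe as above: the transformed predictors can also be passed to fitlm as a numeric matrix, and the estimates can be cross-checked against a plain least-squares solve. The names X, est, se and betaOLS are only illustrative.

```matlab
% Sketch: same model, fitted from a numeric predictor matrix.
rng default                      % reproducible pretend data (same recipe as above)
N  = 100;
x1 = rand(N,1);
x2 = rand(N,1);
x3 = rand(N,1);
y  = 2*log(x1) + 3*x2.^2 + 5*x3 + 7 + 0.2*randn(N,1);

% Each column is one transformed predictor; fitlm adds the intercept itself.
X   = [log(x1), x2.^2, x3];
mdl = fitlm(X, y, 'VarNames', {'log_x1','x2_sqr','x3','y'});

% Estimates and standard errors, ordered (Intercept), log_x1, x2_sqr, x3
est = mdl.Coefficients.Estimate;
se  = mdl.Coefficients.SE;

% Cross-check against ordinary least squares with an explicit column of ones
betaOLS = [ones(N,1), X] \ y;
disp([est, betaOLS, se])
```

The model is linear in a, b, c and k, which is why the nonlinear transformations of x1 and x2 can simply be precomputed as columns.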
  5 comments
Walter Roberson on 8 Jun 2023
If you have a discontinuity in the first or second derivative of the model, then you surely do not have a linear regression situation.
You probably need to use ga(). Not fmincon() or similar -- those optimizers cannot handle discontinuities in derivatives either.
the cyclist on 8 Jun 2023
I suggest you search for the keywords "segmented regression matlab" and/or "piecewise regression matlab". Although I don't believe there are any built-in functions for this, you should find a few threads that are useful. I also think you should start a brand-new question for this after you have done that search. In that question, I would suggest posting your data, which makes it easier for people to try out code suggestions.
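As a rough sketch of what a segmented fit can look like (one possible approach, not a built-in function): for a continuous two-segment line, the model is linear in its coefficients once the breakpoint is fixed, so one option is to scan a grid of candidate breakpoints with fitlm and keep the one with the smallest SSE. The data, the grid, and the names candidates, hinge and xc are all made up for illustration.

```matlab
% Sketch: continuous two-segment ("broken-stick") fit with an unknown breakpoint.
rng(1)                                   % made-up data for illustration only
x = linspace(0, 10, 200)';
y = 1 + 0.5*x + 2*max(0, x - 6) + 0.3*randn(size(x));   % true breakpoint at 6

candidates = linspace(1, 9, 81);         % hypothetical search grid for the breakpoint
sse = zeros(size(candidates));
for k = 1:numel(candidates)
    hinge  = max(0, x - candidates(k));  % zero left of the breakpoint, linear right of it
    m      = fitlm([x, hinge], y);       % ordinary linear fit for this fixed breakpoint
    sse(k) = m.SSE;
end

[~, best] = min(sse);
xc  = candidates(best);                  % estimated breakpoint
mdl = fitlm([x, max(0, x - xc)], y, 'VarNames', {'x','hinge','y'});
```

The standard errors reported by the final fitlm call treat xc as known, so they do not include the uncertainty in the breakpoint itself.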



laurent jalabert on 19 Dec 2021
Edited: laurent jalabert on 19 Dec 2021
To fit a custom function, you can use a nonlinear regression model (fitnlm).
The example below fits a basic second-order resistance-versus-temperature model, R = R0*(1 + alpha*(T-T0) + beta*(T-T0)^2), where the fitted coefficients are b(1) = R0, b(2) = alpha, and b(3) = beta.
The advantage here is that the standard errors (SE) are computed directly for R0, alpha, and beta.
beta0 is the initial guess for [R0 alpha beta].
b(n) is retrieved with mdl.Coefficients.Estimate(n), for n = 1, 2, 3.
The standard errors of the coefficients are retrieved with mdl.Coefficients.SE(n).
(fitnlm is part of the Statistics and Machine Learning Toolbox.)
clear tbl mdl
% T_T0 (= T - T0) and R are your data, as column vectors of the same length
tbl = table(T_T0,R);
modelfun = @(b,x)b(1).*(1+b(2).*x(:,1)+b(3).*x(:,1).^2);
beta0 = [100 1e-3 1e-6];
mdl = fitnlm(tbl,modelfun,beta0,'CoefficientNames',{'R0';'alpha';'beta'})
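To try this without measured data, here is a minimal sketch with synthetic values. The reference temperature, the "true" coefficients and the noise level are all made up; only the fitnlm call itself follows the code above.

```matlab
% Sketch: synthetic R(T) data so the fitnlm call can be run as-is.
rng(0)
T0    = 20;                                       % assumed reference temperature
T     = (20:5:200)';
T_T0  = T - T0;
Rtrue = 100*(1 + 3.9e-3*T_T0 - 5.8e-7*T_T0.^2);   % made-up "true" values
R     = Rtrue + 0.05*randn(size(T));              % add measurement noise

tbl      = table(T_T0, R);                        % last variable is the response
modelfun = @(b,x) b(1).*(1 + b(2).*x(:,1) + b(3).*x(:,1).^2);
beta0    = [100 1e-3 1e-6];                       % initial guess for [R0 alpha beta]
mdl = fitnlm(tbl, modelfun, beta0, 'CoefficientNames', {'R0';'alpha';'beta'});

est = mdl.Coefficients.Estimate;                  % [R0; alpha; beta]
se  = mdl.Coefficients.SE;                        % standard errors, same order
ci  = coefCI(mdl);                                % 95% confidence intervals
```

Because the parameters differ by many orders of magnitude, a starting guess of roughly the right size for each one helps the solver converge.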
  1 comment
Erin Evans on 7 Jun 2023
Could you help me write this so that it includes a conditional statement? I essentially need two connected trend lines, with the statistics computed for both segments together.
My equation is:
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
The code I have so far is:
AgaDisc = readtable("File Path");
%% Bring in data columns
AgaDischargeArray = table2array(AgaDisc(:,"Discharge"));
AgaLoadArray = table2array(AgaDisc(:,"SedimentLoad"));
%% Normalize the discharge and sediment data
meanDisc = mean(AgaDischargeArray);
medianLoad = median(AgaLoadArray);
AgaDischargeArray = AgaDischargeArray/meanDisc;
AgaLoadArray = AgaLoadArray/medianLoad;
%% log10 sediment
logQs = log10(AgaDischargeArray);
%% log10 discharge
logQt = log10(AgaLoadArray);
%% Create new table of log-normalized data
tbl = table(logQs, logQt);
%% Add weighting factor
weights = zeros(length(logQt), 1);
weightFactor = [0.5, 1, 5, 10];
Q = quantile(logQt, 3);
for i = 1:length(logQt)
    if logQt(i) > Q(3)
        weights(i) = weightFactor(4);
    elseif logQt(i) > Q(2)
        weights(i) = weightFactor(3);
    elseif logQt(i) > Q(1)
        weights(i) = weightFactor(2);
    else
        weights(i) = weightFactor(1);
    end
end
m = fitlm(tbl, 'logQs ~ logQt', 'RobustOpts', 'on', 'weight', weights);



laurent jalabert on 8 Jun 2023
Please check your expression carefully, because you use both log10 and log (I assume log is the natural logarithm here).
The equation has log10(Qs), but your program sets logQs = log10(AgaDischargeArray); are these the same quantity?
d*log(Qr(i)/Qr(i-1)) is the same as d*diff(log(Qr)).
log10(Qs) = log10(a) + b*log(Qr) + c*log10(Max(1, Qr / Qc)) + d*log10(Qr(i) / Qr(i - 1))
logQs is x(:,1), the first column of tbl.
logQt is x(:,2), the second column of tbl.
a, b, c and d are the unknown coefficients.
Qc is not defined.
d*diff(log(Qr)) will cause a problem because its length is length(Qr) - 1.
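A possible way around these points, sketched with made-up data: if Qc is treated as known, every term of the equation can be precomputed as a table column and fitted with fitlm, and the diff term can be padded with a leading NaN so its length matches. The value Qc = 1.5, the use of log10 in every term, and the names hinge and dlogQr are assumptions for illustration, not taken from the question.

```matlab
% Sketch: build every term of the equation as a table column, then use fitlm.
rng(2)
n  = 300;
Qr = exp(0.8*randn(n,1));            % made-up discharge series (already normalized)
Qc = 1.5;                            % ASSUMED threshold; the thread never defines it

logQr  = log10(Qr);
hinge  = log10(max(1, Qr./Qc));      % zero whenever Qr <= Qc
dlogQr = [NaN; diff(logQr)];         % pad so the length matches; row 1 has no predecessor

% Made-up response following the equation, with log10 used throughout (assumption)
logQs = log10(2) + 1.2*logQr + 0.8*hinge + 0.3*dlogQr + 0.05*randn(n,1);

tbl = table(logQr, hinge, dlogQr, logQs);
mdl = fitlm(tbl, 'logQs ~ 1 + logQr + hinge + dlogQr');   % rows containing NaN are dropped
a   = 10^mdl.Coefficients.Estimate(1);                    % the intercept estimates log10(a)
```

With Qc fixed, both segments share one set of coefficients, so the reported statistics cover the whole curve. If Qc also has to be estimated, the model is no longer linear in its parameters.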
