Non-linear regression

Question

Collin Pecora el 5 de Mayo de 2021

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/822375-non-linear-regression

Comentada: the cyclist el 6 de Mayo de 2021

This is a question on non-linear curve fitting, not on model selection and not particularly on how to do regression in matlab.

I have 2 models for the same physical process, each a function of the vector x, and both have two coefficients, for m1, they are x0 and b and for m2 they are x0 and theta, where theta is notationally, b/x0. The models return slightly different values for x0 and theta from m2 does not general equal b/x0 from m1.

I am looking for advice on how to justify the use of one model over the other.

x = linspace(0,1,1e3); % For example
m1 = @(x) (1 - (x./x0)) .* exp(-b .* (x/x0));
m2 = @(x) (1 - (x./x0)) .* exp(-theta .* x);

Thanks

Collin

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

the cyclist el 5 de Mayo de 2021

1
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/822375-non-linear-regression#answer_692880

Abrir en MATLAB Online

It seems to me that your two models are equivalent, but one is specified in a way that masks that fact.

x0 is really just a scaling parameter -- you can think of it as the "units" of x. Define a new variable:

z = x/x0;

Then

mz1 = @(z) (1 - z) .* exp(-b .* z);

mz2 = @(z) (1 - z) .* exp(-theta .* x0 .* z);

and then define a new parameter

b2 = theta .* x0;

resulting in

mz1 = @(z) (1 - z) .* exp(-b .* z);
mz2 = @(z) (1 - z) .* exp(-b2 .* z);

and you see that your models are equivalent, 1-parameter models.

2 comentarios
Mostrar NingunoOcultar Ninguno

Collin Pecora el 5 de Mayo de 2021

Cyclist

Your insight is absolutely correct, both that the models are mathematically equivalent and for what the x0 parameter does.

Let’s say you are using a minimizer, like say, fmincon where the objective function is the sum of squared errors (SSE) between your measurements and either model1 or model2. Note that x0 is one of the key parameters we are after. Both models will result in nearly identical x0's with model1 usually terminating at slightly smaller SSEs, thus indicating that model1 has a small advantage over model2 using SSE as a scoring metric.

My question is why, what is it about wrapping the x0 into the theta parameter in model2 that causes the slight differences in results, and is there a way I can express it numerically or statistically?

the cyclist el 6 de Mayo de 2021

Abrir en MATLAB Online

Actually, I have to take back what I said about these boiling down to a 1-parameter model. Because you don't know x0 a priori, you can't do that step.

However, the two models are certainly equivalent.

% Set random number generator seed, for reproducibility

rng default

% Generative process for creating the data, with a little noise in y.

b = 2;

x0 = 3;

theta = 5;

m1 = @(x) (1 - (x./x0)) .* exp(-b .* (x/x0));

x = linspace(0,1,20)';

y = m1(x) + 0.02*randn(size(x));

% The two forms of the fitting function

f1 = @(B1,x) (1 - (x./B1(1))) .* exp(-B1(2) .* (x./B1(1)));

f2 = @(B2,x) (1 - (x./B2(1))) .* exp(-B2(2) .* x);

% The two fits

mdl1 = fitnlm(x,y,f1,[1 1])

mdl1 =

Nonlinear regression model: y ~ (1 - (x/B11))*exp( - B12*(x/B11)) Estimated Coefficients: Estimate SE tStat pValue ________ ______ _______ _______ B11 3.2994 2.388 1.3817 0.18399 B12 2.1569 2.4787 0.87016 0.39567 Number of observations: 20, Error degrees of freedom: 18 Root Mean Squared Error: 0.0298 R-Squared: 0.979, Adjusted R-Squared 0.978 F-statistic vs. zero model: 4.99e+03, p-value = 1.98e-25

mdl2 = fitnlm(x,y,f2,[1 1])

mdl2 =

Nonlinear regression model: y ~ (1 - (x/B21))*exp( - B22*x) Estimated Coefficients: Estimate SE tStat pValue ________ _______ ______ ________ B21 3.2994 2.3888 1.3812 0.18412 B22 0.65372 0.27883 2.3445 0.030724 Number of observations: 20, Error degrees of freedom: 18 Root Mean Squared Error: 0.0298 R-Squared: 0.979, Adjusted R-Squared 0.978 F-statistic vs. zero model: 4.99e+03, p-value = 1.98e-25

% The predicted y values from the fits

y1 = predict(mdl1,x);

y2 = predict(mdl2,x);

% Plot the data and fits

figure

hold on

scatter(x,y)

plot(x,y1,'-')

plot(x,y2,'--')

legend({'data','1st model','2nd model'})

% Compare coefficents

coef1 = mdl1.Coefficients.Estimate;

coef2 = mdl2.Coefficients.Estimate;

fprintf('\nb/x0 from first model: %10.7f\n',coef1(2)/coef1(1))

b/x0 from first model: 0.6537185

fprintf('theta from second model: %10.7f\n',coef2(2))

theta from second model: 0.6537189

You are seeing, though, that because of the numerical nature of the calculation, they do not return exactly the same result. But I don't think you can "justify" one over the other. They are truly equivalent, mathematically.

It's difficult to tell from inspection, but I think that the second method may be a bit more numerically stable, because with just one parameter in the exponential, you will likely not get as large deviations calculated. Perhaps that is a reason to choose it.

Iniciar sesión para comentar.

Non-linear regression

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

2 comentarios
Mostrar NingunoOcultar Ninguno

Más respuestas (0)

Ver también

Categorías

Etiquetas

Community Treasure Hunt

Non-linear regression

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

2 comentarios Mostrar NingunoOcultar Ninguno

Más respuestas (0)

Ver también

Categorías

Etiquetas

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

2 comentarios
Mostrar NingunoOcultar Ninguno