I want to compare fit() using option 'sin1' with the same equation defined using 'fittype'.

Question

Nick Granville el 20 de Feb. de 2023

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/1915620-i-want-to-compare-fit-using-option-sin1-with-the-same-equation-defined-using-fittype

Comentada: Matt J el 21 de Feb. de 2023

I have used fit() with the option 'sin1' to fit a noisy set of data that very approximately look like a sine wave. To understand how 'fittype' works, I would like to fit the curve 'a*sin(b*x+c)' using 'fittype'. This should give the same answer. It does if the StartPoint is the answer; it does not if I omit StartPoint or put in some random values.

How do I make 'fittype' robust to a wide range of (noisy) data and different starting points. The standard option using fit() and 'sin1' does this.

I attach some simple sample code.

% I am trying to use the fit function to fit sine and/or cosine curves to a
% cyclic dataset that is very noisy.  I am most interested in the frequency
% but the noise often prevents the FFT function from giving the frequency
% that I am interested in.
% This code uses the fit function with the predefined curve 'sin1' to
% calculate the best fit.  This works very well.  I then try and do the
% same thing using my own curve, 'a*sin(b*x+c)', i.e. the same curve, but I
% fail.  I only succeed when the StartPoint is very close to the correct
% solution.
% I feel that I must be doing something wrong, but I cannot work out what
% it is.
xdata = (-19.5:19.5)';                                % I think that any evenly spaced data will do
rdata = [-3.8699, -11.9740, -1.0781, 9.4637, ...                    % Real data from my application
    18.2971, 3.7348, -11.5774, 0.3397, 8.4653, 7.7785, -1.4079, -13.4483, -7.6968, ...
    5.1592, 6.4323, -8.2318, -14.8538, 0.8788, 10.9452, 5.6580, -9.0037, -10.0607, ...
    -0.5338, 8.2230, 3.8553, 1.3426, -1.0028, -1.8683, 6.7254, 10.0907, 3.2068, ...
    -7.2804, -4.3917, 6.5188, 5.7636, -3.0115, -13.8272, -12.7043, 5.0029, 9.9405]';
p = polyfit(xdata, rdata, 3);                        % Subtracting a cubic curve to level the ydata
py = polyval(p, xdata);
ydata = rdata - py;
[f, gof, output] = fit(xdata, ydata, 'sin1');             % Using the standard MATLAB curve, 'sin1'
fprintf('    ''sin1'':   f.a: %-6.2f   f.b: %-6.2f   f.c: %-6.2f     gof.sse: %-6.1f\n', ...
    f.a1, f.b1, f.c1, (gof.sse/numel(xdata)));
xdata10 = xdata(1):0.1:xdata(end);                   % Defining extra xdata to give a smoother line
fy = feval(f, xdata10);
mysin = fittype('a*sin(b*x+c)', ...                               % My attempt to do the same thing
    'dependent', {'y'}, 'independent', {'x'}, 'coefficients', {'a','b','c'});
[f2, gof2, output2] = fit(xdata, ydata, mysin, 'StartPoint', [1.0, 1.0, 1.0]);
fprintf('     mysin:   f.a: %-6.2f   f.b: %-6.2f   f.c: %-6.2f     gof.sse: %-6.1f\n', ...
    f2.a, f2.b, f2.c, (gof2.sse/numel(xdata)));
fy2 = feval(f2, xdata10);
plot(xdata, ydata, 'xb:')
hold on
plot(xdata10, fy, ' m--')
plot(xdata10, fy2, ' r-')
legend('Real data', '''sin1'' fit', 'My fit');
hold off

2 comentarios
Mostrar NingunoOcultar Ninguno

dpb el 20 de Feb. de 2023

Editada: dpb el 20 de Feb. de 2023

Abrir en MATLAB Online

xdata = (-19.5:19.5)'; % I think that any evenly spaced data will do

rdata = [-3.8699, -11.9740, -1.0781, 9.4637, ... % Real data from my application

18.2971, 3.7348, -11.5774, 0.3397, 8.4653, 7.7785, -1.4079, -13.4483, -7.6968, ...

5.1592, 6.4323, -8.2318, -14.8538, 0.8788, 10.9452, 5.6580, -9.0037, -10.0607, ...

-0.5338, 8.2230, 3.8553, 1.3426, -1.0028, -1.8683, 6.7254, 10.0907, 3.2068, ...

-7.2804, -4.3917, 6.5188, 5.7636, -3.0115, -13.8272, -12.7043, 5.0029, 9.9405]';

p = polyfit(xdata, rdata, 3); % Subtracting a cubic curve to level the ydata

py = polyval(p, xdata);

ydata = rdata - py;

plot(xdata,rdata)

hold on

plot(xdata,py)

p

p = 1×4

1.0e-05 * 0.0003 -0.0026 -0.0866 -0.1570

The data are symmetric; the polynomial does nothing useful, at least for the sample dataset.

However, just as a minor simplification, MATLAB supplies the equivalent with detrend so one can write the above as

ydata=detrend(rdata,3);

if there are datasets for which it might be helpful.

As for the Q? posed, as the doc notes, sinN models are very sensitive to starting points and the MATLAB builtin solver uses an (AFAICT undocumented) optimization technique internally to get an initial starting point estimate and bounds the coefficients >0 whereas the custom model uses simply a randomized starting point on the [0,1] interval and no bounds on the coefficients.

To reproduce that result, you would also have to use a fit options object and starting points at least reasonably close to those the MATLAB heuristic model uses (which, of course, is undocumented/proprietary).

So, the Q? in return is, if the builtin model produces the desired results, why not use it instead? Other than simply knowing what the builtin algorithm is, which may be of interest and perhaps even needed (to publish?).

I couldn't in a quick look, find anything more about the algorithm TMW uses...whether there's any other information publicly available or not, I don't know.

Nick Granville el 21 de Feb. de 2023

My aim is to understand what 'fit' and 'fittype' do, rather than tell the world what MATLAB do. Ultimately, I may want to fit 'a*(sin(b*x+c))^0.5'. I wanted to ensure that I got the best result, rather than a nonsense result.

As an aside, I accept that the equation should properly be written: 'a*sign(sin(bx+c))*(abs(sin(b*x+c)))^0.5'. Choosing the best starting points may become even more important.

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Matt J el 20 de Feb. de 2023

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/1915620-i-want-to-compare-fit-using-option-sin1-with-the-same-equation-defined-using-fittype#answer_1175620

Editada: Matt J el 20 de Feb. de 2023

Abrir en MATLAB Online

How do I make 'fittype' robust to a wide range of (noisy) data and different starting points. The standard option using fit() and 'sin1' does this.

When you specify 'sin1', the solver knows what kind of fit it is looking for and can choose an appropriate strategy for forming the initial StatPoint. Conversely, when you specify the fittype as a custom model, the burden falls upon you to do that.

Choosing a StartPoint randomly is not a shrewd strategy. You know for example, that

a=(max(ydata)-min(ydata))/2

will be a better choice for a than an arbitrary selection. You can also then evaluate curve at particular x,y points to solve approximately for the other parameters. For example, since Y(0)=a*sin(c), you could do,

Y0=interp1(xdata,ydata,0);
c=asin(Y0/a);

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Matt J el 20 de Feb. de 2023

Editada: Matt J el 20 de Feb. de 2023

Abrir en MATLAB Online

Here is one initialization strategy you could consider:

xdata = (-19.5:19.5)'; % I think that any evenly spaced data will do

rdata = [-3.8699, -11.9740, -1.0781, 9.4637, ... % Real data from my application

18.2971, 3.7348, -11.5774, 0.3397, 8.4653, 7.7785, -1.4079, -13.4483, -7.6968, ...

5.1592, 6.4323, -8.2318, -14.8538, 0.8788, 10.9452, 5.6580, -9.0037, -10.0607, ...

-0.5338, 8.2230, 3.8553, 1.3426, -1.0028, -1.8683, 6.7254, 10.0907, 3.2068, ...

-7.2804, -4.3917, 6.5188, 5.7636, -3.0115, -13.8272, -12.7043, 5.0029, 9.9405]';

p = polyfit(xdata, rdata, 3); % Subtracting a cubic curve to level the ydata

py = polyval(p, xdata);

ydata = rdata - py;

%%Derive initial start values

x=linspace(min(xdata),max(xdata),1000);

y=interp1(xdata,ydata,x,'cubic');

T=x(end)-x(1);

a=2*trapz(x,abs(y))/T;

crossings=x(diff(sign(y))~=0);

p=polyfit(1:numel(crossings), crossings,1);

b=pi/p(1);

c=rem(-b*p(2),pi);

StartPoint=[a,b,c]

StartPoint = 1×3

13.5984 1.2453 2.7640

[f, gof, output] = fit(xdata, ydata, 'sin1'); % Using the standard MATLAB curve, 'sin1'

fprintf(' ''sin1'': f.a: %-6.2f f.b: %-6.2f f.c: %-6.2f gof.sse: %-6.1f\n', ...

f.a1, f.b1, f.c1, (gof.sse/numel(xdata)));

'sin1': f.a: 10.11 f.b: 1.25 f.c: 2.75 gof.sse: 15.5

xdata10 = xdata(1):0.1:xdata(end); % Defining extra xdata to give a smoother line

fy = feval(f, xdata10);

mysin = fittype('a*sin(b*x+c)', ... % My attempt to do the same thing

'dependent', {'y'}, 'independent', {'x'}, 'coefficients', {'a','b','c'});

[f2, gof2, output2] = fit(xdata, ydata, mysin, 'StartPoint', StartPoint);

fprintf(' mysin: f.a: %-6.2f f.b: %-6.2f f.c: %-6.2f gof.sse: %-6.1f\n', ...

f2.a, f2.b, f2.c, (gof2.sse/numel(xdata)));

mysin: f.a: 10.11 f.b: 1.25 f.c: 2.75 gof.sse: 15.5

fy2 = feval(f2, xdata10);

plot(xdata, ydata, 'xb:')

hold on

plot(xdata10, fy, ' m--')

plot(xdata10, fy2, ' r-')

legend('Real data', '''sin1'' fit', 'My fit');

hold off

Nick Granville el 21 de Feb. de 2023

Abrir en MATLAB Online

Thank you both for your comments. You confirmed that fitting 'sin1' or 'a*sin(b*x+c)' should behave similarly. The skill is in choosing the starting points. Therefore:

temparray = ydata(1 + find(diff(sign(round(diff(ydata)))) ~= 0));
sp1 = mean(abs(temparray));
sp2 = numel(temparray) * pi() / (numel(ydata)-1);
sp3 = pi()/2;

temparray contains the values of all the turning points (without using the Signal Processing Toolbox). sp1 (i.e. 'a') is the mean absolute value of the turning points and so a good indicator of amplitude. sp2 (or 'b') is related to the number of turning points and so a good indicator of frequency. I am still working on getting a robust value of sp3 (or 'c') that is not overly dependent on the first value of my data.

Matt J el 21 de Feb. de 2023

I am still working on getting a robust value of sp3 (or 'c') that is not overly dependent on the first value of my data.

The method in my comment above does so.

Iniciar sesión para comentar.

I want to compare fit() using option 'sin1' with the same equation defined using 'fittype'.

2 comentarios
Mostrar NingunoOcultar Ninguno

Respuesta aceptada

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

I want to compare fit() using option 'sin1' with the same equation defined using 'fittype'.

2 comentarios Mostrar NingunoOcultar Ninguno

Respuesta aceptada

6 comentarios Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

2 comentarios
Mostrar NingunoOcultar Ninguno

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos