# polyfit curve turns around near last point

4 views (last 30 days)
Haoyu Li on 19 Mar 2020
Commented: Walter Roberson on 19 Mar 2020
Hi,
I'm new to Matlab.
I have a set of points and I'm trying to use polyfit to get a cubic polynomial. But the generated curve turns around near the last point.
How can I generate a correct polynomial and plot it from the (0,0) point to the last one? Why the black plot doesn't start from (0,0)? x = [0 0 -0.0063 -0.0205 -0.0445 -0.0794 -0.1256 -0.1813 -0.2411 -0.2944 -0.3242 -0.3059 -0.2086 0.0037 0.3699 0.9294 1.7201 2.7772 4.1327 5.8152]
y = [0 -6.9533 -13.9100 -20.8699 -27.8323 -34.7959 -41.7590 -48.7191 -55.6729 -62.6169 -69.5473 -76.4605 -83.3538 -90.2260 -97.0777 -103.9128 -110.7391 -117.5695 -124.4224 -131.3218]
p = polyfit(x, y, 3);
d = polyval(p, x, 1);
[px,py] = ploynom(p(4),p(3),p(2),p(1));
hold on
plot(x,y,'*'); % Input points
plot(px(1:20),py(1:20),'-r'); % Plot polyfit result
plot(x, d, '-k');
daspect([1 1 1])
axis([-100 100 -100 100]);
hold off
function [x,y] = ploynom(a0,a1,a2,a3)
x = zeros(1,20);
y = zeros(1,20);
index = 1;
for i = 1:1:20
j = a0 + i*a1 + i*i*a2 + i*i*i*a3;
index = index + 1;
x(index) = i;
y(index) = j;
end
end

Matt J on 19 Mar 2020
Your data, when plotted alone, do not look very polynomial like. They do not even look like a function.
x = [0 0 -0.0063 -0.0205 -0.0445 -0.0794 -0.1256 -0.1813 -0.2411 -0.2944 -0.3242 -0.3059 -0.2086 0.0037 0.3699 0.9294 1.7201 2.7772 4.1327 5.8152]
y = [0 -6.9533 -13.9100 -20.8699 -27.8323 -34.7959 -41.7590 -48.7191 -55.6729 -62.6169 -69.5473 -76.4605 -83.3538 -90.2260 -97.0777 -103.9128 -110.7391 -117.5695 -124.4224 -131.3218]
plot(x,y,'*'); Haoyu Li on 19 Mar 2020
Hi Matt,
So if the x is negative and positive mixed, it's impossible to generate a polynomial?
Matt J on 19 Mar 2020
The problem is you have multiple y for a given x. A polynomial (or any function) has only one y for a given x.
Matt J on 19 Mar 2020
I vaguely wonder if you have the x and y data interchanged. When I (re-)interchange them, I get a much more reasonable looking data trend and polynomial fit, though a 4th order polynomial fits the data a fair bit better than a cubic:
fit3=@(z) polyval(polyfit(y, x, 3), z);
fit4=@(z) polyval(polyfit(y, x, 4), z);
plot(y,x,'*');
hold on;
fplot(fit3,[min(y),max(y)])
fplot(fit4,[min(y),max(y)],'k')
hold off
legend('Data','3rd Order Fit', '4th Order Fit') Walter Roberson on 19 Mar 2020
There is no particular reason that the best cubic fit to a set of points would necessarily go through any one of the points. Or any of the points, for that matter.
The coefficients found by the initial polyfit are correct. If you use calculus to minimize a sum-of-squared-errors then the exact coefficients for a*x^3 + b*x^2 + c*x + d are:
a = 277877287064895803678497781245508451287701450302577725603197360165382121190622565108694426309764317184/967676561079094216428837973317001224011506358024577130586302565149971777681663432622658075733194121375
b = 1852916415006531640389090338694927374455104919960268282259530425480427914727610178876949605838088896512/1612794268465157027381396622195002040019177263374295217643837608583286296136105721037763459555323535625
c = -124339822977430694539770894658562849104244122428885410270576725781909018742659972860509669593383488734469923983117/4255886523331027075957355526839747479471503089061504962660035402629159035356904593401257926517104612275281461248
d = -145467529832635009164572892487692503106182319419912735588393592302278037646945296939262160591083595382042060137663564873/2723767374931857328612707537177438386861761976999363176102422657682661782628418939776805072970946951856180135198720000
assuming that each of the x and y values are converted to their nearest rational representation.
But the generated curve turns around near the last point.
Shrug. When you fit a set of real points to a cubic, then there will be at least one real-valued minima or maxima somewhere, but that "somewhere" is not necessarily going to be anywhere particular in the range that you interpolate over.
Your last point is not very close to your starting point, and the nature of polynomials is that as you get towards the extreme edges of x, the value of the polynomial changes quickly. The extreme point can end up shaping the interpolated a polynomial a lot. If you do a fit over all points except the last one, you will get a quite different shape out.
If you plot(x,y) you will see that in the original x order, you do not have a polynomial: your x start at 0 and go negative and then positive. You get more of a polynomial if you display x in terms of y.

Haoyu Li on 19 Mar 2020
Hi Walter,
Thank you for your answer. I'm not very clear about how did you get the a b c d. On my side, they are 0.2872 1.1489 -29.2160 -53.4067.
If do a fit except the last point, there's still a turn-around at the new last point. I guess this is because the x change between negative/positive?
I do want to get a polynomial from these points for next step calculation. Is there any possible or alternative method? How about generating several polynomial pieces from part of the points with positive (or negative) x?
Walter Roberson on 19 Mar 2020
I'm not very clear about how did you get the a b c d
syms a b c d
%construct a residue function
residue = sum(((a*x.^3 + b*x.^2 + c*x.^1 + d*x.^0) - y).^2);
%optimize it by solving for the derivatives being 0
sola = solve(diff(residue,a),a);
residue2 = subs(residue,a,sola);
solb = solve(diff(residue2,b),b);
residue3 = subs(residue2,b,solb);
solc = solve(diff(residue3,c),c);
residue4 = subs(residue3,c,solc);
sold = solve(diff(residue4,d),d);
%back-substitute
Dsol = sold;
Csol = subs(solc,d,Dsol);
Bsol = subs(subs(solb,c,Csol),d,Dsol);
Asol = subs(subs(subs(sola,a,sola),b,Bsol),c,Csol),d,Dsol);
abcd = [Asol; Bsol; Csol; Dsol] %these are the EXACT optimal values for a, b, c, d
double(abcd) %to see the floating point approximations
You will see that these values have the same floating point approximation as the results of the polyfit() that you assigned to p. This proves that polyfit() did in fact find the right answer, according to calculus. Those coefficients really do give the optimal fit of a polynomial of degree 3 to that data, under the circumstance that you define "optimal fit" according to the sum-of-squared-errors (SSE), also known as a least-squared fit. Every other set of coefficients must have a higher SSE.
Why those values and not some other values with a different implied shape? Because that's what happens to fit best mathematically.
Take into account that when you do a polyfit() that the order of the data points does not matter: polyfit does not know that [x(K), y(K)] is supposed to be "connected" to [x(K+1), y(K+1)].