
How to make multiple lines of best fit into one scatter graph
I have a large amount of data. I want to automatically split it into subsets and fit a separate line of best fit (trendline) to each subset, so that the lines can intersect and I can get the intersection points. I also need to calculate the slope of each line automatically. This is the code I have so far.
The picture shows the result I want (which I drew manually, just to illustrate) and the result I was able to get from MATLAB. The code is below so you can see what I have so far. Is it even possible to do this using MATLAB?
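clc
clear
x = [5705 5690 5671 5667 5604 5585 5555 5542 5502 5501 5495];
y = [12644 12612 12570 12560 12420 12361 12278 12240 12098 12078 12005];
p = polyfit(x,y,1);          % p(1) is the slope, p(2) the intercept of one line through all points
px = [min(x) max(x)];
py = polyval(p,px);          % fitted y values at the two ends of the data
scatter(x,y,"filled")
set(gca,'YDir','reverse')    % y axis increases downward
lsline                       % overlay the single least-squares line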
Answers (2)
  John D'Errico on 26 Feb 2025
      Edited: John D'Errico on 26 Feb 2025
      Is it possible to do in MATLAB? Of course it is. How well it works depends on the skill of the person writing the code, and on the signal-to-noise ratio. High-noise data will be problematic for any code, and any heuristic you devise will fail on SOME set of carefully chosen data. It also depends on your requirements for doing this automatically. Should the code know automatically how many lines to fit? If you say yes to that, then I can easily devise a set of data that will cause unsolvable problems. Sometimes the choice will be simple, but not always.
And whether this can be done in MATLAB is a silly question. (Sorry, but it is.) MATLAB is just a programming language. It is the skill of the programmer that matters, NOT the language. How robust the algorithm is depends on the programmer, their understanding of the problem, and their knowledge of statistics, numerical methods, etc. And of course, it depends on how well the programmer understands the data they will be seeing. How much noise should they expect?
x=[5705 5690 5671 5667 5604 5585 5555 5542 5502 5501 5495];
y=[12644 12612 12570 12560 12420 12361 12278 12240 12098 12078 12005];
plot(x,y,'o')
When I look at that plot, I might see one line that can be fit. With a little more care, I might decide the bottom three points seem to follow a different slope from the rest. But is there a third segment up high? Perhaps. How noisy is the data? I don't know. Only you know that.
Anyway, what might I do?
Just looking at the plot of the data, we might make the decision from that picture that the first three points belong to one group. Then the next 4 points seem to form a cluster with a common slope. Finally, the last three points might have a slightly lower slope, and they MIGHT fall on a different line. 
One simple trick is to simply compute the slope of the line segments between each consecutive pair of points. If we do so, we see this:
[xsort,tags] = sort(x);
ysort = y(tags);
segslopes = diff(ysort)./diff(xsort) % slope between each consecutive pair of points
xmid = conv(xsort,[1/2,1/2],'valid') % midpoints of each segment in x
plot(xmid,segslopes,'o')
However, if I look at this plot and compare the variability of the slopes in blocks 2 and 3 with the variability of the slopes in that first "block", I might decide that we actually have 4 lines to consider, NOT 3. Which is it?
Any heuristic you write will suffer from exactly these issues. How many lines are there to be found?
The common tool to estimate such a model is called a broken stick regression. But even there, we can find issues. For example, I used a tool of my own creation to fit that data. You can find it on the File Exchange, as my SLM toolbox. Here is what it did though:

Even though I told it to break the curve into three segments, then decide where the breaks should go based on your data, do you see it made a decision that is not the same as what I first guessed? The problem is, those first three points just don't fall on a straight line very well.
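For readers without the SLM toolbox, here is a rough sketch of the broken stick idea using only base MATLAB. It is NOT what SLM does internally. It assumes exactly three segments, forces the fit to be continuous, and only tries breakpoints at the midpoints between consecutive x values (a coarse grid I chose for illustration). The variable names (cand, best, slopes, breaks) are mine.

```matlab
% Sketch only: continuous three-segment ("broken stick") fit by brute force.
x = [5705 5690 5671 5667 5604 5585 5555 5542 5502 5501 5495];
y = [12644 12612 12570 12560 12420 12361 12278 12240 12098 12078 12005];

[xs, idx] = sort(x(:));          % work with x ascending, as column vectors
ys = y(idx); ys = ys(:);
xc = xs - mean(xs);              % centre x so the least-squares problem is well scaled

% Candidate breakpoints: midpoints between neighbours, keeping at least
% two points to the left of the first break and to the right of the last.
cand = (xc(2:end-2) + xc(3:end-1))/2;

best.sse = inf;
for i = 1:numel(cand)-1
    for j = i+1:numel(cand)
        k = [cand(i) cand(j)];
        % Hinge basis: the slope changes by b(3) at k(1) and by b(4) at k(2)
        A = [ones(size(xc)), xc, max(0, xc-k(1)), max(0, xc-k(2))];
        b = A\ys;                            % linear least squares
        sse = sum((ys - A*b).^2);
        if sse < best.sse
            best.sse = sse; best.knots = k; best.coef = b;
        end
    end
end

slopes = cumsum(best.coef(2:4))';    % slope of each segment, in order of increasing x
breaks = best.knots + mean(xs);      % breakpoints back in the original x units
kk = best.knots(:);
ybreaks = [ones(2,1), kk, max(0, kk-kk(1)), max(0, kk-kk(2))]*best.coef;
```

Because the segments are forced to join, the points (breaks, ybreaks) are where neighbouring lines intersect. With only 11 points and this many free parameters, expect the chosen breaks to be sensitive to noise, for the reasons given above.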
  Image Analyst on 27 Feb 2025
        See my attached demo. It does a piecewise linear fit over two sections, finding the best splitting point. You can adapt it to work with 3 or more sections if you want.
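The attachment is not reproduced here, so the following is only a minimal sketch of the same general idea, not the attached demo itself. It assumes two sections with at least three points each (minPts is a value picked for illustration), tries every split of the sorted data, fits an independent line to each side with polyfit, and keeps the split with the smallest total squared error. It then reports the two slopes and the point where the two lines cross.

```matlab
% Sketch only: two-section piecewise linear fit with the best split point.
x = [5705 5690 5671 5667 5604 5585 5555 5542 5502 5501 5495];
y = [12644 12612 12570 12560 12420 12361 12278 12240 12098 12078 12005];

[xs, idx] = sort(x);             % sort by x so each section is contiguous
ys = y(idx);
xshift = mean(xs);
xc = xs - xshift;                % centre x to keep polyfit well conditioned
n = numel(xs);
minPts = 3;                      % assumed minimum number of points per line

bestSSE = inf;
for k = minPts:(n - minPts)      % points 1..k go to line 1, k+1..n to line 2
    p1 = polyfit(xc(1:k),   ys(1:k),   1);
    p2 = polyfit(xc(k+1:n), ys(k+1:n), 1);
    sse = sum((ys(1:k)   - polyval(p1, xc(1:k))).^2) + ...
          sum((ys(k+1:n) - polyval(p2, xc(k+1:n))).^2);
    if sse < bestSSE
        bestSSE = sse; bestK = k; best1 = p1; best2 = p2;
    end
end

slopes = [best1(1) best2(1)];    % slope of each line (unchanged by centring x)

% Intersection: solve best1(1)*x + best1(2) = best2(1)*x + best2(2)
xi = (best2(2) - best1(2))/(best1(1) - best2(1)) + xshift;  % back in original x units
yi = polyval(best1, xi - xshift);
```

The two lines are fitted independently, so nothing forces the intersection (xi, yi) to fall between the two groups of points, and it is undefined if the slopes happen to be equal. To go to three or more sections, the same search can be nested over additional split indices.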
