How to remove rows in a nested for loop?
I have a matrix Detected_center and a matrix Original_center, each with one x and y coordinate per row. I want to compare each row of Detected_center with all rows of Original_center and calculate the distances. If the minimum distance is within a threshold, the Detected_center row is paired with that Original_center row, and I want the paired row removed from Original_center to save computation time.
Can anyone help me do this in a safe way? The approach below is not safe, because the length of the nested for loop should change.
Threshold = 0.2;
for i = 1:size(Detected_center,1)       % one iteration per detected point (row)
    for j = 1:size(Original_center,1)   % size(...,1) counts rows; length() can return the column count
        Distance(j,1) = ...
    end
    [Min_Distance, position] = min(Distance);
    if Min_Distance < Threshold
        TP = TP + 1;
        Original_center(position,:) = []; % Remove row from matrix, as it is coupled
    else
        FP = FP + 1;
    end
end

The blue center is from the matrix Detected_center, and the red centers are from the matrix Original_center. The distances from the blue center to all red centers are calculated with the Pythagorean theorem. If the red center closest to the blue center lies within the threshold, it should be removed from the matrix Original_center.
Geoff Hayes
on 14 Jan 2022
@S. - please clarify what you mean by "the nested for loop should change in length". How will this matter if a row is removed from Original_center?
Physically removing the row from the array during the loop will make the runtime go up dramatically if there are many of these: each removal causes a reallocation of the full array and involves a copy.
Just mark the rows for deletion and wait until the end to remove them all in one swell foop. This can be done without the explicit loop via pdist2, using the optional name-value parameter 'Smallest',N and the optional second return value, which holds the indices of the computed distances and can be used in the subsequent test of whether they are within the threshold.
While pdist2 does not implement a numeric threshold option directly, I presume you know roughly how many points could qualify for your input data, so you can set the number to return accordingly. OTOH, you could just return the whole array and do the test afterwards; it probably isn't any slower.
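The "mark now, delete once" idea above can be sketched with a logical mask. This is only a minimal illustration, not the commenter's code: the example coordinates are hypothetical, the mask name `matched` is made up for this sketch, and the distance line assumes implicit expansion (MATLAB R2016b or later) instead of pdist2.

```matlab
% Hypothetical example data, for illustration only
Threshold = 0.2;
Detected_center = [0.10 0.10; 0.90 0.90; 0.50 0.50];
Original_center = [0.00 0.00; 1.00 1.00; 3.00 3.00];

matched = false(size(Original_center,1),1);  % rows marked for deletion
TP = 0;
FP = 0;

for i = 1:size(Detected_center,1)
    % Euclidean distance from detected point i to every original point
    d = sqrt(sum((Original_center - Detected_center(i,:)).^2, 2));
    d(matched) = Inf;                  % already-paired rows cannot be reused
    [Min_Distance, position] = min(d);
    if Min_Distance < Threshold
        TP = TP + 1;
        matched(position) = true;      % mark only; the array is not resized here
    else
        FP = FP + 1;
    end
end

Original_center(matched,:) = [];       % single deletion after the loop
```

The array keeps its size during the loop, so no indices shift, and setting marked distances to Inf gives the same pairing behaviour as deleting the row immediately.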
S.
on 17 Jan 2022
If the distance between the red center which is the closest to the blue center falls within the threshold, it should be removed from the matrix Original_center.
If so, the order in which you iterate over the Detected_center points will affect the result. Suppose your data is,
Threshold=5;
Detected_center=[4 0;
8 0];
Original_center=[0 0;
6 0;
12 0];
No matter which Detected_center point you start the iterations with, the first point to be eliminated is [6,0]. However, if [4,0] is the last point to be processed then [0,0] will be the final point eliminated and the final value of Original_center will be [12 0], whereas if [8,0] is the last to be processed, then the final eliminated point will be [12 0] and the final value for Original_center will be [0,0].
Did you intend for the process to be sensitive to order in this way? What is the correct outcome in this example?
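The order dependence described above can be checked with a short script. This is a sketch that reuses the example data and the greedy removal from the question; the variable names `orders`, `survivors`, and `Original_start` are introduced here for illustration, and the distance line assumes implicit expansion (R2016b or later).

```matlab
Threshold = 5;
Detected_center = [4 0; 8 0];
Original_start  = [0 0; 6 0; 12 0];

orders = [1 2; 2 1];        % the two possible processing orders
survivors = cell(2,1);      % remaining Original_center rows per order

for k = 1:2
    Original_center = Original_start;
    for i = orders(k,:)
        % Euclidean distance from detected point i to the remaining originals
        d = sqrt(sum((Original_center - Detected_center(i,:)).^2, 2));
        [Min_Distance, position] = min(d);
        if Min_Distance < Threshold
            Original_center(position,:) = [];   % greedy removal, as in the question
        end
    end
    survivors{k} = Original_center;
end

disp(survivors{1})   % [4 0] processed first, [8 0] last: leaves [0 0]
disp(survivors{2})   % [8 0] processed first, [4 0] last: leaves [12 0]
```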
Matt J
on 18 Jan 2022
This is because the coordinates in Detected_center are given to exactly four decimal places.
Not sure why that matters. My example data had infinite precision (they were integers). But if you think the issue can be ignored, see my answer below.
S.
on 18 Jan 2022
Accepted Answer
More Answers (2)
discard=ismembertol(Original_center,Detected_center,Threshold,'ByRows',1,'DataScale',1);
Original_center(discard,:)=[];
dpb
on 14 Jan 2022
Neat idea if OP can recast the distance into a difference in the coordinates...
What exactly do you mean by OP?
The OP is you. Notice the upper right corner of your posts.
And indeed, where is Min_Distance taken into account for comparison with the threshold?
Inside ismembertol().
Detected_center are by the way x by 2 matrices, with x a number
That was clear, but what is the typical size of x?
I indeed need the Euclidean distance.
If you need the Euclidean distance, ismembertol won't work, but you should consider carefully whether the Euclidean distance is essential, since that requires slower alternatives.
Matt J
on 18 Jan 2022
Well, I need to select the point of the matrix Original_center with the shortest distance to the point of the matrix Detected_center. See image in my original question to clarify this. So it's quite essential.
That doesn't explain why it needs to be the shortest Euclidean distance and not the shortest L-inf distance.
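The difference between the two distance measures can be seen with one pair of points. The sketch below uses hypothetical coordinates chosen so that the pair is within the threshold under the L-inf (per-coordinate) measure but not under the Euclidean one; it relies on the documented behaviour of ismembertol with 'ByRows' and 'DataScale',1, which compares each column against the tolerance separately.

```matlab
Threshold = 0.2;
p = [0 0];          % hypothetical point from Original_center
q = [0.15 0.15];    % hypothetical point from Detected_center

d_euclid = sqrt(sum((p - q).^2));   % about 0.2121
d_linf   = max(abs(p - q));         % 0.15, the largest per-coordinate difference

within_euclid = d_euclid < Threshold;    % false
within_linf   = d_linf  <= Threshold;    % true

% ismembertol with 'ByRows' tests every column against the tolerance,
% which corresponds to the L-inf criterion rather than the Euclidean one
same_by_tol = ismembertol(p, q, Threshold, 'ByRows', true, 'DataScale', 1);
```

Points near the diagonal of the threshold square are where the two criteria disagree, so whether the substitution is acceptable depends on how the data is distributed.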
S.
on 18 Jan 2022
Matt J
on 17 Jan 2022
For Euclidean distance,
discard=any( pdist2(Original_center,Detected_center)<Threshold ,2);
Original_center(discard,:)=[];
however, I want only that the smallest distance that meets the threshold is true.
Does it matter if there are ties? If not, this is a more vectorized alternative, but note my comment above.
%Example data
Threshold=5;
Detected_center=[4 0;
8 0];
Original_center=[0 0;
6 0;
12 0];
D = pdist2(Original_center,Detected_center);
[minval,loc]=min(D,[],1);
discard=loc(minval<Threshold);
Original_center(discard,:)=[];
Original_center
S.
on 18 Jan 2022
No, you don't need loops or if statements:
[Min_Distance, loc] = pdist2(Original_center,Detected_center,'euclidean','smallest',1)
pos=Min_Distance<Threshold;
TP=sum(pos);
FP=numel(pos)-TP;
Original_center(loc(pos),:)=[];
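One behaviour worth checking with this kind of vectorized matching: every detected point is paired with its nearest original point independently, so two detections can claim the same original point. The sketch below uses hypothetical data to show that case. It builds the distance matrix with implicit expansion (R2016b or later) as a stand-in for pdist2, and the name `n_unique` is introduced here for illustration.

```matlab
% Hypothetical data: both detected points are closest to [6 0]
Threshold = 5;
Detected_center = [5 0; 7 0];
Original_center = [0 0; 6 0; 12 0];

% Distance matrix, rows = original points, columns = detected points
% (same layout as pdist2(Original_center, Detected_center))
D = sqrt((Original_center(:,1) - Detected_center(:,1).').^2 + ...
         (Original_center(:,2) - Detected_center(:,2).').^2);

[Min_Distance, loc] = min(D, [], 1);   % nearest original for each detection
pos = Min_Distance < Threshold;

TP = sum(pos);                         % 2: both detections count as matches
n_unique = numel(unique(loc(pos)));    % 1: but they share one original point

Original_center(loc(pos),:) = [];      % repeated indices remove the row once
```

With the loop from the question, the second detection would instead be compared against the remaining points only, so the TP and FP counts can differ between the two approaches when such shared matches occur.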
Well, in this case you calculate all distances, which is not necessary and therefore computationally inefficient.
I don't know which version of my solution you're looking at. I return the shortest distance, just as you did. Regardless, I don't think using the 'smallest' flag makes pdist2 compute fewer distances. Output memory allocation is certainly more efficient when the flag is used, but pdist2 still needs to compute all the distances in order to know which one is the minimum.
tic
X=rand(1e6,2); Y=rand(1,2);
toc
tic;
D=pdist2(X,Y);
toc
tic
[D,I]=pdist2(X,Y,'euc','smallest',1);
toc
You can see that the difference in compute time above is basically the amount of time required to allocate memory for the result.
Moreover Original_center(loc(pos),:)=[]; is then not necessary. In the for loop I use, the one centerpoint that is coupled is removed from the Original_center matrix, so that the distance doesn't need to be calculated again.
It is much more efficient to remove the undesired points in one step, as I did, than to do it one at a time in a loop.
