Extracting some points and finding some nearest elements.
I have data that I clustered with the dbscan method. Now I need to pick 5 different elements from each cluster, find the 5 nearest elements of each picked point, and group them.
In the figure below, some points are marked (in pencil) and each is grouped with its 5 nearest elements (black circles).
[I marked only 3 clusters as an example; I need this for all the clusters.]
After that, how can I remove the clusters that do not have 5 nearest elements? Can anybody please help me?
clc;
clear;
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
gscatter(data(:,1),data(:,2),idx);

Answers (1)
Image Analyst
on 8 Mar 2020
I don't even think you need dbscan for this. You just need to define a length that separates "near enough" and "too far away". Then you just check every point in the array to see if it has 5 that are near enough, and keep those.
nearEnough = 0.02; % Distance threshold. Whatever you want.
x = data(:,1);
y = data(:,2);
indexesToKeep = false(1, length(x)); % Initialize to not keeping any of them.
for k = 1 : length(x)
    % Distance from the k'th point to every point (including itself).
    distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
    % Subtract 1 so the k'th point does not count itself (distance 0).
    if sum(distances <= nearEnough) - 1 >= 5
        % At least 5 others are close enough to this k'th point, so keep it.
        indexesToKeep(k) = true;
    end
end
x = x(indexesToKeep);
y = y(indexesToKeep);
12 Comments
sreelekshmi ms
on 8 Mar 2020
Image Analyst
on 8 Mar 2020
Not sure what you're saying. But my code should work. You can apply it to each colored group (that comes from dbscan) one at a time if you want.
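A minimal sketch of that per-group application, assuming 2-D data and the idx labels returned by dbscan (noise is labeled -1). The toy data, the threshold value, and the names keep, filteredData, and filteredIdx are illustrative assumptions, not part of the original code; replace data and idx with your own.

```matlab
% Toy stand-in for your data and dbscan labels (assumption; replace with your own).
data = [0 0; 0.01 0; 0 0.01; 0.01 0.01; 0.005 0.005; 0.005 0; 1 1; ...
        5 5; 5.01 5; 5 5.01; 9 9];
idx = [1; 1; 1; 1; 1; 1; 1; 2; 2; 2; -1];

nearEnough = 0.02;    % Distance threshold (assumption; tune for your data).
minNeighbors = 5;     % Required number of near neighbors.
keep = false(size(idx));            % One flag per row of data.
clusterIds = unique(idx(idx > 0));  % Skip noise points (label -1).
for c = 1 : numel(clusterIds)
    members = find(idx == clusterIds(c)); % Rows belonging to this cluster.
    x = data(members, 1);
    y = data(members, 2);
    for k = 1 : numel(members)
        % Distances from the k'th member to all members of the same cluster.
        distances = sqrt((x(k) - x).^2 + (y(k) - y).^2);
        % Subtract 1 so the point does not count itself (distance 0).
        if sum(distances <= nearEnough) - 1 >= minNeighbors
            keep(members(k)) = true;
        end
    end
end
filteredData = data(keep, :);
filteredIdx = idx(keep);
```

Because neighbors are searched only inside each point's own cluster, a cluster in which no point reaches 5 near neighbors drops out entirely; in the toy data that is cluster 2.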
sreelekshmi ms
on 8 Mar 2020
Image Analyst
on 9 Mar 2020
For step 2, describe how you pick those 5 points from all the points in that class.
Not sure what step 4 is supposed to do.
What is your definition of "near" or "not near"? How far -- what distance is that?
sreelekshmi ms
on 9 Mar 2020
Edited: sreelekshmi ms
on 9 Mar 2020
Image Analyst
on 9 Mar 2020
I still have no idea how you're going to pick the first 5 points. Let's say you have 7000 points and there are 1000 points in each of 7 clusters. Now, which 5 of those 7000 would you pick in your step 2?
And once you've picked those initial 5 points, you will check to see how many "near" neighbors each has. Like point 1 may have 20 near neighbors, point 2 may have 3 near neighbors, point 3 may have 6 near neighbors, point 4 may have 250 near neighbors, and point 5 may have 2 near neighbors. So points 1, 3, and 4 have more than 5 near neighbors and go into "class 1", while points 2 and 5 have fewer than 5 near neighbors and so they go into class 2. Class 2 then has 6997 points: all except the three points that have at least 5 near neighbors. Is that correct?
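The counting step in that reading can be sketched as follows. This is only an illustration under stated assumptions: the toy coordinates, the picked indices, the threshold, and the use of "at least 5" as the cutoff are all hypothetical, since the thread never settles how the initial points are chosen.

```matlab
% Toy coordinates (assumption; replace with your own x and y).
x = [0 0.01 0 0.01 0.005 0.005 1 5]';
y = [0 0 0.01 0.01 0.005 0 1 5]';
picked = [1 7 8];     % Hypothetical indices of the initially picked points.
nearEnough = 0.02;    % Distance threshold (assumption).
minNeighbors = 5;     % Cutoff between the two classes (assumption).

counts = zeros(size(picked));
for k = 1 : numel(picked)
    p = picked(k);
    distances = sqrt((x(p) - x).^2 + (y(p) - y).^2);
    counts(k) = sum(distances <= nearEnough) - 1; % Exclude the point itself.
end
class1 = picked(counts >= minNeighbors); % Picked points with enough near neighbors.
class2 = picked(counts < minNeighbors);  % Picked points with too few.
```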
sreelekshmi ms
on 9 Mar 2020
Image Analyst
on 9 Mar 2020
So when you're picking the 5 from each class "that are maximum far apart", how do you define that? Do you look at each point in the class and
- find the distance to the nearest other point, or
- find the average distance from every other point in the class, or
- find the average distance to a certain number of points, like the average distance to the 8 closest other points?
Are you using any of those definitions of maximum? Or some other definition?
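For comparison, the three candidate measures listed above can be computed like this for the points of one class. It is a sketch, not a recommendation: the toy points and the names nearestDist, meanDistAll, and meanDistK are assumptions, and kNearest is set to 3 instead of 8 only because the toy class has 7 points. It uses implicit expansion, so it needs R2016b or later.

```matlab
% Toy class (assumption; replace with the rows of one dbscan cluster).
pts = [0 0; 1 0; 0 1; 1 1; 0.5 0.5; 3 3; 3 0];
n = size(pts, 1);
numToPick = 5;   % How many points to pick from the class.
kNearest = 3;    % Number of closest points to average over (assumption).

% Pairwise Euclidean distances without any toolbox function.
dx = pts(:,1) - pts(:,1)';
dy = pts(:,2) - pts(:,2)';
D = sqrt(dx.^2 + dy.^2);
D(1:n+1:end) = Inf;       % Put Inf on the diagonal to exclude self-distances.
sortedD = sort(D, 2);     % Each row ascending; the Inf ends up last.

nearestDist = sortedD(:, 1);                  % Distance to the nearest other point.
meanDistAll = mean(sortedD(:, 1:n-1), 2);     % Average distance to every other point.
meanDistK = mean(sortedD(:, 1:kNearest), 2);  % Average distance to the k closest points.

% Rank the points by one chosen measure and take the largest ones.
[~, order] = sort(meanDistK, 'descend');
pickedRows = order(1:numToPick);
```

Note that this ranks each point by its own score. It does not guarantee that the 5 picked points are far from each other, which would be yet another definition.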
sreelekshmi ms
on 10 Mar 2020
sreelekshmi ms
on 10 Mar 2020
sreelekshmi ms
on 10 Mar 2020
sreelekshmi ms
on 11 Mar 2020