Why is MATLABs knnsearch with KDTreeSearcher approx. 300 times slower than my own naive kd-tree implementation?

5 visualizaciones (últimos 30 días)
Hello,
while i was coding i noticed that the knnsearch with Matlabs KDTreeSearcher object seems a bit slow to me.
So I made a simple matlab kd-tree implementation of my own and compared it to the matlab function in terms of computation time.
It turns out that my own implementation is about 300 times faster? How can this be?
Thanks in advance!
With kind regards
David Skuddis
Here is my testing script: (MATLAB R2019a):
______________________________________________________________________________________
%% compare kd tree searcher matlab implementation with own kd tree script
% random 3d data
data = rand(100000,3);
ids = linspace(1,100000,100000)';
% create kd tree manually
[kd_tree] = createKD_Tree(data,ids,1);
% kd tree from matlab lib
kd_tree2 = KDTreeSearcher(data,'BucketSize',10);
% random point
point = rand(1,3);
% test kd tree
tic
for k = 1:1000
% init
node = kd_tree;
% naive kd search
while node.val == -1
if point(node.cutDim) <= node.cutVal
node = node.rightChild;
else
node= node.leftChild;
end
end
end
toc
% return distance
norm( point - data(node.val,:) )
tic
for k = 1:1000
% matlab kd search
NN_id = knnsearch(kd_tree2,point);
end
toc
% return distance
norm( point - data(NN_id,:) )
% manual kd tree generation
function [node] = createKD_Tree(data,ids,depth)
% get point nr
n_points = size(data,1);
if n_points == 1
% stop and save point id
node.val = ids;
return;
else
node.val = -1;
end
% get cut dimension
node.cutDim = mod(depth,3)+1;
% calc cut value
node.cutVal = median( data(:,node.cutDim) );
% split point ids
rightChildIds = data(:,node.cutDim) <= node.cutVal;
leftChildIds = ~rightChildIds;
% add to childs
node.leftChild = createKD_Tree(data(leftChildIds,:),ids(leftChildIds,:),depth+1);
node.rightChild = createKD_Tree(data(rightChildIds,:),ids(rightChildIds,:),depth+1);
end

Respuestas (1)

Carson Purnell
Carson Purnell el 7 de Mzo. de 2023
It isn't slower, you're just calling the function overhead 1000 times in a way that also bypasses all optimizations.
instead of calling knnsearch 1000 times, you should provide it with 1000 points. On my machine, this makes it twice as fast as your script code - results will vary. This only incurs function overhead once, and the function itself as well-optimized internally to handle large numbers of points that a script loop cannot be.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by