Vectorized code slower than loops?

Alex Kurek
Alex Kurek on 26 Aug 2016
Edited: per isakson on 5 Sep 2016
This question is something of an offshoot of another one, but I have the following two pieces of code:
maxN = 100;
levels = maxN+1;
xElements = 101;
umn = complex(zeros(levels, levels)); % cleaning
bessels = ones(1201, 1201, 101); % 1.09 GB
negMcontainer = ones(1201, 1201, 100);
posMcontainer = negMcontainer;

% Version 1: the innermost loop over m is vectorized
tic
for j = 1 : xElements
    for i = 1 : xElements
        for n = 1 : 2 : maxN
            nn = n + 1;
            mm = 1;
            m = 1:2:n;
            numOfEl = ceil(n/2);
            umn(nn, mm:mm+numOfEl-1) = bessels(i, j, nn) * posMcontainer(i, j, m);
        end
    end
end
toc

% Version 2: explicit loop over m
tic
for j = 1 : xElements
    for i = 1 : xElements
        for n = 1 : 2 : maxN
            nn = n + 1;
            mm = 1;
            for m = 1 : 2 : n
                umn(nn, mm) = bessels(i, j, nn) * posMcontainer(i, j, m);
                mm = mm + 1;
            end
        end
    end
end
toc
It turns out that the loop version is more than 2x faster. Why is that? I know that this can happen if vectorization requires large temporary variables, but that does not seem to be the case here.
And generally, what (other than parfor) can I do to speed up this code?
Best regards, Alex
1 Comment
Alexandra Harkai
Alexandra Harkai on 2 Sep 2016
Not sure about the speedup possibilities just yet, but regarding the vectorisation, this may be helpful in seeing where the vector/loop implementations make a difference: http://www.matlabtips.com/matlab-is-no-longer-slow-at-for-loops/


Accepted Answer

per isakson
per isakson on 2 Sep 2016
Edited: per isakson on 3 Sep 2016
Given that
  • MATLAB stores matrices in column-major order, and
  • bessels and posMcontainer are both large,
the transfer of data between memory and the CPU will possibly be more efficient (the caches will work better) if
umn(nn, mm:mm+numOfEl-1) = bessels(i, j, nn) * posMcontainer(i, j, m);
were replaced by
umn(mm:mm+numOfEl-1,nn) = bessels(nn, i, j) * posMcontainer(m, i, j);
with the dimensions of the arrays reordered to match. The same should apply to the "all-for-loop-case".
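To make the column-major argument concrete, here is a minimal sketch of what reordering the dimensions does to the memory layout. The array names A and B, the small sizes, and the chosen indices are assumptions made only for illustration; they are not taken from the question's data.

```matlab
% Sketch only: small, hypothetical sizes so that it runs quickly.
nx = 50; ny = 50; nz = 40;
A = rand(nx, ny, nz);        % original layout, indexed as A(i, j, m)
B = permute(A, [3 1 2]);     % reordered layout, indexed as B(m, i, j)

i = 7; j = 11; m = 1:2:nz;

% Both slices hold the same values ...
rowSlice = A(i, j, m);       % size 1 x 1 x numel(m)
colSlice = B(m, i, j);       % size numel(m) x 1
sameValues = isequal(rowSlice(:), colSlice);

% ... but neighbouring values of the sliced index are far apart in A
% and adjacent in B (distance measured in array elements).
strideA = sub2ind(size(A), i, j, 2) - sub2ind(size(A), i, j, 1);
strideB = sub2ind(size(B), 2, i, j) - sub2ind(size(B), 1, i, j);

fprintf('same values: %d, stride in A: %d, stride in B: %d\n', ...
    sameValues, strideA, strideB);
```

With the sliced index in the first dimension, each slice is read from one compact region of memory, which is the situation the caches handle best. Note that permute itself copies the whole array, so it only pays off if the reordered array is reused many times.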
And finally, here is the test from "Columns and Rows are not the same", with an additional case (R2016a):
result = runperf('NestedLoops.m');
fullTable = vertcat(result.Samples);
varfun(@mean, fullTable, 'InputVariables' ...
    , 'MeasuredTime', 'GroupingVariables', 'Name')
ans =
    Name                  GroupCount    mean_MeasuredTime
    __________________    __________    _________________
    NestedLoops/test      4             1.3266
    NestedLoops/test_1    4             0.88148
    NestedLoops/test_2    4             0.49775
where NestedLoops.m contains
%%
X=rand(100,100,2000);
for ii=1:100
    for jj=1:100
        X(ii,jj,:)=10*X(ii,jj,:);
    end
end
%%
X=rand(100,100,2000);
for jj=1:100
    for ii=1:100
        X(ii,jj,:)=10*X(ii,jj,:);
    end
end
%%
X=rand(2000,100,100);
for jj=1:100
    for ii=1:100
        X(:,ii,jj)=10*X(:,ii,jj);
    end
end
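The "differences" between the "cases" are actually larger than the mean times suggest, since every measured time includes the creation of X, which by itself takes about 0.36 s: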
>> tic, X=rand(100,100,2000);, toc
Elapsed time is 0.355542 seconds.
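One way to see the loop cost on its own is to time the allocation and the loop separately. This is only a sketch with reduced, hypothetical sizes (n1, n2, n3) and plain tic/toc; it is not the runperf setup used for the numbers above.

```matlab
% Sketch only: reduced, hypothetical sizes so that it runs quickly.
n1 = 20; n2 = 20; n3 = 200;

% Time the allocation on its own.
tic
X = rand(n1, n2, n3);
tAlloc = toc;

% Reference result, computed outside the timed region.
expected = 10 * X;

% Time only the nested loops.
tic
for jj = 1:n2
    for ii = 1:n1
        X(ii, jj, :) = 10 * X(ii, jj, :);
    end
end
tLoop = toc;

fprintf('allocation: %.4f s, loops only: %.4f s\n', tAlloc, tLoop);
```

Subtracting the allocation time from each case leaves the time spent in the loops, which is the quantity that differs between the cases.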
6 Comments
Alex Kurek
Alex Kurek on 3 Sep 2016
I do not know the C language. But if you want it, it is here: https://www.dropbox.com/s/69bf8fj7lc6cnbc/fisherComputer.c?dl=0
per isakson
per isakson on 3 Sep 2016
Edited: per isakson on 5 Sep 2016
Thanks, but TL;DR.
Neither do I; however, I get the impression that Coder switches the order of the loops to account for the difference in major order.
"slowed down a bit in .mex": I now believe that one should code for column-major order in MATLAB and that Coder adapts the C code to row-major order. However, it puzzles me that the difference in C is only "a bit", since in MATLAB it is significant.
