Is there a way to speed up for loop when grouped with GPU?

3 visualizaciones (últimos 30 días)
Mantas Vaitonis
Mantas Vaitonis el 30 de Mzo. de 2019
Comentada: Mantas Vaitonis el 31 de Mzo. de 2019
Hello,
At the moment for loop is bottle neck in my code. I know that GPU does not work with indexing and due to for loop all calculations are switching memories between GPU and CPU. But maybe someone would have a suggestion how to speed up this part or this is stalemate and due to memory switching cant not optimized more. In my case lab (200000000x130), dydis (100000).
function [a,b]=skaicia (lab,dydis,z)
comi=gpuArray(0.05);
t=gpuArray(0.6);
d=gpuArray(50001);
langas=gpuArray(50000);
atidaryta=gpuArray(50000);
x1 = zeros(dydis,65,1);
for i=1:z
x1(:,:,i)=lab(i*dydis+1-dydis:i*dydis,:);
x=gpuArray(x1(:,:,i));
x23=x(1:end-d,:);
[n1,n2]=size(x);
n1=gpuArray(n1);
n2=gpuArray(n2);
xt=permute(x,[2 1 3]);
dx1=(d-langas-1:d-2);
dx=permute(dx1,[2 1])+ (1:n1-d);
[sujn1(:,:,i),sujn2(:,:,i)]=mazinta(xt,dx,n2,n1,d,x23,t,langas,atidaryta,comi);
end
a=sujn1;
b=sujn2;
end
  2 comentarios
Walter Roberson
Walter Roberson el 31 de Mzo. de 2019
You are growing x1 dynamically along the third dimension -- you allocate it as dydis by 65 by 1, but you assign into x1(:,:,i) so it keeps getting larger.
You pull the part of x1 that you just assigned into a gpuArray that becomes x. You never use x1 again in your code other than continuing to grow it and copying the latest slice to gpu. You do not return x1.
Therefore it would be more efficient to directly do
x = gpuArray( lab(i*dydis+1-dydis:i*dydis,:) );
Mantas Vaitonis
Mantas Vaitonis el 31 de Mzo. de 2019
Thank you Walter, you suggestion was very helpful and did improve the speed of calculations.

Iniciar sesión para comentar.

Respuestas (0)

Categorías

Más información sobre GPU Computing en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by