How to run Matlab function on GPU?

Star on 22 Jan 2019
Commented: Walter Roberson on 25 Jan 2019
I want to run a Parallel Computing program that uses 3 workers with the GPU enabled. I tried using gpuArray, but it seems to make no difference to the speed. My code is shown below:
function [output,frequency] = spmdfunc(music)
fid = fopen('music.mp3', 'r', 'b');
g = uint8(fread(fid, 'ubit1')');
h = gpuArray(g);
x = gather(h);
while ( a+31 <= numel(x) )
    if( x(a:a+14) == syncword)
        poolobj = parpool (3);
        spmd
            if labindex == 1
                afr = dsp.AudioFileReader(%Content);
                while true
                    % Content
                end
                labSend(2, [])
                release(afr)
            elseif labindex == 2
                while true
                    % Content
                    if isempty(frame)
                        labSend(3, []);
                        break
                    end
                end
            else
                while true
                    if isempty (frame)
                        break
                    end
                    length = numel(output(1,:));
                    output(1,length+1:PCM_length+1152) = [samples(1,:) samples(3,:)];
                    output(2,length+1:PCM_length+1152) = [samples(2,:) samples(4,:)];
                end
            end %labindex
        end %spmd end
Thank you.
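In the code above, the data is copied to the GPU with gpuArray and copied straight back with gather, so no computation actually runs on the GPU. A minimal sketch of the usual pattern follows: move the data over once, call GPU-enabled functions on the gpuArray, and gather only the final result. It assumes Parallel Computing Toolbox and a supported GPU. The input x and the abs(fft(...)) operation are hypothetical stand-ins, not part of the decoder.

```matlab
% Sketch only: x and the abs(fft(...)) workload are assumed placeholders.
x = rand(1, 2^22, 'single');          % hypothetical input data

% CPU reference timing
tCpu = timeit(@() abs(fft(x)));

if gpuDeviceCount > 0
    xg = gpuArray(x);                 % copy to GPU memory once
    % fft and abs accept gpuArray inputs and then execute on the GPU
    tGpu = gputimeit(@() abs(fft(xg)));
    yg = abs(fft(xg));                % result is still a gpuArray
    y = gather(yg);                   % copy back only the final result
    fprintf('CPU %.4f s, GPU %.4f s\n', tCpu, tGpu);
else
    % No usable GPU: fall back to the CPU so the script still runs
    y = abs(fft(x));
    fprintf('No GPU detected; CPU time %.4f s\n', tCpu);
end
```

Whether the GPU version is faster depends on the size of the data and on how much arithmetic is done per transferred byte, so it is worth timing both on your own hardware.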
4 comments
Star on 24 Jan 2019
Hi Joss Knight, I am writing a decoder that reads input and decodes it in parallel. However, it is quite slow, and I wonder whether there is another way to speed it up. Also, the link here says that using parfeval is slower if the work to be done is not "big enough". How do I know whether my work is big enough?
Thank you.
Walter Roberson on 25 Jan 2019
The parallel routines (not just parfeval) involve some synchronization, and involve copying inputs to another process, and involve copying outputs back. Meanwhile, the parallel processes typically only have access to a single core, so bulk calculation that they do usually cannot take good advantage of the high performance automatic multithreaded libraries. Thus for the parallel routines to be faster, the cost of the data transfer must be small compared to the cost of the calculation they do on a single core. There are cases that work, especially cases where vectorization is not effective.
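One practical way to judge whether the work is "big enough" is to time the same tasks serially and in parallel. The sketch below does that for a cheap and an expensive per-task cost. It assumes Parallel Computing Toolbox and a machine that can host 3 workers. workFcn, numTasks and sizes are hypothetical stand-ins for the real decoding work.

```matlab
% Sketch only: workFcn stands in for one unit of work (e.g. one frame).
workFcn = @(n) sum(svd(rand(n)));

numTasks = 12;        % number of independent tasks (assumed value)
sizes = [20 400];     % cheap vs. expensive per-task cost (assumed values)

if isempty(gcp('nocreate'))
    parpool(3);       % start the pool once, outside any timed region
end

for n = sizes
    % Serial baseline; library calls here may use several cores
    tic;
    serialOut = zeros(1, numTasks);
    for k = 1:numTasks
        serialOut(k) = workFcn(n);
    end
    tSerial = toc;

    % Same tasks distributed over the pool workers
    tic;
    parOut = zeros(1, numTasks);
    parfor k = 1:numTasks
        parOut(k) = workFcn(n);
    end
    tParallel = toc;

    fprintf('n = %4d: serial %.4f s, parfor %.4f s, ratio %.2f\n', ...
        n, tSerial, tParallel, tSerial / tParallel);
end
```

If the ratio stays below 1 even for the realistic task size, the transfer and synchronization overhead outweighs the computation, and the serial version is the better choice.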
Parallel computations can also work well in cases of reading files or collecting data from outside sources, but only in the case where you have enough hardware resources to serve the multiple requestors.
If you have multiple threads reading large files from one hard drive, the threads contend with each other and you might get lower performance than reading in sequence.
If you have multiple threads reading smaller files from one hard drive and having to think about them, then one of the threads can potentially be reading while another is thinking, which is an effective use of resources.
If you have multiple threads reading from different devices that are on one controller, the drive speed might be less than the controller's data transfer speed, in which case accessing multiple drives might be a way to improve performance. Until, that is, the controller bandwidth is all being used up.
If you have multiple threads reading from different devices on different controllers, then controller transfer speed stops being the major problem, and the limit instead (typically) becomes DMA (direct memory access) transfer between the controller and RAM, which puts the data in place without interrupting the CPU to transfer it piecemeal. Without parallel processing, MATLAB would not have any mechanism to keep those different controllers all busy, as MATLAB does not offer "scatter-gather I/O" for disks (but more or less does for Data Acquisition Toolbox transfers in background mode).
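The file-reading cases above can be checked empirically. The sketch below writes a few temporary files, then reads them back sequentially and with parfor. It assumes Parallel Computing Toolbox with local workers that can see the temporary folder. The file count and size are arbitrary illustrative values. Because the files were just written, the operating system may serve them from its cache, so the timings are only a rough indication of real disk behaviour.

```matlab
% Sketch only: numFiles and bytesPerFile are assumed values.
numFiles = 4;
bytesPerFile = 8 * 2^20;              % 8 MiB per file
files = cell(1, numFiles);
for k = 1:numFiles
    files{k} = [tempname '.bin'];
    fid = fopen(files{k}, 'w');
    fwrite(fid, randi([0 255], bytesPerFile, 1, 'uint8'), 'uint8');
    fclose(fid);
end

% Sequential reads, one file after another
tic;
serialSums = zeros(1, numFiles);
for k = 1:numFiles
    fid = fopen(files{k}, 'r');
    data = fread(fid, Inf, 'uint8=>uint8');
    fclose(fid);
    serialSums(k) = sum(data);        % sum of uint8 returns a double
end
tSerial = toc;

% Concurrent reads: each worker opens and reads its own file
tic;
parSums = zeros(1, numFiles);
parfor k = 1:numFiles
    fid = fopen(files{k}, 'r');
    data = fread(fid, Inf, 'uint8=>uint8');
    fclose(fid);
    parSums(k) = sum(data);
end
tParallel = toc;

fprintf('sequential %.4f s, parfor %.4f s\n', tSerial, tParallel);
delete(files{:});                     % remove the temporary files
```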

Answers (0)