Increase the amount of CPU and RAM used by MATLAB (parfor)

I'm running big calculations and simulations on a powerful computer (an i7 with 8 cores and 12 GB RAM). But for some reason MATLAB only uses 10-13% of both the RAM and the CPU. How can I increase this?
I suppose the CPU limitation is because it only uses 1 of the cores.
If that is the case, I suppose I should use parfor. But since I need to evaluate a 7-dimensional relation, it is not too straightforward.
An example of the kind of code it has to execute is:
for i=1:length(a)
    for j=1:length(b)
        for ii=1:length(c)
            for jj=1:length(d)
                E(i,j,ii,jj) = a(i)^2 * b(j) + c(ii) * d(jj)^2 + a(i) * c(ii);
            end
        end
    end
end
Can anybody help me out here? Unfortunately, just replacing one of the 'for' loops with 'parfor' does not do the trick.
Thanks

Accepted Answer

Matt J
Matt J on 8 Oct 2012
Edited: Matt J on 8 Oct 2012
Vectorizing might help, although this looks like it could be a huge matrix, and therefore difficult not only to compute fast, but to store.
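% Orient each vector along its own dimension
aterms=a(:);                          % na-by-1
bterms=b(:).';                        % 1-by-nb
cterms=reshape(c,1,1,[]);             % 1-by-1-by-nc
dterms=reshape(d.^2,1,1,1,[]);        % 1-by-1-by-1-by-nd, already squared
E1=bsxfun(@times,aterms.^2,bterms);   % a(i)^2 * b(j)
E2=bsxfun(@times,aterms,cterms);      % a(i) * c(ii)
E3=bsxfun(@times,dterms,cterms);      % c(ii) * d(jj)^2
E=bsxfun(@plus,E1,E2);
E=bsxfun(@plus,E,E3);                 % full na-by-nb-by-nc-by-nd array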
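For readers on MATLAB R2016b or later, implicit expansion lets the same idea be written with ordinary operators instead of bsxfun. The following is only a minimal sketch under that version assumption; the small vectors and the name Eloop are made up for illustration and are not from the original question.

```matlab
% Small illustrative inputs (assumed values, not the poster's data)
a = [1 2 3]; b = [4 5]; c = [6 7 8 9]; d = [10 11];

% Put each vector along its own dimension
A = reshape(a, [], 1);        % na-by-1
B = reshape(b, 1, []);        % 1-by-nb
C = reshape(c, 1, 1, []);     % 1-by-1-by-nc
D = reshape(d, 1, 1, 1, []);  % 1-by-1-by-1-by-nd

% Implicit expansion (R2016b+) broadcasts the singleton dimensions
E = A.^2 .* B + C .* D.^2 + A .* C;

% Reference result from the original nested loops, for comparison
Eloop = zeros(numel(a), numel(b), numel(c), numel(d));
for i = 1:numel(a)
    for j = 1:numel(b)
        for ii = 1:numel(c)
            for jj = 1:numel(d)
                Eloop(i,j,ii,jj) = a(i)^2*b(j) + c(ii)*d(jj)^2 + a(i)*c(ii);
            end
        end
    end
end
```

Both E and Eloop have size numel(a)-by-numel(b)-by-numel(c)-by-numel(d), so isequal(E, Eloop) can be used to confirm the vectorized form on a small case before scaling up.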

5 comments

Björn
Björn on 10 Oct 2012
Edited: Björn on 10 Oct 2012
This reduces the time of calculation from over a day down to less than a minute for a multidimensional array containing over 10^9 double precision terms! Great solution, thanks. But does bsxfun allow parallel computing (i.e. will it use multiple processes when my matlabpool is 4)?
Matt J
Matt J on 10 Oct 2012
Edited: Matt J on 10 Oct 2012
My understanding was that, even without the Parallel Computing Toolbox, many vectorized matrix operations in MATLAB were multithreaded under the hood, but you could check the Task Manager to see how much parallel activity you're getting.
Incidentally, I'm glad the code is working better for you, but a 10^9 element array seems insane to me, even in an age of 64-bit computing and 12GB RAM. You might want to describe what you intend to do with E. I'm skeptical that it is really necessary to build it explicitly.
Also, if your a,b,c,d data are sparse, you could possibly take advantage of my ndSparse data type, which also supports bsxfun.
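Besides the Task Manager, MATLAB itself can report how many computational threads it is allowed to use for its built-in multithreaded operations. A minimal sketch, assuming a desktop MATLAB session; the variable name nThreads is just for illustration.

```matlab
% Query the current maximum number of computational threads
nThreads = maxNumCompThreads;
fprintf('MATLAB may use up to %d computational thread(s).\n', nThreads);
```

This only shows the upper limit. Whether a particular vectorized operation actually runs on several threads still depends on the function and on the array size.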
E is the energy of a system described by two angles (which should be mapped continuously and I take therefore 500 points for those two angles) containing 5 other tunable parameters. I need to map the behavior of the energy at the different angles as a function of these 5 tunable parameters. For the tunable parameters I take only a few values around the symmetric case (at most 5, which is less than I prefer, but for the sake of speed). The resulting E is not sparse, and only has a few zeros.
Since the accuracy of E does not have to go above 1000, I have now converted E1, E2 and E3 (in the example) to int16 before adding them together. This makes the resulting data files a lot smaller without loss of accuracy, and I don't run out of RAM.
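The memory saving and the caveats of int16 can be sketched as follows. The element count is the rough "10^9 terms" figure mentioned earlier in this thread, and the sample values are made up. Note that int16 rounds to whole numbers and saturates at its limits, so this approach presumably relies on the values being scaled to fit that range.

```matlab
% Rough storage estimate for about 1e9 elements (figure quoted in the thread)
n = 1e9;
bytesDouble = n * 8;               % double uses 8 bytes per element
bytesInt16  = n * 2;               % int16 uses 2 bytes per element
ratio = bytesDouble / bytesInt16;  % int16 needs a quarter of the memory

% Conversion rounds to the nearest integer and saturates at the limits
x = [0.4 1.6 40000 -40000];        % made-up sample values
y = int16(x);

% Integer arithmetic also saturates instead of wrapping around
s = int16(30000) + int16(10000);
```

Here y comes out as [0 2 32767 -32768] and s as 32767, so sums of the int16 terms are only exact while they stay within the int16 range.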
"I need to map the behavior of the energy at the different angles as a function of these 5 tunable parameters."
OK, but map them for what purpose? For visualization? How are you going to make a 4D plot?
I map them so I can determine what exactly happens in the experiments I did, and whether the developed theory is consistent with the experimental results. I don't plot them in 4D. I plot intensity graphs of the energy (E) for the different situations, and also the difference from the energy of the symmetric ideal case. Therefore I will end up with (2 times the number of possible combinations of the 5 tunable parameters) graphs. Plotting them all gives me the opportunity to see how the different parameters affect the energy.


More Answers (2)

Bradley Steel
Bradley Steel on 8 Oct 2012
Edited: Bradley Steel on 8 Oct 2012
There are multiple ways to improve this; I'm not certain if you're already doing this, so two improvements without parallelisation:
  • preallocate space for E
  • vectorise the expressions where possible, e.g.:
A=repmat(reshape(a,[],1,1,1),[1 length(b) length(c) length(d)]);
B=repmat(reshape(b,1,[],1,1),[length(a) 1 length(c) length(d)]);
C=repmat(reshape(c,1,1,[],1),[length(a) length(b) 1 length(d)]);
D=repmat(reshape(d,1,1,1,[]),[length(a) length(b) length(c) 1]);
E2 = A.^2.*B + C.*D.^2 + A.*C;
Within the Parallel Computing Toolbox, turning the innermost loop into a parfor loop should run, but is likely to be slower due to memory overhead. The loop you want to parallelise is probably the outermost one, but as written it won't run, because MATLAB doesn't know what values j, ii and jj hold when it creates the parfor loop. An alternative would be:
E=struct('x',[]);
parfor i=1:length(a)
    E(i).x = zeros(length(b),length(c),length(d));
    for j=1:length(b)
        for ii=1:length(c)
            for jj=1:length(d)
                E(i).x(j,ii,jj) = a(i)^2 * b(j) + c(ii) * d(jj)^2 + a(i) * c(ii);
            end
        end
    end
end
You then need to turn the structure E back into the matrix you need. Without testing I would expect this to still be substantially slower than the vectorised version above, although it may be you have some cases which you cannot vectorise.
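One possible way to do that conversion is sketched below. It is an illustration only: it uses small made-up vectors, fills a temporary array tmp inside each iteration before storing it in the struct (a variation on the loop above), and the names tmp and Efull are hypothetical. Without a parallel pool, parfor simply runs the iterations serially.

```matlab
% Small illustrative inputs (assumed values, not the poster's data)
a = [1 2 3]; b = [4 5]; c = [6 7 8 9]; d = [10 11];
na = numel(a); nb = numel(b); nc = numel(c); nd = numel(d);

E = struct('x', cell(1, na));      % preallocate a 1-by-na struct array
parfor i = 1:na
    tmp = zeros(nb, nc, nd);       % per-iteration temporary slice
    for j = 1:nb
        for ii = 1:nc
            for jj = 1:nd
                tmp(j,ii,jj) = a(i)^2*b(j) + c(ii)*d(jj)^2 + a(i)*c(ii);
            end
        end
    end
    E(i).x = tmp;                  % one sliced assignment per iteration
end

% Stack the slices along a 4th dimension, then move that dimension to the front
Efull = permute(cat(4, E.x), [4 1 2 3]);   % na-by-nb-by-nc-by-nd
```

After the permute, Efull(i,j,ii,jj) is indexed in the same order as the array in the original question.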

2 comments

Avoid REPMAT, if you can. Use BSXFUN instead.
Thanks for the reply. I did preallocate all the arrays. Your solution does increase speed significantly, but the REPMAT command is quite slow. As Matt J suggested, the BSXFUN function is a lot faster.
Your parfor-loop does work like a charm, and is actually the answer to my question, but it is way slower than the vectorized version.
Thanks for the good and very accurate answer! Now I know how to use PARFOR on multidimensional calculations!


If you have a matrix E(dim1, dim2, ...), MATLAB stores it so that dim1 varies fastest in memory, then dim2, and so on. In a 2D matrix, that means it goes down the rows of the first column before it moves over to the next column and goes down the rows of that column. So if you have large matrices, you might get some speedup by inverting the order of your loops so that dim1 is the innermost loop, dim2 is the next one out, etc. Might be worth a try to see if it makes it faster.
for jj=1:length(d)
    for ii=1:length(c)
        for j=1:length(b)
            for i=1:length(a)
                E(i,j,ii,jj) = a(i)^2 * b(j) + c(ii) * d(jj)^2 + a(i) * c(ii);
            end
        end
    end
end
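The storage order behind this suggestion can be checked with sub2ind, which returns the position of an element in memory order. A small sketch with an assumed array size of 3-by-2-by-4-by-2, chosen only for illustration:

```matlab
sz = [3 2 4 2];                      % assumed size, for illustration only

p0 = sub2ind(sz, 1, 1, 1, 1);        % first element
p1 = sub2ind(sz, 2, 1, 1, 1);        % step in dim 1: next element in memory
p2 = sub2ind(sz, 1, 2, 1, 1);        % step in dim 2: jumps by size of dim 1
p3 = sub2ind(sz, 1, 1, 2, 1);        % step in dim 3: jumps by 3*2 elements
p4 = sub2ind(sz, 1, 1, 1, 2);        % step in dim 4: jumps by 3*2*4 elements

strides = [p1 p2 p3 p4] - p0;        % distance in memory per unit step
```

The strides grow with each dimension, so a loop nest whose innermost loop runs over the first index touches neighbouring memory locations. Whether that gives a measurable speedup here would have to be timed.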
Another thing to try is to increase the priority of MATLAB via the Task Manager (press Ctrl+Shift+Esc to bring up the Task Manager, then right-click on the MATLAB process), though this might not help if you have lots of idle time and no other program is competing for CPU time.


Asked: 8 Oct 2012
