Parallelization is not working in Matlab

Chang seok Ma on 30 Mar 2021
Answered: Nipun on 14 May 2024
Hello,
I am trying to parallelize my code in MATLAB. Below is part of my code.
Basically, I am finding the position of the minimum and its value for each combination of the four state variables.
I want to speed up my code with parallelization, so I changed 'for i_a = 1:Na' to 'parfor i_a = 1:Na'. However, MATLAB does not seem to parallelize the code: when I look at the output printed at the end of each iteration (disp([i_a i_d i_y i_t toc])), it looks like MATLAB is computing the minimum values one by one. (The code does not seem any faster either.)
Am I doing something wrong? (I don't think there is a false sharing issue here, because the only array I update is Vnew, and Vnew is not used during the calculation.)
for i_a = 1:Na %Loop over state variable a
    for i_d = 1:Nd %Loop over state variable d
        for i_y = 1:Ny %Loop over state variable y
            for i_t = 1:Nt %Loop over state variable t
                tic;
                utilityadj = @(adjf) -(((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))>0)*((1/(1-elasticity)*( ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))^relutility) * (adjf(2)^(1-relutility)) )^(1-elasticity)) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(1),T(1))*transition(i_y,1)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(2),T(1))*transition(i_y,2)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(3),T(1))*transition(i_y,3)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(4),T(1))*transition(i_y,4)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(5),T(1))*transition(i_y,5)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(6),T(1))*transition(i_y,6)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(7),T(1))*transition(i_y,7)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(1),T(2))*transition(i_y,1)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(2),T(2))*transition(i_y,2)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(3),T(2))*transition(i_y,3)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(4),T(2))*transition(i_y,4)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(5),T(2))*transition(i_y,5)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(6),T(2))*transition(i_y,6)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(7),T(2))*transition(i_y,7)*prob(2)) ...
                    + ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))<=0)*(-1e10));
                noadj_damount = (1-delta)*D(i_d);
                if i_d == 1
                    noadj_damount = d_min;
                end
                utilitynoadj = @(noadjf) -(((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)>0)*((1/(1-elasticity)*( ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)^relutility) * (((1-delta)*D(i_d))^(1-relutility)) )^(1-elasticity)) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(1),T(1))*transition(i_y,1)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(2),T(1))*transition(i_y,2)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(3),T(1))*transition(i_y,3)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(4),T(1))*transition(i_y,4)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(5),T(1))*transition(i_y,5)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(6),T(1))*transition(i_y,6)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(7),T(1))*transition(i_y,7)*prob(1) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(1),T(2))*transition(i_y,1)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(2),T(2))*transition(i_y,2)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(3),T(2))*transition(i_y,3)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(4),T(2))*transition(i_y,4)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(5),T(2))*transition(i_y,5)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(6),T(2))*transition(i_y,6)*prob(2) ...
                    + beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(7),T(2))*transition(i_y,7)*prob(2)) ...
                    + ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)<=0)*(-1e10));
                lb = [a_min-10,d_min-10];
                ub = [a_max+10,d_max+10];
                a = [];
                b = [];
                aeq = [];
                beq = [];
                x0 = (lb + ub) / 2;
                options = optimoptions('fmincon','Display','off');
                [adjchoice,adjval] = fmincon(utilityadj,x0,a,b,aeq,beq,lb,ub,a,options);
                [noadjchoice,noadjval] = fmincon(utilitynoadj,a_min,a,b,aeq,beq,a_min-10,a_max+10,a,options);
                Vnew(i_a,i_d,i_y,i_t) = -min(adjval, noadjval);
                if Vnew(i_a,i_d,i_y,i_t) == -adjval
                    indpol_ap(i_a,i_d,i_y,i_t) = adjchoice(1);
                    indpol_dp(i_a,i_d,i_y,i_t) = adjchoice(2);
                    indadj(i_a,i_d,i_y,i_t) = 1;
                else
                    indpol_ap(i_a,i_d,i_y,i_t) = noadjchoice;
                    indpol_dp(i_a,i_d,i_y,i_t) = noadj_damount;
                    indadj(i_a,i_d,i_y,i_t) = 0;
                end
                disp([i_a i_d i_y i_t toc])
            end
        end
    end
end

Answers (1)

Nipun on 14 May 2024
Hi Chang Seok,
I understand that you want to parallelize your MATLAB code by running the outer loop over one of the state variables with `parfor`, but the iterations appear to run one at a time and you see no noticeable speedup. Here are a few things to check.
1. Parallel Pool
Make sure a parallel pool is active before the `parfor` loop runs. If no pool is open, MATLAB tries to start one, and that startup time adds overhead, especially when the loop itself is short. You can start a pool manually with `parpool`.
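A minimal sketch of that check, assuming Parallel Computing Toolbox is installed (without it, `parfor` runs as an ordinary serial loop, which would look much like the behavior described in the question). The variable name `pool` is only illustrative.

```matlab
% Reuse the current pool if one exists; otherwise start one explicitly,
% so the startup cost is not counted against the parfor loop itself.
pool = gcp('nocreate');   % returns an empty array when no pool is running
if isempty(pool)
    pool = parpool;       % starts a pool using the default profile
end
fprintf('Pool has %d workers\n', pool.NumWorkers);
```

If the reported worker count is 1, the loop body still runs through the pool but cannot overlap iterations.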
2. Overhead vs. Computation Time
The effectiveness of parallelization is more pronounced for loops where each iteration takes a significant amount of time. If the computations within each iteration are relatively quick, the overhead of distributing tasks among workers can outweigh the benefits. Your nested `fmincon` optimizations seem computationally intensive, which should, in theory, benefit from parallelization.
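To see whether the overhead is worth it on a given machine, it can help to time the whole loop both ways rather than reading per-iteration timings, since `parfor` iterations are not guaranteed to run or report in index order. The sketch below uses a toy objective and `fminbnd` as stand-ins; the grid size `N` and the objective are assumptions, not the model from the question.

```matlab
% Toy workload: each iteration solves a small bounded 1-D minimization.
% N and the objective are placeholders, not the original model.
N  = 64;
xs = zeros(N, 1);
xp = zeros(N, 1);

tSerial = tic;
for k = 1:N
    xs(k) = fminbnd(@(x) (x - k/N).^2 + sin(5*x).^2, -2, 2);
end
serialTime = toc(tSerial);

tPar = tic;
parfor k = 1:N
    xp(k) = fminbnd(@(x) (x - k/N).^2 + sin(5*x).^2, -2, 2);
end
parallelTime = toc(tPar);

fprintf('serial: %.3f s, parfor: %.3f s\n', serialTime, parallelTime);
```

With a workload this small the `parfor` version may well be slower; the gap should narrow or reverse as the work per iteration grows.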
3. Data Transfer and Dependencies
`parfor` loops work best when each iteration is independent of others, minimizing the need for data transfer between workers. Your code appears to follow this principle, as each iteration writes to unique indices of `Vnew`, `indpol_ap`, `indpol_dp`, and `indadj`. However, ensure that all variables used within the loop are properly initialized and that there are no hidden dependencies.
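One pattern that keeps the output variables cleanly sliced is to fill iteration-local temporaries in the inner loop and write each output once per `parfor` iteration. The sketch below is a reduced two-index version with stand-in objectives; `vRow`, `adjRow`, and the toy functions are hypothetical names, not part of the original code.

```matlab
% Placeholder grid sizes; the original Na, Nd and objectives are not reproduced here.
Na = 6;
Nd = 4;
Vnew   = zeros(Na, Nd);
indadj = zeros(Na, Nd);

parfor i_a = 1:Na
    % Row-sized temporaries are local to each parfor iteration.
    vRow   = zeros(1, Nd);
    adjRow = zeros(1, Nd);
    for i_d = 1:Nd
        % Two stand-in minimizations in place of the adjust / no-adjust problems.
        [~, adjval]   = fminbnd(@(x) (x - i_a).^2 + i_d, 0, 10);
        [~, noadjval] = fminbnd(@(x) (x - i_d).^2 + i_a, 0, 10);
        vRow(i_d)   = -min(adjval, noadjval);
        adjRow(i_d) = adjval <= noadjval;   % compare the scalars, not the stored value
    end
    % One sliced write per output variable per iteration.
    Vnew(i_a, :)   = vRow;
    indadj(i_a, :) = adjRow;
end
```

Comparing `adjval` and `noadjval` directly also avoids reading `Vnew` back inside the loop, so each output is written but never read by the workers.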
4. Profiling and Optimization
Consider using MATLAB's Profiler to identify bottlenecks in your code. The Profiler can help you understand where the most time is spent and guide optimization efforts:
profile on
% Your code here
profile off
profile viewer
If you still do not see improved performance after these checks, look more closely at the specific computations inside the loop, or consult MATLAB's parallel computing documentation for more advanced techniques, such as chunking iterations or optimizing data transfer between workers.
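One possible way to give `parfor` more, smaller units of work is to flatten the nested loops into a single linear index, so the scheduler distributes Na*Nd*Ny*Nt iterations instead of only Na. This is a sketch under assumed grid sizes with a placeholder loop body, not a drop-in replacement for the code in the question; `sz`, `nTotal`, and `Vflat` are illustrative names.

```matlab
% Placeholder grid sizes; substitute the real Na, Nd, Ny, Nt.
Na = 3; Nd = 4; Ny = 2; Nt = 2;
sz     = [Na, Nd, Ny, Nt];
nTotal = prod(sz);
Vflat  = zeros(nTotal, 1);

parfor k = 1:nTotal
    % Recover the four grid indices from the linear index.
    [i_a, i_d, i_y, i_t] = ind2sub(sz, k);
    % Stand-in for the per-state optimization in the original loop body.
    Vflat(k) = i_a + 10*i_d + 100*i_y + 1000*i_t;
end

% Linear indices follow column-major order, so reshape restores the 4-D grid.
Vnew = reshape(Vflat, sz);
```

Whether this helps depends on how evenly the cost is spread across states; it tends to matter most when Na is small relative to the number of workers.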
Hope this helps.
Regards,
Nipun
