Unroll for-Loops and parfor-Loops
When the code generator unrolls a for-loop or parfor-loop, instead of producing a loop in the generated code, it produces a copy of the loop body for each iteration. For small, tight loops, unrolling can improve performance. However, for large loops, unrolling can significantly increase code generation time and generate inefficient code.
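As a rough illustration of what unrolling means, the following hand-written C sketch shows the same computation in rolled and unrolled form. It is not output from the code generator, and the function names scale4_rolled and scale4_unrolled are hypothetical names chosen for this example.

```c
/* Rolled form: a single loop whose body runs four times. */
void scale4_rolled(const double in[4], double out[4])
{
    int i;
    for (i = 0; i < 4; i++) {
        out[i] = 2.0 * in[i];
    }
}

/* Unrolled form: one copy of the loop body per iteration,
 * with no loop counter and no branch. */
void scale4_unrolled(const double in[4], double out[4])
{
    out[0] = 2.0 * in[0];
    out[1] = 2.0 * in[1];
    out[2] = 2.0 * in[2];
    out[3] = 2.0 * in[3];
}
```

Both functions compute the same result. The unrolled version trades code size for the removal of loop overhead, which is why it tends to pay off only when the iteration count is small.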
Force for-Loop Unrolling by Using coder.unroll
The code generator uses heuristics to determine when to unroll a for-loop. To force loop unrolling, use coder.unroll. This affects only the for-loop that is immediately after coder.unroll. For example:
function z = call_myloop()
%#codegen
z = myloop(5);
end

function b = myloop(n)
b = zeros(1,n);
coder.unroll();   % force unrolling of the for-loop that follows
for i = 1:n
    b(i)=i+n;
end
end
Here is the generated code for the for-loop:
z[0] = 6.0;
z[1] = 7.0;
z[2] = 8.0;
z[3] = 9.0;
z[4] = 10.0;
To control when a for-loop is unrolled, use the coder.unroll flag argument. For example, unroll the loop only when the number of iterations is less than 10.
function z = call_myloop()
%#codegen
z = myloop(5);
end

function b = myloop(n)
unroll_flag = n < 10;       % unroll only for fewer than 10 iterations
b = zeros(1,n);
coder.unroll(unroll_flag);
for i = 1:n
    b(i)=i+n;
end
end
To unroll a for-loop, the code generator must be able to determine the bounds of the for-loop. For example, code generation fails for the following code because the value of n is not known at code generation time.
function b = myloop(n)
b = zeros(1,n);
coder.unroll();   % fails: n is not known at code generation time
for i = 1:n
    b(i)=i+n;
end
end
Set Loop Unrolling Threshold for All for-Loops and parfor-Loops in the MATLAB Code
If a for-loop is not preceded by coder.unroll, the code generator uses a loop unrolling threshold to determine whether to automatically unroll the loop. If the number of loop iterations is less than the threshold, the code generator unrolls the loop. If the number of iterations is greater than or equal to the threshold, the code generator produces a for-loop. By using the loop unrolling threshold, you can also unroll parfor-loops.
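The threshold comparison described above can be modeled in a few lines of C. This is only a sketch of the documented rule, not the code generator's implementation; should_unroll is a hypothetical helper name introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the documented rule (not a MATLAB Coder API):
 * a loop is unrolled only when its iteration count is strictly less
 * than the loop unrolling threshold. */
bool should_unroll(int iterations, int threshold)
{
    return iterations < threshold;
}
```

With a threshold of 6, a loop of 5 iterations satisfies the rule, while a loop of exactly 6 iterations does not, because the comparison is strict.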
The default value of the threshold is 5. By modifying this threshold, you can fine-tune loop unrolling. To modify the threshold:
- In a configuration object for standalone code generation (coder.CodeConfig or coder.EmbeddedCodeConfig), set the LoopUnrollThreshold property.
- In the MATLAB® Coder™ app, on the Speed tab, set Loop unrolling threshold.
Unlike the coder.unroll directive, the threshold applies to all for-loops in your MATLAB code. The threshold can also apply to some for-loops produced during code generation.
For an individual loop, a coder.unroll directive takes precedence over the loop unrolling optimization.
Unroll Simple for-Loops
Consider this function:
function [x,y] = call_myloops()
%#codegen
x = myloop1(5);
y = myloop2(5);
end

function b = myloop1(n)
b = zeros(1,n);
for i = 1:n
    b(i)=i+n;
end
end

function b = myloop2(n)
b = zeros(1,n);
for i = 1:n
    b(i)=i*n;
end
end
To set the value of the loop unrolling threshold to 6, and then generate a static library, run:
cfg = coder.CodeConfig;
cfg.LoopUnrollThreshold = 6;
codegen call_myloops -config cfg
This is the generated code for the for-loops. The code generator unrolled both for-loops because each has five iterations, which is less than the threshold of 6.
x[0] = 6.0;
y[0] = 5.0;
x[1] = 7.0;
y[1] = 10.0;
x[2] = 8.0;
y[2] = 15.0;
x[3] = 9.0;
y[3] = 20.0;
x[4] = 10.0;
y[4] = 25.0;
Unroll Nested for-Loops
Suppose that your MATLAB code has two nested for-loops.
- If the number of iterations of the inner loop is less than the threshold, the code generator first unrolls the inner loop. Subsequently, if the product of the number of iterations of the two loops is also less than the threshold, the code generator unrolls the outer loop. Otherwise, the code generator produces the outer for-loop.
- If the number of iterations of the inner loop is equal to or greater than the threshold, the code generator produces both for-loops.
This behavior is generalized to multiple nested for-loops.
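The nested-loop rule can be sketched as a small C function. For two loops it follows the rule stated above. For deeper nests it assumes that the same product test is applied level by level from the innermost loop outward, which is one reading of the generalization and is not confirmed by this document. The name unrolled_levels is hypothetical.

```c
/* Hypothetical model of the nested-loop rule (not a MATLAB Coder API).
 * iters[0] is the outermost loop and iters[depth - 1] the innermost.
 * Returns how many loop levels, counted from the innermost outward,
 * would be unrolled. Assumption: for more than two loops, the running
 * product of iteration counts is compared against the threshold at
 * each level. */
int unrolled_levels(const int iters[], int depth, int threshold)
{
    long product = 1;
    int levels = 0;
    int k;

    for (k = depth - 1; k >= 0; k--) {
        product *= iters[k];
        if (product >= threshold) {
            break;          /* this loop and all enclosing loops stay rolled */
        }
        levels++;
    }
    return levels;
}
```

With the default threshold of 5, an outer loop of 2 iterations around an inner loop of 2 gives a product of 4, so both levels are unrolled. An outer loop of 3 around an inner loop of 2 gives a product of 6, so only the inner loop is unrolled.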
Consider the function nestedloops_1 with two nested for-loops:
function y = nestedloops_1
%#codegen
y = zeros(2,2);
for i = 1:2
    for j = 1:2
        y(i,j) = i+j;
    end
end
end
Generate code for nestedloops_1 with the loop unrolling threshold set to the default value of 5. Here is the generated code for the for-loops. The code generator unrolled both for-loops because the product of the number of iterations of the two loops is 4, which is less than the threshold.
y[0] = 2.0;
y[2] = 3.0;
y[1] = 3.0;
y[3] = 4.0;
Now, generate code for the function nestedloops_2 with the loop unrolling threshold set to the default value of 5.
function y = nestedloops_2
%#codegen
y = zeros(3,2);
for i = 1:3
    for j = 1:2
        y(i,j) = i+j;
    end
end
end
The number of iterations of the inner loop is less than the threshold. The code generator unrolls the inner loop. But the product of the number of iterations of the two loops is 6, which is greater than the threshold. Therefore, the code generator produces code for the outer for-loop. Here is the generated code for the for-loops.
for (i = 0; i < 3; i++) {
  y[i] = (double)i + 2.0;
  y[i + 3] = ((double)i + 1.0) + 2.0;
}
Unroll parfor-Loops
Consider this MATLAB function:
function [x,y] = parallel_loops()
%#codegen
x = myloop1(5);
y = myloop2(6);
end

function b = myloop1(n)
b = zeros(1,n);
parfor (i = 1:n)
    b(i)=i+n;
end
end

function b = myloop2(n)
b = zeros(1,n);
parfor (i = 1:n)
    b(i)=i*n;
end
end
To set the value of the loop unrolling threshold to 6, and then generate a static library, run:
cfg = coder.CodeConfig;
cfg.LoopUnrollThreshold = 6;
codegen parallel_loops -config cfg
This is the generated code:
static void myloop1(double b[5])
{
  b[0] = 6.0;
  b[1] = 7.0;
  b[2] = 8.0;
  b[3] = 9.0;
  b[4] = 10.0;
}

static void myloop2(double b[6])
{
  int i;
#pragma omp parallel for num_threads(omp_get_max_threads())
  for (i = 0; i < 6; i++) {
    b[i] = ((double)i + 1.0) * 6.0;
  }
}

void parallel_loops(double x[5], double y[6])
{
  if (!isInitialized_parallel_loops) {
    parallel_loops_initialize();
  }
  myloop1(x);
  myloop2(y);
}
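The code generator unrolled only the parfor-loop that has five iterations, which is less than the threshold value. The parfor-loop with six iterations is not unrolled because its iteration count is equal to the threshold.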