Usage of structure as input of a function. Is it efficient ?

Question

Antoine Laurin on 10 Aug 2020

https://es.mathworks.com/matlabcentral/answers/577627-usage-of-structure-as-input-of-a-function-is-it-efficient

Edited: James Tursa on 12 Aug 2020

Hello Mathworks community,

I have a general question about the efficiency / relevance of using structures as function inputs. I'll try to be clear...

I'm working on code that has around 300 variables. I find it very convenient to gather these variables as fields of a few (~5) structures, sorted by "type". That clears my workspace and allows me to make shorter function calls. For example, I have a structure Model that contains fields .param1 .param2 ... .param40 (all different; they can be scalars, small vectors or small 3D matrices).

So, when I want to do some computation, I just write:

[output] = some_function(Model, other_input) ;

Furthermore, I use a lot of functions / sub-functions / sub-sub-functions in my code. I know that this might not be the most appropriate approach in MATLAB, but I think I have good reasons to do so. First, it simplifies the structure of my code: I have a lot of sequential actions, so small, divided tasks are much easier to manage. Second, I intend to translate the code to C using MATLAB Coder to run on an embedded platform.

That scheme was fine for me, until I started working on optimization / computing time and realized that there was a big difference between the 2 solutions below:

function [output] = some_function(Model, other_input)
% [...]
% Solution 1        
% (tic)
% (for i = 1:1e6)
    output = some_sub_function(Model.param1, Model.param2, Model.param3, other_input) ;
% (end)
% (toc)             ==> ~4 seconds
    
% Solution 2        
P1 = Model.param1 ;
P2 = Model.param2 ;
P3 = Model.param3 ;
% (tic)
% (for i = 1:1e6)
    output = some_sub_function(P1, P2, P3, other_input) ;
% (end)
% (toc)             ==> ~0.4 seconds, 10 times faster !
% [...]
end
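
The comparison above can be reproduced with a small self-contained script. This is only a sketch under stated assumptions: the contents of Model, the stand-in function work (in place of some_sub_function) and the iteration count N are hypothetical, and the measured times will depend on the MATLAB release and on whether the code runs in a script or inside a function.

```matlab
% Sketch: compare field access inside the loop with access hoisted out of it.
% Model, work and N are hypothetical stand-ins, not the asker's actual code.
Model = struct('param1', 1, 'param2', [1 2 3], 'param3', ones(2, 2, 2));
other_input = 2;
work = @(p1, p2, p3, x) p1 * x + sum(p2) + p3(1);   % stand-in for some_sub_function
N = 1e5;

% Solution 1: the three fields are looked up on every iteration.
tic
for i = 1:N
    out1 = work(Model.param1, Model.param2, Model.param3, other_input);
end
t_inside = toc;

% Solution 2: the fields are read once, before the loop.
P1 = Model.param1;
P2 = Model.param2;
P3 = Model.param3;
tic
for i = 1:N
    out2 = work(P1, P2, P3, other_input);
end
t_hoisted = toc;

fprintf('fields inside loop: %.3f s, hoisted: %.3f s\n', t_inside, t_hoisted);
```

Both loops compute the same result; only the place where the fields are read differs.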

Since computing time is critical for my work, I am starting to doubt the choices I made. The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have gigantic code to navigate...

I've struggled to find resources online about this (maybe I didn't find the right keywords), so I am curious about any advice or feedback you might have.

Thanks !

Rik on 10 Aug 2020

Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for your function calls might already be the most important factor here.

Answer 1

Bruno Luong on 10 Aug 2020

https://es.mathworks.com/matlabcentral/answers/577627-usage-of-structure-as-input-of-a-function-is-it-efficient#answer_477670

Edited: Bruno Luong on 10 Aug 2020

"I've struggled to find resources online about that matter (maybe I didn't find the right keywords)"

No, it's not your fault; TMW rarely comments on such things. They think it's not the programmer's business to understand how things work. I just dislike the way they hide the details from programmers.

I have worked with MATLAB for about 20 years, and I have done a lot of timings trying to understand how things work. MEX programming also helps, since it exposes me to the mxArray data structure, and that gives hints about what is behind MATLAB statements.

In your case, which involves accessing structure fields: MATLAB stores the field names as char arrays, and every time you call

s.param1

it must go through the whole list of field names of the structure, doing string comparisons until it matches the string 'param1'; then it returns the corresponding mxArray (the structure field). That explains why it is not fast inside the for-loop. If you move the access out of the loop,

p1 = s.param1

the field-name matching is done outside the loop and p1 stores the mxArray, ready to be used.

Personally, I pass function arguments much as you do: a few structures as input arguments (usually one nested structure) and a single output. You just need to be careful with for-loops: prepare the body of the for-loop so that it contains very simple operations, and avoid calling functions, accessing structure fields or tables, multiple-level indexing, etc., and you'll be fine. You also need to pay attention to functions/statements that make memory copies of your data, as this also kills MATLAB performance. Object-oriented programming is also slow, so avoid objects, especially if they don't deal with large/complex data and come in arrays. Most of the time this comes in its purest form as "vectorization", and the for-loop disappears entirely.
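
As a small illustration of the "vectorization" mentioned above, here is a sketch in which a loop with a simple body is replaced by a single array expression. The struct fields gain and offset and the data x are hypothetical names chosen for the example; they do not come from the question.

```matlab
% Sketch: the same computation as a loop and in vectorized form.
% Model.gain, Model.offset and x are hypothetical names used for illustration.
Model = struct('gain', 2, 'offset', 1);
x = (1:5)';

% Loop version: field values are read once, outside the loop.
gain = Model.gain;
offset = Model.offset;
y_loop = zeros(size(x));
for k = 1:numel(x)
    y_loop(k) = gain * x(k) + offset;
end

% Vectorized version: no loop, one field access per parameter.
y_vec = Model.gain * x + Model.offset;

disp(isequal(y_loop, y_vec))
```

In the vectorized form each field is accessed once per call, so the cost of the field lookup no longer scales with the number of elements.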

James Tursa on 10 Aug 2020

Edited: James Tursa on 10 Aug 2020

I should have qualified that it can depend on the MATLAB version, the size of the variable involved, and how the expression is used. The rules for when to create shared data copies or reference copies for assignments and function arguments are not published (as you already know) and have changed multiple times in the past, so it is hard to keep up. An example test in Win64 R2017b:

>> format debug
>> s.param1 = 0
s = 
  struct with fields:
    param1: 0
>> s.param1
ans =
Structure address = 47c590c0
m = 1
n = 1
pr = 1b08152a0
pi = 0
     0
>> printAddress(s.param1)
0000000019DE8FE0
>> printStructFieldAddress(s)
0000000047C59670

So we have

47c590c0 = mxArray address of s.param1 at the m-file level
19DE8FE0 = mxArray address of s.param1 as passed into the mex function
47C59670 = mxArray address of s.param1 picked off of the data area of s directly

You can see that the original address of the s.param1 variable, 47C59670, does not match the other two addresses. So either a shared data copy or deep copy was made (I didn't check). It can be the case that MATLAB will make deep copies of scalars in some cases where otherwise it would make shared data copies.

And yet running a different example with a non-scalar field element yields:

>> s.param1 = 1:5
s = 
  struct with fields:
    param1: [1 2 3 4 5]
>> s.param1
ans =
Structure address = 47c59910
m = 1
n = 5
pr = f6066ea0
pi = 0
     1     2     3     4     5
>> printAddress(s.param1)
0000000047C592F0
>> printStructFieldAddress(s)
0000000047C59910

So we have

47c59910 = mxArray address of s.param1 at the m-file level
47C592F0 = mxArray address of s.param1 as passed into the mex function
47C59910 = mxArray address of s.param1 picked off of the data area of s directly

So the s.param1 expression at the m-file level produces the original address picked off the s data area directly, but when used as a function argument to a mex routine the address does not match indicating that a shared data copy was made. This behaviour of making shared data copies when field or cell elements are used in expressions or as function arguments is what I have observed in the past.

On a related note, it used to be that top-level workspace variables were passed as shared data copies to m-file functions but as original addresses to mex functions. Some years ago (R2015b) even that changed, so that shared data copies were passed to mex functions as well. Details of when this happened can be found in my matlab_version FEX submission. See this link:

https://ch.mathworks.com/matlabcentral/answers/102641-how-can-i-obtain-the-memory-address-for-a-variable-in-matlab-7-7-r2008b?s_tid=answers_rc1-1_p1_MLT

Complications in later versions of MATLAB can be that simple assignments at the m-file level can produce reference copies of variables instead of shared data copies in some cases when the variable is "small". Again, rules for when this happens are not published.

The mex functions:

// printAddress.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if( nrhs ) {
        mexPrintf("%p\n",prhs[0]);
    }
}

and

// printStructFieldAddress.c
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray **vp;
    if( nrhs && (vp = mxGetData(prhs[0])) ) {
        mexPrintf("%p\n",*vp);
    }
}

James Tursa on 10 Aug 2020

Edited: James Tursa on 10 Aug 2020

But this is a fundamental change from how they used to do it. It used to be that they always passed in shared data copies of variables into mfile functions (or deep copies of scalars). Then there was no special logic needed inside the function ... the usual copy-on-write rules applied same as always. But now it looks like they are passing in the original addresses and keeping track off to the side whether it is an input argument. So the copy-on-write gets triggered because it is an input argument, not because it is a shared data copy of another variable. So, effectively MATLAB is treating the variable as a shared data copy off to the side. There is nothing in the variable itself (i.e., the CrossLink list) that would tell you this, making it impossible for a mex routine to detect even simple variable sharing just by looking at the mxArray details. I think I am going to have to download R2020a and experiment. Maybe arguments are now passed in as reference copies instead of shared data copies. Makes in place variable changes in a mex routine even more vulnerable than it used to be.

Bruno Luong on 12 Aug 2020

Before, the data sharing was at the level of the data pointer under the mxArray structure.

Now it seems to have gone up one level, and the sharing is at the mxArray pointer itself.

Maybe they moved the "super-cross-link" up to the top root level, and it is now located somewhere in a master table handled by the MATLAB engine and no longer accessible to users?

James Tursa on 12 Aug 2020

Edited: James Tursa on 12 Aug 2020

That's what I suspect. I haven't done exhaustive testing, but here is what I am seeing in the mxArray header:

R2019a

Reverse Cross Link pointer spot, points to "previous" variable in shared data copy linked list

Cross Link pointer spot, points to "next" variable in shared data copy linked list

R2020a

Reverse Cross Link pointer spot, points to an integer that contains the number of shared data copies in existence.

Cross Link pointer spot, NULL

Not sure if the change happened in R2019b or R2020a ... will have to check this later.

You can still detect that it is a shared data copy by looking at that integer, but you can't find the copies because that linked list is no longer part of the mxArray header that I can see. I guess it is good that you can still detect data sharing in a mex routine (e.g., as a check prior to doing any inplace operation), but I prefer the old way where you can actually find the copies through the linked list.

As a bit of history, many years ago shared data copies were always the method used for individual variable sharing, and reference copies were the method used for cell and struct element sharing. Then, a few years ago, they started using reference copies for individual variable sharing. Now it seems they have taken it a step further and are using reference copies for function arguments. That makes sense because reference copies are more efficient, but functions like reshape( ) and typecast( ) must still use the shared data copy method because stuff in the mxArray header changes.

Answer 2

Matt J on 10 Aug 2020

https://es.mathworks.com/matlabcentral/answers/577627-usage-of-structure-as-input-of-a-function-is-it-efficient#answer_477793

Edited: Matt J on 12 Aug 2020

"The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have gigantic code to navigate..."

But if your code is doing such intricate things as to be "gigantic", then surely the struct access timings that you have shown us, whether 4 sec. or 0.4 sec., are going to be a small overhead compared to the rest of your actual computation. Why care about a 4 sec. difference if the whole computation takes 20 minutes or so?

If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does.

Antoine Laurin on 11 Aug 2020

Thanks for your replies.

"Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for your function calls might already be the most important factor here."

"If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does. "

I have quite a lot of iterations in my code yes, since I try to reproduce a real-time algorithm on a large time-scale. For each iteration, I don't have "that many" actions, but still a few.

I can surely reduce a little bit the number of functions I use.

"Personally, I pass function arguments much as you do: a few structures as input arguments (usually one nested structure) and a single output. You just need to be careful with for-loops: prepare the body of the for-loop so that it contains very simple operations, and avoid calling functions, accessing structure fields or tables, multiple-level indexing, etc., and you'll be fine. You also need to pay attention to functions/statements that make memory copies of your data, as this also kills MATLAB performance. Object-oriented programming is also slow, so avoid objects, especially if they don't deal with large/complex data and come in arrays. Most of the time this comes in its purest form as 'vectorization', and the for-loop disappears entirely."

Noted.

I think my difficulty here is understanding when MATLAB makes memory copies and when it doesn't, and evaluating whether such a copy is time-consuming relative to what the function does. That is not easy... The profiler, for example, is of no help for that.

Antoine Laurin on 11 Aug 2020

Exactly.

Looking at the documentation, I found:

https://fr.mathworks.com/help/matlab/matlab_prog/avoid-unnecessary-copies-of-data.html

"Copy-on-Write

If a function does not modify an input argument, MATLAB does not make a copy of the values contained in the input variable."

https://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/

"Structures and Memory

Each structure member is treated as a separate array in MATLAB. This means that if you modify one member of a structure, the other members, which are unchanged, are not copied. It's time for an illustration here."

So this is comforting.
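
The behaviour described in the two quoted passages can be sketched as follows. The struct S and its fields a and b are hypothetical, and the script only shows the observable result (the original is untouched after the copy is modified); whether and when MATLAB actually duplicates memory is internal and cannot be seen from this code.

```matlab
% Sketch of the value semantics described in the quoted passages.
% The struct S and its fields a and b are hypothetical.
S.a = zeros(1, 5);
S.b = ones(1, 5);

T = S;          % assignment alone; per the quoted docs, no data needs copying yet
T.a(1) = 42;    % writing to T.a affects T only; S keeps its original values

disp(S.a(1))    % value in the original struct
disp(T.a(1))    % value in the modified copy
disp(isequal(S.b, T.b))
```

According to the quoted blog post, only the modified member is copied at the write, while the unchanged members are not.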

Matt J on 11 Aug 2020

Another thing to consider is that the overhead you see in MATLAB may not be there in the final C/C++ version that you obtain from MATLAB Coder. I would expect struct indexing in C/C++ to be less encumbered, because the compiler has the opportunity to sort out in advance which fields are being accessed and where the relevant memory is located.

Answer 3

Walter Roberson on 12 Aug 2020

https://es.mathworks.com/matlabcentral/answers/577627-usage-of-structure-as-input-of-a-function-is-it-efficient#answer_478479

Earlier I mentioned that I had done some timing tests. I was not able to find the code for that, so I created some new code.

In the below tests, on my R2020a Mac system, this is a summary of the results:

Accessing a parameter and accessing a local variable are fastest, and indistinguishable in this test; the legend entries for these are 'parameter' and 'local'. These do not access the structure passed in, and are there to give baselines to compare struct access against.
Surprisingly, using a constant value is a bit slower, with irregular timing. This does not make sense to me at the moment. (This is the first line created below, a blue line in the plot, legend 'constant'.)
Accessing a fixed field name of an input structure is roughly 25% slower than accessing a plain local or parameter, which is still quite fast. The timings for the first (of 1000) fields and the last of them in the struct are slightly different but quite close; legend entries 'Fstat' and 'Lstat'. Considering the other timing tests, we have to rule out the possibility that, for repeated calls to get the same field, it has to search through the field-name list every time. We cannot, however, currently rule out the possibility that the JIT is remembering the location after the first iteration.
Accessing a struct field name dynamically takes roughly twice as long as accessing a static field name. The timings for the first field in the struct, legend 'Fdyn', the last field in the struct, legend 'Ldyn', the middle field in the struct, legend 'Mdyn', and a random field in the struct, legend 'Rdyn', are pretty irregular and not at all well separated. The timings do not rule out the possibility that first (Fdyn) is slightly faster than the others, but the uncertainty is high.
If, hypothetically, structure reference required searching the field-name list linearly each time, then Fstat (first static) and Fdyn (first dynamic) would be nearly indistinguishable, and Lstat (last static) and Ldyn (last dynamic) would be nearly indistinguishable, and there would be a clear difference between the First and Last cases, but neither of these is true. If, hypothetically, the JIT is merely caching the last field name accessed under the hypothesis that it might be used again, with the same hypothetical linear search mechanism being used for both cases, then the large random variation for the dynamic access should be no more than the random variation for the static accesses, but the random variations are clearly quite different. Static field-name access appears to be implemented through a different mechanism than dynamic field-name access -- or at least is JIT'd quite differently.
In the functions, the assignment to local appears in all tests in order to eliminate the possibility that creating the local variable was "significantly more work" that would only be done for the 'local' test. On the other hand, it could be the case that the JIT is discarding the initialization if the variable is not used, so a more careful test would force a "use" of the variable in a way that could not be JIT'd away. Similar logic to force use of the structure input and the auxiliary input should also be added in a more thorough test.
An analysis weakness of the current test is that it passes in the same struct each time, so we cannot at present speak to Bruno's point that it cannot know the offset of a particular static field name.

Anyhow: yes, using a field from a struct that is passed in is pretty efficient compared to many other possibilities. It is not as efficient as using a parameter. But a more thorough test would require passing in 1000 parameters and comparing the timing to access the "first", "last" or a "random" parameter, because hypothetically parameter names are not hard-wired to parameter number references. (The MathWorks documentation on speed of variable access does imply that named parameter access is hard-wired to a stack offset by the JIT, and does imply that parameter access could be faster than local variables.)

N = 100;
NFs = 1000;
S.walla = 1;
for K = 1 : NFs-2
    name = char(randi(0+['a', 'z'], 1, 5));
    S.(name) = K + 1;
end
S.bingo = NFs;
FN = fieldnames(S);
middle = FN{floor(end/2)};
Fs = {@() time_constant(S, 7), ...
      @()time_parameter(S, 7), ...
      @()time_local(S,7), ...
      @()time_first_static(S, 'walla'), ...
      @()time_last_static(S,'bingo'), ...
      @()time_dynamic(S, 'walla'), ...
      @()time_dynamic(S, 'bingo'), ...
      @()time_dynamic(S, middle) };
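Labels = {'constant', 'parameter', 'local', 'Fstat', 'Lstat', 'Fdyn', 'Ldyn', 'Mdyn', 'Rdyn'};

NT = length(Fs);            % number of fixed tests; NFs keeps its meaning (target field count)
times = zeros(N, NT+1);

% Columns 1..NT: one fixed test per column, N repetitions each.
for J = 1 : NT
    F = Fs{J};
    for K = 1 : N
        times(K,J) = timeit(F, 0);
    end
end

% Last column ('Rdyn'): a random field per repetition, drawn from all fields of S.
for K = 1 : N
    lab = FN{randi(numel(FN))};
    F = @()time_dynamic(S, lab);
    times(K,NT+1) = timeit(F, 0);
end
plot(times);
legend(Labels);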
  
  function r = time_constant(S, const) %#ok<INUSD>
      local = -123; %#ok<NASGU>
      r = -1;
  end
  
  function r = time_parameter(S, const) %#ok<INUSL>
      local = -123; %#ok<NASGU>
      r = const;
  end
  
  function r = time_local(S, const) %#ok<INUSD>
      local = -123; %#ok<NASGU>
      r = local;
  end
  
  function r = time_first_static(S, label) %#ok<INUSD>
      local = -123; %#ok<NASGU>
      r = S.walla;
  end
  
  function r = time_last_static(S, label) %#ok<INUSD>
      local = -123; %#ok<NASGU>
      r = S.bingo;
  end
  
  function r = time_dynamic(S, label)
      local = -123; %#ok<NASGU>
      r = S.(label);
  end
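
For readers less familiar with the two access forms compared above, the following reduced sketch shows static and dynamic field access side by side and times each with timeit. The struct here has only two fields and the variable names are hypothetical, so it does not reproduce the 1000-field test; timeit may also warn that such tiny functions run too fast to measure accurately.

```matlab
% Sketch: static vs dynamic field access on a small struct.
% S, name and the timing variables are hypothetical, for illustration only.
S = struct('walla', 1, 'bingo', 1000);
name = 'bingo';

v_static  = S.bingo;     % static field name, fixed in the source code
v_dynamic = S.(name);    % dynamic field name, taken from a char vector at run time

% Time each access form; the absolute values depend on machine and release.
t_static  = timeit(@() S.bingo);
t_dynamic = timeit(@() S.(name));

fprintf('static: %.3g s, dynamic: %.3g s\n', t_static, t_dynamic);
```

Both forms return the same value; any difference lies only in how the field is located.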
  
