How to preallocate memory for storing data in same mat file?

Question

Sunny on 20 Oct 2018

https://es.mathworks.com/matlabcentral/answers/425041-how-to-preallocate-memory-for-storing-data-in-same-mat-file

Commented: Guillaume on 26 Oct 2018

Hi, I wrote the code below and would like to preallocate memory so that it runs faster. I know that once I preallocate I can no longer append and must index into the output instead. Can you suggest how to do that for the code below?

Here f is a 1*5449 double. The final output is a 5449*5449 double.

clc;
n=1; %system order 
m=1; %number of inputs
p=6;%number of outputs
Final = []; 
for i = 1:7783
  for j = 1:50
      if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
          load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
          A1 = A{1}; 
          A1 = A1 / max(abs(eig(A1)));
          B1 = B{1}; 
          C1 = C{1};
          index = 1;
          for k = 1:7783
              for l = 1:50
                      if exist(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat'],'file')
                          load(['ID_',num2str(k),'_file_',num2str(l),'_Variables','.mat']);
                          A2 = A{1}; 
                          A2 = A2 / max(abs(eig(A2)));
                          B2 = B{1};  
                          C2 = C{1};
                          f(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
                          index = index + 1;
                      end
              end
          end 
          Final = [Final;f];
      end
  end
end
save('Distance','Final');

5 comments (3 older comments not shown)

Sunny on 21 Oct 2018

Edited: Sunny on 21 Oct 2018

Thanks. I changed the program to the version below, which I think is faster. A is a 10*10 double, B is 1*10 and C is 6*10. The cell arrays f, o and g are now 1*5449.

clc;
n=10; %system order 
m=1; %number of inputs
p=6;%number of outputs
Final = [];
k = 1;
for i = 1:7783
  for j = 1:50
      if exist(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat'],'file')
           load(['ID_',num2str(i),'_file_',num2str(j),'_Variables','.mat']);
           f{k} = A{1};
           o{k} = B{1}; 
           g{k} = C{1};
           k = k+1;
       end
   end
end
 save('Rescaled_A_Values_All_States','f');
 save('Rescaled_B_Values_All_States','o');
 save('Rescaled_C_Values_All_States','g'); 
for c = 1:5449
    A1 = f{c};
    A1 = A1 / max(abs(eig(A1)));
    B1 = o{c};
    C1 = g{c};
    index = 1;
    for d = 1:5449
        A2 = f{d};
        A2 = A2 / max(abs(eig(A2)));
        B2 = o{d};
        C2 = g{d};
        q(index) = distance1_matlab(A1,A2,B1,B2,C1,C2);
        index = index + 1;
    end
    Final = [Final;q];
end

Guillaume on 21 Oct 2018

Well, yes it's going to be much faster. You're reading each file only once. You're still doing N^2 unnecessary eigs and related calculations. And nearly 99% of the files you test for existence don't exist, so it'd be faster to do a dir so the OS just tells you which files are there.

Finally, depending on what distance1_matlab does, it may well be that your 2nd loop is not needed.
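
For the preallocation part of the question, here is a minimal sketch of the preallocate-then-index pattern, which also normalises each A only once. It runs on small random stand-in data rather than the real files, and distfun is a hypothetical placeholder for distance1_matlab (which is not shown in this thread), assumed here to return a scalar.

```matlab
% Stand-in data: N small random systems (in the thread N is 5449 and
% f, o, g are filled from the .mat files).
N = 4;
f = cell(1, N);                         % preallocate the cell arrays too
o = cell(1, N);
g = cell(1, N);
for k = 1:N
    f{k} = rand(10, 10);                % A: 10x10
    o{k} = rand(1, 10);                 % B: 1x10
    g{k} = rand(6, 10);                 % C: 6x10
end

% Hypothetical placeholder for distance1_matlab (assumed to return a scalar).
distfun = @(A1, A2, B1, B2, C1, C2) norm(A1 - A2, 'fro');

% Normalise each A once: N calls to eig instead of N^2.
Anorm = cell(1, N);
for k = 1:N
    Anorm{k} = f{k} / max(abs(eig(f{k})));
end

% Preallocate the full N-by-N result, then fill it by index instead of appending.
Final = zeros(N, N);
for c = 1:N
    for d = 1:N
        Final(c, d) = distfun(Anorm{c}, Anorm{d}, o{c}, o{d}, g{c}, g{d});
    end
end
```

With the real data, replace the stand-in block with the loaded f, o and g and call distance1_matlab in place of distfun. Row c of Final then holds the same values as the q vector appended in the loop above.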

Answer 1

Guillaume on 21 Oct 2018

https://es.mathworks.com/matlabcentral/answers/425041-how-to-preallocate-memory-for-storing-data-in-same-mat-file#answer_342536

Depending on what distance1_matlab does, this code could be significantly improved.

I'm also assuming that all files that match the pattern ID_*_file_*_Variables.mat need to be loaded.

filelist = dir('ID_*_file_*_Variables.mat');      %get list of files that exist
fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once');  %extract numeric ids as text
fileids = str2double(vertcat(fileids{:}));   %and convert to numeric
%you may want to sort fileids and filelist to match the order of your original loops
%it's trivial to do. For now I assume it does not matter.
filedata = struct('A', cell(numel(filelist), 1), 'B', [], 'C', []);  %preallocate structure to receive file content
%note that A, B and C are very poor field names.
for fileiter = 1:numel(filelist)
   filecontent = load(filelist(fileiter).name);   %load into a structure rather than straight into the workspace
   filedata(fileiter).A = filecontent.A{1} / max(abs(eig(filecontent.A{1})));   %normalise each A only once
   filedata(fileiter).B = filecontent.B{1};
   filedata(fileiter).C = filecontent.C{1};
end
[idx1, idx2] = ndgrid(1:numel(filedata));  %cartesian product of all file indices with themselves
cartprod1 = filedata(idx1);  %ndgrid expects numeric inputs, so build the product from indices
cartprod2 = filedata(idx2);
distance = arrayfun(@(s1, s2) distance1_matlab(s1.A, s2.A, s1.B, s2.B, s1.C, s2.C), cartprod1, cartprod2);  %assumes that the result of distance1_matlab is scalar

Note that the last line assumes distance1_matlab returns a scalar. If not, change it to:

distance = arrayfun(@(s1, s2) distance1_matlab(s1.A, s2.A, s1.B, s2.B, s1.C, s2.C), cartprod1, cartprod2, 'UniformOutput', false);

If you want the result in the same form as your original Final, then:

distance = distance(:);  %if distance1_matlab returns a scalar
distance = vertcat(distance{:});   %otherwise (cell array output)
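
The code comments above mention sorting fileids and filelist to match the order of the original loops. One way to do that is sketched below. The three file names are made up for illustration and stand in for the output of dir, which typically lists names alphabetically (so ID_10 comes before ID_2).

```matlab
% Hypothetical file names standing in for the result of dir(...).
names = {'ID_10_file_2_Variables.mat', ...
         'ID_2_file_11_Variables.mat', ...
         'ID_2_file_3_Variables.mat'};
filelist = struct('name', names);   % 1x3 struct array with a name field, like dir's output

fileids = regexp({filelist.name}, 'ID_(\d+)_file_(\d+)_', 'tokens', 'once');  % ids as text
fileids = str2double(vertcat(fileids{:}));   % N-by-2 numeric: [ID, file number]

% Sort numerically by ID, then by file number, and reorder filelist to match.
[fileids, order] = sortrows(fileids);
filelist = filelist(order);
```

After this, filelist follows the same ID-then-file order as the nested i and j loops in the question, so the rows of the distance matrix line up with the original Final.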

2 comments

Sunny on 26 Oct 2018

@Guillaume

Can I use parfor instead of for to speed up execution with parallel processing? Do the loops synchronize?

Guillaume on 26 Oct 2018

I doubt that using parfor for the loading loop would help much. The slow part of that is not the processor but the disk access. If anything, it's possible that parfor will slow things down as parallel threads compete for disk access. You'll only know if you try.

I don't know if the parallel toolbox can parallelise arrayfun (I don't have the toolbox). arrayfun is a for loop in disguise. Parallelising that code could certainly result in a speed-up.

However, as I've said (twice now), depending on what distance1_matlab does, it's likely that this 2nd loop/arrayfun is not needed at all and that the function can be vectorised. This would probably be the most efficient way to improve your code. Hence why I asked for the details of this function.
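
If the pairwise loop does turn out to be necessary, a parfor version could look roughly like the sketch below. This is untested against the real data. It assumes the Parallel Computing Toolbox is available (without it, MATLAB runs a parfor loop as an ordinary serial loop), uses random stand-in matrices, and again uses a hypothetical distfun in place of distance1_matlab.

```matlab
N = 4;                                   % 5449 in the thread
A = cell(1, N);
for k = 1:N
    M = rand(10, 10);
    A{k} = M / max(abs(eig(M)));         % normalise once, before the parallel loop
end

% Hypothetical placeholder for distance1_matlab (assumed to return a scalar).
distfun = @(A1, A2) norm(A1 - A2, 'fro');

Final = zeros(N, N);                     % preallocated, sliced by row inside parfor
parfor c = 1:N
    Ac = A{c};
    row = zeros(1, N);                   % temporary row, local to each iteration
    for d = 1:N
        row(d) = distfun(Ac, A{d});
    end
    Final(c, :) = row;                   % each iteration writes only its own row
end
```

Each parfor iteration has to be independent of the others, and execution only continues past the loop once all iterations have finished, so no explicit synchronisation is needed for a loop of this shape.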
