Loops slowing down dramatically with increased iterations

Timothy on 4 Feb 2020
Commented: Timothy on 6 Feb 2020
I am running a script that splits a folder of long-duration acoustic files into shorter-duration files. Time is tracked by sample number and by the timestamps in the filenames. Once each shorter chunk of data is loaded, it is saved with the appropriate timestamp. A series of for loops and if statements handles the different cases. Unfortunately, the script starts out really fast and then becomes slower and slower as the iterations increase. I have read similar questions about this issue, but I am still unable to identify why it is happening. Could memory handling be the source?
Any suggestions on how to prevent this, or on how to increase efficiency, would be greatly appreciated. I am using MATLAB R2017a on a Linux OS.
clear all; close all; clc;
datadir = uigetdir('Select data folder with all files to be split');
cd(datadir)
flist = dir('*.wav');
rec_length = input('Input length of recording required in seconds: ');
fname = flist(1).name;
Ndot = find(fname == '_'); %will look for the indices of underscores
NdotPrefix = Ndot(3); %which underscore precedes the timestamp
FilePrefix = fname(1:NdotPrefix); %Gets filename part that precedes timestamp
NdotSuffix = Ndot(5); %which underscore follows the timestamp
FileSuffix = fname(NdotSuffix:end); %Gets filename part that follows timestamp
clear fname
for nf = 1:length(flist)
    disp(['Input file No.: ',num2str(nf)])
    fname = flist(nf).name;
    if nf == 1
        I = audioinfo(fname);
        Ns = I.TotalSamples; %total number of samples
        Fs = I.SampleRate;
        NS_rec = rec_length*Fs; % section length in number of samples, e.g. 60*Fs for 1 min files
        Nrec = floor(Ns/NS_rec); %This determines how many complete files will be made from the original file, e.g. how many minutes
        timestep = 0;
        OrigTime = fname(NdotPrefix+1:NdotSuffix-1);
        start_time = datenum(OrigTime,'yymmdd_HHMMSS');
        for nr = 1:Nrec+1 %breaking up recording to X min, then the +1 indicates what is left over as may not make a whole X min file
            if nr <= Nrec
                S = audioread(fname, [NS_rec*(nr-1)+1, NS_rec*nr]);
                time = start_time + timestep*rec_length/24/60/60;
                dst = datestr(time,31);
                if nr == 1
                    newfname =['S_',FilePrefix,dst(3:4),dst(6:7),dst(9:10),'_',dst(12:13),dst(15:16),dst(18:19),FileSuffix];
                else
                    newfname =['S_',FilePrefix,dst(3:4),dst(6:7),dst(9:10),'_',dst(12:13),dst(15:16),dst(18:19),FileSuffix];
                end
                audiowrite(newfname, S, Fs);
                timestep = timestep + 1;
                clear S time dst newfname
            else % this is for the remaining data at the end of the original that is under X min and wont make a whole Xmin file
                SExtra = audioread(fname, [NS_rec*(nr-1)+1, I.TotalSamples]);
            end
        end
        clear I Ns Fs NS_rec Nrec OrigTime
    else
        I = audioinfo(fname);
        Ns = I.TotalSamples + length(SExtra);
        Fs = I.SampleRate;
        NS_rec = rec_length*Fs;
        Nrec = floor(Ns/NS_rec);
        offset = length(SExtra);
        for nr = 1:Nrec+1
            if nr <=Nrec
                if nr == 1
                    S = [SExtra;(audioread(fname, [NS_rec*(nr-1)+1, NS_rec*nr-offset]))];
                    time = start_time + timestep*rec_length/24/60/60;
                    dst = datestr(time,31);
                    newfname =['S_',FilePrefix,dst(3:4),dst(6:7),dst(9:10),'_',dst(12:13),dst(15:16),dst(18:19),FileSuffix];
                    audiowrite(newfname, S, Fs);
                    timestep = timestep + 1;
                    clear SExtra S time dst newfname
                else
                    S = audioread(fname, [NS_rec*(nr-1)+1-offset, NS_rec*nr-offset]);
                    time = start_time + timestep*rec_length/24/60/60;
                    dst = datestr(time,31);
                    newfname =['S_',FilePrefix,dst(3:4),dst(6:7),dst(9:10),'_',dst(12:13),dst(15:16),dst(18:19),FileSuffix];
                    audiowrite(newfname, S, Fs);
                    timestep = timestep + 1;
                    clear S time dst newfname
                end
            else
                SExtra = audioread(fname, [NS_rec*(nr-1)+1-offset, I.TotalSamples]);
                if nf == length(flist)
                    time = start_time + timestep*rec_length/24/60/60;
                    dst = datestr(time,31);
                    newfname =['S_',FilePrefix,dst(3:4),dst(6:7),dst(9:10),'_',dst(12:13),dst(15:16),dst(18:19),FileSuffix];
                    audiowrite(newfname, SExtra, Fs);
                    timestep = timestep + 1;
                end
            end
        end
        clear I Ns Fs NS_rec Nrec offset
    end
end
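
As a side note on the filename construction in the script above: the timestamp is assembled by slicing the output of datestr(time,31), which has the layout 'yyyy-mm-dd HH:MM:SS'. A minimal sketch, assuming the same 'yymmdd_HHMMSS' layout the script already passes to datenum, suggests the slicing could be replaced by a single format string. The date value below is made up purely for illustration.

```matlab
% Hypothetical timestamp, chosen only for illustration.
t = datenum(2020, 2, 4, 13, 5, 9);

% Approach used in the script: format 31 is 'yyyy-mm-dd HH:MM:SS',
% and the pieces are cut out by character position.
dst = datestr(t, 31);
sliced = [dst(3:4), dst(6:7), dst(9:10), '_', dst(12:13), dst(15:16), dst(18:19)];

% Alternative: ask datestr for the target layout directly.
direct = datestr(t, 'yymmdd_HHMMSS');

% Both variables hold the same 13-character timestamp string.
disp(strcmp(sliced, direct))
```

This does not affect speed in any meaningful way; it only shortens the repeated newfname lines.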

Accepted Answer

Stephen23 on 5 Feb 2020
Edited: Stephen23 on 5 Feb 2020
"...any suggestions to increase efficiency"
  • Do not use cd to read data files; always use absolute/relative filenames. Yes, really. When you add folders to the MATLAB Search Path, MATLAB has to keep track of any changes to the files in those folders. As you add files, it will slow MATLAB down.
  • clear is rarely required... and clearvars is almost always a better, more efficient alternative anyway. But the best choice is to leave garbage collection up to MATLAB: the more you micro-manage variable clearing, the more you interrupt MATLAB's efficient inbuilt memory management.
  • Move file I/O out of the loops, or up to parent loops as much as possible. File I/O is still expensive.
  • Read this: https://www.mathworks.com/help/matlab/matlab_prog/techniques-for-improving-performance.html
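
The first and third points might look roughly like the sketch below. It is an illustration under assumptions, not a drop-in replacement for the script in the question: the folder comes from tempname and the demo_*.wav files are made-up stand-ins for real recordings. The current folder is never changed, every file is addressed through fullfile, and audioinfo is queried once per file rather than inside an inner loop.

```matlab
% Create a throwaway folder with two short WAV files (stand-ins for real data).
datadir = tempname;            % hypothetical data folder
mkdir(datadir);
Fs = 8000;                     % hypothetical sample rate
for k = 1:2
    audiowrite(fullfile(datadir, sprintf('demo_%d.wav', k)), zeros(Fs, 1), Fs);
end

% List the files through a full pattern; cd is never called.
flist = dir(fullfile(datadir, '*.wav'));
nsamples = zeros(numel(flist), 1);
for nf = 1:numel(flist)
    fpath = fullfile(datadir, flist(nf).name);  % full path to this file
    info = audioinfo(fpath);                    % queried once per file
    nsamples(nf) = info.TotalSamples;
    x = audioread(fpath, [1, Fs/2]);            % read only the first half second
end

rmdir(datadir, 's');           % remove the throwaway folder
```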
  3 Comments
Stephen23 on 5 Feb 2020
Edited: Stephen23 on 5 Feb 2020
"Would you add the folder to the path of Matlab ..."
No, I would not do that.
"...and then just call them individually given their path"
I have no idea what you mean by "call them individually". Functions are called. Files are opened, closed, written, read, imported, etc. I wrote that you should use absolute/relative filenames when reading/writing data files.
"As the new files need to be saved would it be appropriate to save them to a different folder that is not in the current directory?"
It is a good practice to keep your code files and your data files separate. All functions that read/write to data files accept absolute/relative filenames, and this is the most efficient way to access data files. As I wrote in my answer, polluting your Search Path with data folders just slows things down. Using cd makes debugging harder.
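
A minimal sketch of that idea, assuming made-up folder and file names (indir, outdir, long.wav and S_part*.wav are all hypothetical): the input is read from one folder and the pieces are written to a different folder, with both addressed through fullfile so that neither has to be the current directory or on the Search Path.

```matlab
indir  = tempname;  mkdir(indir);      % hypothetical input data folder
outdir = tempname;                     % hypothetical output folder
if ~exist(outdir, 'dir')
    mkdir(outdir);                     % create the output folder if missing
end

Fs  = 8000;                            % hypothetical sample rate
src = fullfile(indir, 'long.wav');
audiowrite(src, zeros(2*Fs, 1), Fs);   % 2 s stand-in for a long recording

% Split into 1 s pieces, reading from indir and writing to outdir.
for nr = 1:2
    S = audioread(src, [Fs*(nr-1)+1, Fs*nr]);
    dstfile = fullfile(outdir, sprintf('S_part%d.wav', nr));
    audiowrite(dstfile, S, Fs);
end

outfiles = dir(fullfile(outdir, '*.wav'));   % the two pieces just written
```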
Timothy on 6 Feb 2020
Thanks a million, Stephen. Not opening and writing files to the current directory did the trick. I had no idea previously about how much that could slow things down...pretty much to a complete standstill. That mistake will not be made again.


More Answers (1)

the cyclist on 4 Feb 2020
I have not gone through your code, but this is a classic symptom of failing to preallocate matrices, and instead letting them grow incrementally. See this documentation for guidance.
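
For readers unfamiliar with the term, a generic sketch of the difference follows. It is not taken from the code in the question; the variable names and loop body are made up for illustration.

```matlab
n = 1e4;

% Growing: the array is resized on every iteration.
grown = [];
for k = 1:n
    grown(end+1) = k^2; %#ok<SAGROW>
end

% Preallocated: storage is reserved once, then filled in place.
pre = zeros(1, n);
for k = 1:n
    pre(k) = k^2;
end

same = isequal(grown, pre);   % both loops produce the same row vector
```

Wrapping each loop in tic and toc is one way to compare the two on a given machine.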
  1 Comment
Timothy on 5 Feb 2020
Edited: Timothy on 5 Feb 2020
Thanks. I am not resizing matrices, but instead replacing the values within matrices/vectors within the loop. I thought this might be a problem of memory resulting from the copying of old versions of vectors before replacing values. Therefore, I just decided to clear those variables and start with a clean workspace on each iteration of the loop. Would I want to still preallocate in some way?
