MATLAB Answers

How do I efficiently stitch together logged data from several v4 MAT files into a huge structure with timeseries data?

Michael on 7 Jun 2018
Edited: Michael on 11 Jun 2018
I am running a large Simulink model for a long time and need to save the data of a huge multi-level hierarchical bus. The To File block has some limitations, so I've written my own similar block. It logs data to v4 MAT files because that is a very easy (and fast) way to write a simple header and then stream an arbitrary amount of data to a file. However, as with all MAT file versions prior to v7.3, the file size is limited. I need to save roughly 12 GB of data, so I save it in chunks. Each .mat file first gets a character array variable that holds the full list of bus leaf nodes as a comma-separated list, so that the bus hierarchy can later be rebuilt into a structure in which each leaf node is a timeseries. Each .mat file then gets a single M-by-N 2-D array, where M is the number of leaf nodes (plus one for simulation time) and N is the number of data points that I can safely put in one .mat file without hitting the maximum file size or the maximum number of elements per array. When a .mat file is "full", I close it (writing the value of N to the header, since it wasn't known initially) and start a new file.
The above process works nicely for the logging itself, but post-processing the data involves a few very slow and memory-expensive operations. The first is stitching the data together. Right now, I just go through the files one at a time and concatenate them like this:
data_varTMP = [data_varTMP data_var__];
where data_var__ is the M-by-N array in each file. Once the entire huge array is stitched together, I build the hierarchical structure variable, making each leaf node a timeseries, using this:
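% Split the comma-separated leaf-node list into a cell array of names.
varNameCellArray__ = textscan(commaSepSigNames_var__,'%s','delimiter',',');
varNameCellArray__ = varNameCellArray__{1};
simtime__ = data_var__(1,:);  % row 1 holds the simulation time
for idxVar__=2:numel(varNameCellArray__)
    tmpAvarName__ = varNameCellArray__{idxVar__};  % assumed: this assignment was omitted from the excerpt
    eval([fileName_var__ '.' tmpAvarName__ ' = timeseries(data_var__(' num2str(idxVar__) ',:)'', simtime__,''Name'', tmpAvarName__ );']);
end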
This too takes a long time. The final step, which also takes a long time, is to save my final structure variable as a v7.3 MAT file.
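For reference, the two post-processing steps described above could be restructured roughly as follows. This is only a sketch under stated assumptions, not a tested solution for the real 12 GB data set: the chunk file names, the sizes, the signal names, and the variables data_all and logs are hypothetical, and the name list is taken as already read into the workspace. It preallocates the full array once (using whos to read each chunk's size without loading it) instead of growing it by concatenation, and it uses setfield with a split field path in place of eval.

```matlab
% Synthetic stand-in for the logged v4 chunks (hypothetical names and sizes).
tmpDir = tempname;
mkdir(tmpDir);
commaSepSigNames_var__ = 'time,ctrl.cmd,ctrl.err,plant.out';
M = 4;                        % rows: simulation time + 3 leaf signals
chunkN = [5 7 3];             % number of columns in each chunk file
t0 = 0;
for k = 1:numel(chunkN)
    data_var__ = rand(M, chunkN(k));
    data_var__(1, :) = t0 + (1:chunkN(k));   % row 1 is simulation time
    t0 = t0 + chunkN(k);
    save(fullfile(tmpDir, sprintf('chunk_%03d.mat', k)), 'data_var__', '-v4');
end

% Pass 1: read only the array sizes, without loading any data.
files = dir(fullfile(tmpDir, 'chunk_*.mat'));
nCols = zeros(1, numel(files));
for k = 1:numel(files)
    info = whos('-file', fullfile(tmpDir, files(k).name), 'data_var__');
    nRows = info.size(1);
    nCols(k) = info.size(2);
end

% Pass 2: allocate once, then copy each chunk into its column block.
data_all = zeros(nRows, sum(nCols));
col = 0;
for k = 1:numel(files)
    S = load(fullfile(tmpDir, files(k).name), 'data_var__');
    data_all(:, col + (1:nCols(k))) = S.data_var__;
    col = col + nCols(k);
end

% Build the nested structure without eval: split each dotted name into
% its field path and let setfield create the intermediate levels.
names = strsplit(commaSepSigNames_var__, ',');
simtime = data_all(1, :).';
logs = struct();
for k = 2:numel(names)
    parts = strsplit(names{k}, '.');
    ts = timeseries(data_all(k, :).', simtime, 'Name', names{k});
    logs = setfield(logs, parts{:}, ts);
end

rmdir(tmpDir, 's');           % remove the synthetic chunk files
```

The preallocated copy touches each element once, whereas repeated concatenation reallocates and copies the growing array for every file. Creating one timeseries object per leaf node may still dominate the run time when there are many leaves; the sketch does not address that.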
I expect that there are better ways to handle some of these slow steps, but I am not sure of the best approach. I am thinking the matfile function might help, but its documentation says that it is only efficient with v7.3 files and that it cannot index into fields within structures, so I'm not sure how best to leverage it.
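To make the matfile idea concrete, here is a minimal sketch of one way it might be used: the stitched data is written as a plain numeric array (not a structure, since matfile cannot index into structure fields) directly into a v7.3 file, one column block at a time, so the full array never has to sit in memory. The sizes, the output file name, and the variable name data_all are assumptions for illustration, and each chunk is faked in memory where the real code would load a v4 file.

```matlab
% Hypothetical sizes and output file, for illustration only.
M = 4;                          % rows: simulation time + leaf signals
chunkN = [5 7 3];               % number of columns in each chunk
outFile = [tempname '.mat'];

% matfile creates a v7.3 MAT-file when the named file does not exist yet.
m = matfile(outFile, 'Writable', true);
m.data_all(M, sum(chunkN)) = 0; % preallocate the full array on disk

col = 0;
for k = 1:numel(chunkN)
    chunk = k * ones(M, chunkN(k));   % stand-in for one loaded v4 chunk
    m.data_all(1:M, col+1:col+chunkN(k)) = chunk;
    col = col + chunkN(k);
end

% Read back the size and the last column without loading the whole array.
sz = size(m, 'data_all');
lastCol = m.data_all(1:M, sz(2));

delete(outFile);                % remove the temporary output file
```

With this layout, the hierarchical timeseries structure would have to be built later from slices of data_all, for example one signal row at a time, rather than being stored in the file itself.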



Answers (0)
