MATLAB Answers

How to use timetable with hierarchical data

Jon on 25 Jul 2019
Edited: Eric Sofen on 29 Jul 2019
I have made wide use of MATLAB tables for analyzing my experimental data. Since much of my experimental data is time based (measurements are made at discrete times) I would also like to take advantage of the additional functionality of MATLAB timetables for my analysis.
Much of my data naturally has a hierarchical structure. For example, I have a machine which has various sections; within each section are multiple cavities, and within each cavity are multiple location types. At each of these locations I make temperature measurements and record the times at which the measurements were taken. A typical example is given in the attached file.
This type of data is handled very well by MATLAB tables. I just make columns for the sample time, section number, cavity number, location type, location name and measured temperature. I can then use the section numbers, cavity numbers, etc. as grouping variables, or otherwise use logical indexing to get the rows I want.
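As a minimal sketch of that table workflow (the column names and values below are made up for illustration and are not taken from the attached file; the location columns are left out for brevity), logical indexing and grouping might look like this:

```matlab
% Illustrative table; names and values are assumptions, not the attached data.
SampleTime  = seconds([0 6 12 0 6 12])';
Section     = [1 1 1 1 1 1]';
Cavity      = [1 1 1 2 2 2]';
Temperature = [10 20 30 1 2 3]';
T = table(SampleTime, Section, Cavity, Temperature);

% Logical indexing: all rows for section 1, cavity 2.
rows = T(T.Section == 1 & T.Cavity == 2, :);

% Grouping variables: mean temperature per (Section, Cavity) pair.
stats = groupsummary(T, ["Section" "Cavity"], "mean", "Temperature");
```

Here groupsummary leaves Section and Cavity untouched as group labels and only averages Temperature, which is the behaviour that retime does not offer.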
The problem comes when I try to use timetables to hold this data. While I can simply use table2timetable to convert the original table to a timetable, I do not seem to be able to take advantage of some of the basic timetable functionality. For example, calling retime with regular spacing, a specified time step, and aggregation by mean results in the mean of the cavity numbers, the mean of the location numbers, as well as lots of NaN values.
Running the lines of code below on the attached table illustrates the problem.
First, retime with aggregation by mean will not work at all when the table has a column of names (character variables), because the mean of a character variable is not defined.
% convert to time table
Ttt = table2timetable(Temperature);
% attempt to retime when timetable has the name column (character variables)
Trt = retime(Ttt,'regular','mean','TimeStep',seconds(18))
results in
Error using timetable/retime (line 140)
All variables in input timetables must be numeric, datetime, or duration when synchronizing using 'mean'.
If I eliminate the name column, the command executes, but the results are not what would be hoped for. All the columns are aggregated. This results in meaningless values such as mean cavity and location values (they can only be integers, there is no cavity 1.5).
% convert to time table
Ttt = table2timetable(Temperature);
% attempt to retime without the name column
Trt = retime(Ttt(:,[1 2 3 5]),'regular','mean','TimeStep',seconds(18))
Similar problems will occur with trying to synchronize such timetables, and other useful timetable functions.
I think this is a fundamental limitation of working with timetables. It seems that I would have to break my data up into multiple timetables, one for each section, cavity, and location within the cavity. This is not desirable, however, and I wonder if there is any other approach in which I can maintain a single timetable with all of my data.
Any approaches to keeping hierarchical data in timetables would be appreciated.

  2 Comments

Guillaume on 25 Jul 2019
Well, yes, retime and synchronize have no concept of grouping variables. What's not clear to me is what you want these functions to do when you have grouping variables. Apply retime to each group independently? Apply a common timebase to all the groups? Even less clear is how you'd synchronise two tables/timetables: do you synchronise each common group of the two tables independently (and if one table has a group that the other hasn't, what then)?
If that's what you want, I'm sure it's not that complicated to implement (although I've not really investigated yet), but certainly it's not something that's currently implemented.
Jon on 25 Jul 2019
Good questions. I am really interested in the case where my measurements are on the leaves of a tree structure. (This is maybe a more precise way to say that the data is hierarchical.) Different physical measurements may be made at multiple times for each leaf.
So for example we could have temperature, pressure, intensity, brightness, etc., all for the same leaf.
So for example in my case a leaf would be section 3, cavity 2, location 1.
So in the retime case it would make sense to retime each of the leaves as a group.
If I had two timetables with data for the same sets of leaves (e.g. data for sections, cavities, and locations) one with say temperature and pressure taken at 1 second intervals another with intensity and brightness, then I think it would make sense to synchronize them leaf to leaf.
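The per-leaf retime described in this comment could be sketched roughly as follows. This is only an illustration under assumptions: the timetable is synthetic, and the variable names (Section, Cavity, Location, Temperature) are stand-ins for the columns in the attached file. The idea is to number the leaves with findgroups, retime only the measurement variable of each leaf, and then re-attach the leaf identifiers.

```matlab
% Synthetic stand-in for the attached data; all names and values are illustrative.
Time        = seconds([0 6 12 18 24 30 0 6 12 18 24 30])';
Section     = ones(12,1);
Cavity      = [1 1 1 1 1 1 2 2 2 2 2 2]';
Location    = ones(12,1);
Temperature = [10 20 30 40 50 60 1 2 3 4 5 6]';
Ttt = timetable(Time, Section, Cavity, Location, Temperature);

% One group number per leaf (unique Section/Cavity/Location combination).
[g, sec, cav, loc] = findgroups(Ttt.Section, Ttt.Cavity, Ttt.Location);

leafTTs = cell(max(g), 1);
for k = 1:max(g)
    % Retime only the measurement variable of this leaf.
    leafTT = retime(Ttt(g == k, "Temperature"), 'regular', 'mean', ...
        'TimeStep', seconds(18));
    % Re-attach the leaf identifiers, which were not passed to retime.
    n = height(leafTT);
    leafTT.Section  = repmat(sec(k), n, 1);
    leafTT.Cavity   = repmat(cav(k), n, 1);
    leafTT.Location = repmat(loc(k), n, 1);
    leafTTs{k} = leafTT;
end

% Stack the per-leaf results back into a single timetable.
Trt = vertcat(leafTTs{:});
```

Because the grouping columns never go through the aggregation, they stay integer-valued, and each leaf is averaged only against its own samples.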


Accepted Answer

Eric Sofen on 29 Jul 2019
Edited: Eric Sofen on 29 Jul 2019
For synchronizing leaf-to-leaf, you will probably end up wanting to write a for-loop that loops over the leaf grouping variable, synchronizes each subset, and then uses vertcat to put them back together. Something like:
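dataVars1 = ["Temp", "Pressure"];
dataVars2 = ["Intensity", "Brightness"];
leaf = unique(tt1.Leaf);
leafTTs = cell(numel(leaf),1);
for k = 1:numel(leaf)
    % synchronize the subsets of both timetables that belong to the current leaf
    % (newtimestep and method stand for whatever synchronize options you need)
    leafTTs{k} = synchronize(tt1(tt1.Leaf == leaf(k),dataVars1), tt2(tt2.Leaf == leaf(k),dataVars2), newtimestep, method);
    % put the leaf label back, since only the data variables were synchronized
    leafTTs{k}.Leaf = repmat(leaf(k), height(leafTTs{k}), 1);
end
tt = vertcat(leafTTs{:});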
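The loop above assumes each timetable already has a single Leaf variable. One possible way to build it from several hierarchy columns, so that the same combination gets the same Leaf value in both timetables, is sketched below. The inputs are hypothetical (only Section and Cavity are used, with made-up values), and this is just one option, not necessarily how the original data is organized.

```matlab
% Hypothetical inputs: two timetables sharing Section and Cavity columns.
tt1 = timetable(seconds([0 1 0 1])', [1 1 1 1]', [1 1 2 2]', [20 21 30 31]', ...
    'VariableNames', {'Section','Cavity','Temp'});
tt2 = timetable(seconds([0 2 0 2])', [1 1 1 1]', [1 1 2 2]', [5 6 7 8]', ...
    'VariableNames', {'Section','Cavity','Intensity'});

% Number the leaves over the union of both timetables so that a given
% (Section, Cavity) pair maps to the same Leaf value in tt1 and tt2.
keys1   = [tt1.Section tt1.Cavity];
keys2   = [tt2.Section tt2.Cavity];
allKeys = unique([keys1; keys2], 'rows');

[~, leaf1] = ismember(keys1, allKeys, 'rows');
[~, leaf2] = ismember(keys2, allKeys, 'rows');
tt1.Leaf = leaf1;
tt2.Leaf = leaf2;
```

A leaf that appears in only one of the timetables still gets a number this way; the loop would then need to decide what to do when one of the two subsets is empty.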

Release

R2019a
