MATLAB Answers

Labelling columns of large array in a searchable way?

Jack Bray on 20 Mar 2021
Commented: Jack Bray on 22 Mar 2021
I'm looking for some advice on working with large datasets.
I am trying to label individual strips of continuous data so that the label can then be used to group or sort by a specific tag: 1, 2 or 3.
Currently, for each dataset, I reshape the data into an array (800x43200), generate variable names containing the tag (1x43200 cell), make a table, and save it as a txt file.
Then, if I need all the columns tagged 1 from all datasets, I have to read each table, loop over the variable names with str2num to parse out the tag, and use that to gather the correct columns.
This doesn't seem like the best way of doing it. I thought perhaps I should be using tabularText datastores or tall tables, but these don't seem to help with sorting/averaging by specific tags.
Any advice you can offer to point me in the right direction would be greatly appreciated.
Jack Bray on 20 Mar 2021
Thanks for the quick replies. Sorry I wasn't clearer; I'll try to explain what I mean in MATLAB:
% each dataset starts as one long column, I have over 1000 datasets
% currently it looks something like this:
for kk = 1:numel(datasets)
    rawdata = load(datasets{kk});            % rawdata = 1x34560000 double
    epcdata = reshape(rawdata,800,43200);
    tag = zeros(1,43200);                    % tag = 1x43200 double
    for ii = 1:43200
        % tag(ii) = ... use data to get tag: 1, 2 or 3
    end
    vnames = cell(1,43200);
    for ii = 1:43200
        vnames{ii} = [num2str(ii) '_' num2str(tag(ii))];
    end
    T = array2table(epcdata,'VariableNames',vnames);
end
% This is all so I can search each dataset for specific tags, like this for tag = 1:
for kk = 1:numel(Tables)
    T = readtable(Tables{kk});
    for ii = 1:43200
        tag(1,ii) = str2double(T.Properties.VariableNames{ii}(end));
    end
    tag1mean(:,kk) = mean(T{:,tag == 1},2);  % braces extract the numeric data
end
Using the table's variable names as searchable labels seems silly, but I can't figure out how something like this should be done.


Accepted Answer

Seth Furman on 22 Mar 2021
table supports custom metadata properties.
In your case, you could add a "tag" custom variable property to T as in the following example.
rng default
rawdata = randi(100,1,34560000); % rawdata = 1x34560000 double
epcdata = reshape(rawdata,800,43200);
vnames = string(1:43200);
T = array2table(epcdata,'VariableNames',vnames);
tag = randi(3,1,43200);
T = addprop(T,"tag","variable");
T.Properties.CustomProperties.tag = tag;
Now the "tag" and variable name properties are distinct:
>> T(1:5,1:5)
ans =
  5×5 table
    1     2     3     4     5
    __    __    __    __    __
    82    69    68    82    25
    91    14    44    19    39
    13    73    70    13    44
    92    12    26    83    84
    64    12     1    64    83
>> T.Properties.CustomProperties.tag(1:5)
ans =
     3     3     1     1     1
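As a rough sketch of how the custom property could then replace the name parsing in the question, the tag vector can be compared directly to build a logical column mask. The table here is shrunk to 8x12 purely for illustration, and the names isTag1 and tag1mean are made up for this example:

```matlab
% Small stand-in for the 800x43200 table (sizes reduced for illustration)
rng default
epcdata = randi(100, 8, 12);
T = array2table(epcdata, 'VariableNames', string(1:12));
T = addprop(T, "tag", "variable");
T.Properties.CustomProperties.tag = randi(3, 1, 12);

% Logical mask over the columns, read straight from the custom property
isTag1 = T.Properties.CustomProperties.tag == 1;

% Brace indexing extracts the numeric data of the selected columns;
% averaging along dimension 2 gives one value per row
tag1mean = mean(T{:, isTag1}, 2);
```

No loop over the variable names and no string conversion is needed, because the tags stay numeric the whole time.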
Note that you will have to write your table to a MAT file instead of a text file in order to preserve the custom property you added.
save T.mat T
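Gathering one tag across many saved tables might then look something like the sketch below. It is only an illustration under assumptions: the file names (tagdemo_1.mat, ...), the reduced 8x12 size, and the variables wanted and tagMeans are invented for the example, and the two files are written to tempdir first so the snippet is self-contained.

```matlab
% Write two small example tables to temporary MAT files
% (hypothetical stand-ins for the real datasets)
nFiles = 2;
files = cell(1, nFiles);
for kk = 1:nFiles
    T = array2table(rand(8, 12), 'VariableNames', string(1:12));
    T = addprop(T, "tag", "variable");
    T.Properties.CustomProperties.tag = randi(3, 1, 12);
    files{kk} = fullfile(tempdir, sprintf('tagdemo_%d.mat', kk));
    save(files{kk}, 'T');
end

% Collect the row-wise mean of all columns carrying the wanted tag
wanted = 1;
tagMeans = nan(8, nFiles);
for kk = 1:nFiles
    S = load(files{kk}, 'T');                          % table plus its custom property
    mask = S.T.Properties.CustomProperties.tag == wanted;
    tagMeans(:, kk) = mean(S.T{:, mask}, 2);           % NaN column if no match
end
```

Each file still has to be loaded, but the tag comes back as a numeric vector, so selecting columns is a single comparison.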
Please let me know if this meets your use case.
  1 Comment
Jack Bray on 22 Mar 2021
Thank you for your answer. I think I can use this approach to make my code much more efficient.
You saved me a lot of hassle, as I was just about to attempt to convert it all into HDF5 and use the attributes as a custom tag, but using tables will be much more convenient.
Thanks again!


More Answers (1)

Jan on 20 Mar 2021
vnames{ii} = [num2str(ii) '_' num2str(tag(ii))];
This hides the tags in the names of the variables. Encoding them this way means you then need even more complicated code to extract the tags later.
Store the tags as numbers instead, e.g. as an additional column.
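One possible reading of this suggestion, sketched under assumptions: keep the numeric matrix and a numeric tag vector as two separate variables in the same MAT file. The file name epc_demo.mat, the reduced 8x12 size, and the variable names are hypothetical.

```matlab
% Save the data and its tags side by side as plain numeric variables
epcdata = rand(8, 12);            % stand-in for the 800x43200 array
tag = randi(3, 1, 12);            % one numeric tag per column
fname = fullfile(tempdir, 'epc_demo.mat');
save(fname, 'epcdata', 'tag');

% Later: load only the tag vector to find the matching columns
S = load(fname, 'tag');
cols = find(S.tag == 1);

% Then load the data and pick those columns
D = load(fname, 'epcdata');
tag1data = D.epcdata(:, cols);
```

Naming a variable in the load call reads just that variable from the file, so the tags can be inspected without parsing anything out of text.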
  1 Comment
Jack Bray on 20 Mar 2021
Thanks for this. I had thought about adding an extra row for tags and just using the numbers, but it still requires reading each table in order to search for specific tags. Maybe this is just the easiest way to do it.
I just imagined there might be an easier way of organising/searching this kind of columnar data, perhaps using datastores or tall tables.
