How to filter a structure based on part of nested field names

Hello everyone,
I am trying to find a way to filter a large dataset. This dataset is stored as a structure with many nested structures.
Let's say my main structure is S.
Inside S are 1000 structures with field names 'A_1....A_1000'
Inside 'A_i' are between 2 and 3 structures. 'A_i' has one structure with a name of this nature: D121406.
I need to find a way to filter away any structure 'A_i' that does not have a nested structure with a field name that begins with D10 or D11 or D12...D18.
(For example: D07XXXX needs to be removed, D19XXXX needs to be removed, D11XXXX is accepted).
I am definitely a rookie when it comes to MATLAB structures. I have achieved filtering at the level of the 1000 fields, but cannot figure out how to make MATLAB go a step further into each of the 1000 A_i and filter out those not containing a nested structure with a name in the range D10-D18.
Any guidance and/or help is much appreciated :) I've attached a photo to help show where in the structure I am trying to perform the filter.
(In this photo, A_1 is accepted and A_1000 is removed.)
2 Comments
Stephen23 on 5 Oct 2020
Rather than using a scalar structure with lots of numbered fieldnames, most likely your task would be a lot simpler and more efficient if you used a non-scalar structure instead (with just one/two/a few fieldnames):
Then you could use indexing and comma-separated lists to access the structure content:
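The code that accompanied this comment is not preserved here. As a rough sketch of the idea only (the field names `Name` and `Data` and the sample values are assumptions made up for illustration), a non-scalar structure combined with indexing and comma-separated lists might look like this:

```matlab
% Hypothetical 1x3 structure array; field names and values are made up.
S = struct('Name', {'D121406', 'D071234', 'D151234'}, ...
           'Data', {rand(3), rand(3), rand(3)});

% Indexing selects an element directly; no dynamic field names are needed.
secondName = S(2).Name;

% S.Name expands to a comma-separated list: S(1).Name, S(2).Name, S(3).Name.
% Wrapping it in braces collects that list into a cell array.
allNames = {S.Name};

% Keep only the elements whose name starts with D10 ... D18.
keep = ~cellfun(@isempty, regexp(allNames, '^D1[0-8]', 'once'));
S = S(keep);
```

With this layout the filter is a single logical index into `S`, rather than a loop over generated field names.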
Ryan Baker on 5 Oct 2020
In my Google searches for topics like mine I came across several comments suggesting comma-separated lists. I wasn't sure if it was exactly what I would need, but I will absolutely follow your advice and look into it a lot further :) Thank you!


Accepted Answer

J. Alex Lee on 5 Oct 2020
It's not 100% clear where you are stuck, but it sounds like knowing about "fieldnames()" will help.
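As a minimal sketch of how "fieldnames()" could be applied to the layout described in the question, the loop below walks the outer fields, inspects the nested field names, and removes any A_i without a match. The stand-in data and the field name `Info` are assumptions for illustration, not the asker's actual data:

```matlab
% Small stand-in for the real data; Info and the D... names are assumptions.
S = struct();
S.A_1.Info = 'first';    S.A_1.D121406 = struct('x', 1);
S.A_2.Info = 'second';   S.A_2.D071234 = struct('x', 2);
S.A_3.Info = 'third';    S.A_3.D191234 = struct('x', 3);
S.A_4.Info = 'fourth';   S.A_4.D101234 = struct('x', 4);

outerNames = fieldnames(S);                      % {'A_1'; 'A_2'; ...}
for k = 1:numel(outerNames)
    innerNames = fieldnames(S.(outerNames{k}));  % names nested inside A_k
    % true where a nested name starts with D10, D11, ..., D18
    isWanted = ~cellfun(@isempty, regexp(innerNames, '^D1[0-8]', 'once'));
    if ~any(isWanted)
        S = rmfield(S, outerNames{k});           % drop A_k entirely
    end
end
```

Here A_2 (D07...) and A_3 (D19...) are removed, while A_1 and A_4 remain.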
For data structure design, it seems to me a better scheme would be to make S into a structure array so that you can simply index into the element rather than access a named element where the name is basically an index.
Further, rather than name the actual data set (DXXXXXX), why not create another field to hold the name of the data (in addition to "info" and the data itself) and call the data field what it is generically: Data.
S(234).Name % would return, say D121406
S(234).Info % would be whatever meta data you have
S(234).Data % the actual data
Then you can generate the [ordered] list of data names as
vertcat(S.Name)
This will be more convenient if you are using the newer MATLAB "string" data type (rather than char arrays).
Then you can filter the resulting string array using something like regexp.
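One way the regexp filter might look, as a sketch: the names below are hypothetical, and the pattern `^D1[0-8]` is an assumption based on the D10 to D18 range given in the question.

```matlab
% Hypothetical names; in practice these would come from vertcat(S.Name).
AllNames = ["D151234"; "D071234"; "D201234"; "D121234"; "D101234"];

% With 'once', regexp returns a cell array holding the match start index
% for each name, or [] where the name does not match.
startIdx = regexp(AllNames, '^D1[0-8]', 'once');
mask = ~cellfun(@isempty, startIdx);

% Logical indexing keeps only the matching names.
ViableNames = AllNames(mask);
```

The same `mask` can be used to index the structure array itself, for example `S(mask)`.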
Alternatively you can pre-digest your variable name "DXXYYYY" into Meta fields
S(234).ID = XX
S(234).SubID = YYYY
So you can do number operations on the field ID like
[S.ID] > 9 & [S.ID] < 19
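A sketch of this pre-digesting step, assuming every name has the form DXXYYYY (the sample names are made up):

```matlab
% Hypothetical names of the form DXXYYYY.
names = ["D151234", "D071234", "D201234", "D121234"];

% Split each name into numeric ID (XX) and SubID (YYYY) fields.
% Looping backwards allocates the whole structure array on the first pass.
for k = numel(names):-1:1
    S(k).Name  = names(k);
    S(k).ID    = str2double(extractBetween(names(k), 2, 3));
    S(k).SubID = str2double(extractBetween(names(k), 4, 7));
end

% [S.ID] concatenates the comma-separated list into a numeric row vector.
mask = [S.ID] > 9 & [S.ID] < 19;
Filtered = S(mask);
```

Storing the ID as a number means later filters are plain comparisons and need no string parsing.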
5 Comments
J. Alex Lee on 5 Oct 2020
Regarding your later comments, this example might help after you convert to a structure array
% create a new structure array of length 5
% there can be other fields in each element, but for now just have Name
S(5).Name = "D101234";
S(4).Name = "D121234";
S(3).Name = "D201234";
S(2).Name = "D071234";
S(1).Name = "D151234";
% create a string array of all variable names
AllNames = vertcat(S.Name)
% Assuming all Names have the same format DXXYYYY
% extract XX as a number
digitsToFilter = arrayfun(@(s)str2double(s.extractBetween(2,3)),AllNames)
% "filter" on the digits XX
mask = digitsToFilter < 19 & digitsToFilter > 9
% return the subset
ViableS = S(mask)
Regarding getting your data into a more suitable form first, maybe this will work
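for i = 1000:-1:1
    % pull out the contents of the field whose name is really an index
    tmp = S.(sprintf("A_%d",i));
    % find the field name that contains the name of your variable
    tmpFieldNames = fieldnames(tmp);
    % some logic to determine which field name you want; this assumes
    % exactly one nested field name starts with "D" (adapt to your data)
    idx = find(startsWith(tmpFieldNames,"D"),1);
    % store under generic field names so every element has the same fields
    % (assigning the raw A_i structs directly would fail, because their
    % field names differ from element to element)
    SNew(i).Name = string(tmpFieldNames{idx});
    SNew(i).Data = tmp.(tmpFieldNames{idx});
    % copy any other fields (such as the info field) the same way
end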
Ryan Baker on 5 Oct 2020
Woah! I have never used logical arrays; that is definitely a useful tool :) I have spent the last hour going through each part of each line, making sure I understand the rationale and function. It is tremendously helpful and I will try applying it to my actual dataset!
I am still a bit worried about how I'll retrieve all the DXXYYYY fields and create an array with them, as they are individually nested within the 'name', but I am certain the method you provided is where I need to be headed. Now to work on adapting it to my dataset :)
Thanks again J. Alex Lee...you're fantastic!


Release

R2018b
