Most efficient way to rename struct fields according to a map?

Aditya on 16 Sep 2016
Answered: Steven Lord on 16 Sep 2016
Hi, I am importing ASCII data from multiple sources into MATLAB structs. Each data source uses a different naming convention for the same parameters, so I need to make sure the imported struct data uses a standardized naming convention for fields. For example, one set of data might look like:
data1 =
a1: [1x500 double]
b1: [1x500 double]
And another set of data might look like:
data2 =
a2: [1x500 double]
b2: [1x500 double]
I want to convert them both to standardized field names 'a' and 'b':
data1 =
a: [1x500 double]
b: [1x500 double]
data2 =
a: [1x500 double]
b: [1x500 double]
I have written a small function shown below that does this field replacement:
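function s = repfield(s,oldList,newList)
% Rename fields of s: a field named in oldList{i} is renamed to newList{i}.
f = fieldnames(s);
v = struct2cell(s);
for i = 1:length(oldList)
    % Find which current field names (if any) appear in the i-th alias list
    [c,i1] = intersect(f,oldList{i});
    if isempty(c)
        continue
    elseif length(c) > 1
        error('Non-unique field mapping found')
    else
        f{i1} = newList{i};
    end
end
% Rebuild the struct from the unchanged values and the updated names
s = cell2struct(v,f);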
So I can call
>> oldList = {{'a1','a2'},{'b1','b2'}};
>> newList = {'a','b'};
>> data1 = repfield(data1,oldList,newList)
data1 =
a: [1x500 double]
b: [1x500 double]
The code is simple enough and works pretty well. But the data I'm handling is rather large and even the maps become quite long (e.g. the variables "oldList" and "newList" in the above function call would be hundreds of elements long). So I was wondering, is this already the most efficient way or does anything else exist, for example by using containers.Map maybe? I have no experience in data handling, but this sounds to me like a common enough problem that should have an efficient solution.
Thanks!
3 comments
Stephen23 on 16 Sep 2016
Edited: Stephen23 on 16 Sep 2016
"this sounds to me like a common enough problem that should have an efficient solution"
Rather than importing into lots of named variables (almost always a very bad idea), it would likely be much more efficient to name the data correctly in the first place, while it is being loaded. Load into a variable, and merge that into one non-scalar structure, making any adjustments to the names before merging.
Aditya on 16 Sep 2016
Hey, I get your point about using correct names during importing, rather than after. I've been using the default MATLAB functions such as readtable, which directly imports the data by using the header line in the ASCII file for field names. But I guess I can change that behavior by writing my own import function. Thanks!
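One way to standardize the names at import time without a fully custom import function might be to rename the table variables right after readtable returns. The following is only a sketch: the contents of nameMap are hypothetical, and the table T is built inline as a stand-in for the result of readtable so that the snippet runs without a data file.

```matlab
% Hypothetical map from source-specific header names to standard names.
nameMap = containers.Map({'a1','a2','b1','b2'}, {'a','a','b','b'});

% Stand-in for T = readtable(...); built inline so the sketch is runnable.
T = table((1:5)', (6:10)', 'VariableNames', {'a1','b1'});

vars  = T.Properties.VariableNames;
known = isKey(nameMap, vars);          % logical mask: headers present in the map
if any(known)
    vars(known) = values(nameMap, vars(known));
    T.Properties.VariableNames = vars; % headers not in the map are left as they are
end

% Convert to a scalar struct with one field per (renamed) column.
data1 = table2struct(T, 'ToScalar', true);
```

Because the renaming happens on the table's variable names, the column data itself is never touched; only a short cell array of names is rewritten per file.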

Answers (2)

Jan on 16 Sep 2016
You can try the fast C-Mex for renaming fields: FEX: RenameField

Steven Lord on 16 Sep 2016
If you're operating on scalar struct arrays, use dynamic field names.
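function newS = canonicalNames(oldS, oldNames, newNames)
% Copy field oldNames{k} of oldS into newS under the name newNames{k}.
% Fields of oldS that are not listed in oldNames are not copied.
newS = struct;
for whichField = 1:numel(oldNames)
    oldN = oldNames{whichField};
    if ~isfield(oldS, oldN)
        error('Field %s does not exist in old struct array!', oldN);
    end
    newN = newNames{whichField};
    newS.(newN) = oldS.(oldN);   % dynamic field names
end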
I haven't tested this but it should work. Add additional error checking as desired: that oldS is in fact a scalar struct array, that both oldNames and newNames are cell arrays, that they have the same number of elements, that the names in newNames are valid struct field names, etc.
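The error checks listed above, combined with the containers.Map lookup raised in the question, could be sketched roughly as follows. This is an illustration rather than a tested implementation: the function name canonicalNamesMap is hypothetical, and two behaviors are assumptions that differ from canonicalNames, namely that old names missing from the struct are skipped instead of raising an error (so one long map can serve several data sources) and that fields not in the map are copied through unchanged.

```matlab
function newS = canonicalNamesMap(oldS, oldNames, newNames)
% Hypothetical variant of canonicalNames: validate inputs, then rename
% fields of a scalar struct via a containers.Map lookup.
if ~(isstruct(oldS) && isscalar(oldS))
    error('canonicalNamesMap:input', 'oldS must be a scalar struct.');
end
if ~(iscellstr(oldNames) && iscellstr(newNames))
    error('canonicalNamesMap:input', 'Name lists must be cell arrays of strings.');
end
if numel(oldNames) ~= numel(newNames)
    error('canonicalNamesMap:input', 'Name lists must have the same length.');
end
if ~all(cellfun(@isvarname, newNames))
    error('canonicalNamesMap:input', 'newNames contains an invalid field name.');
end
if isempty(oldNames)
    newS = oldS;                                % nothing to rename
    return
end
nameMap = containers.Map(oldNames, newNames);   % old name -> new name
f = fieldnames(oldS);
newS = struct;
for k = 1:numel(f)
    if isKey(nameMap, f{k})
        target = nameMap(f{k});                 % mapped: use the standard name
    else
        target = f{k};                          % unmapped: keep the name
    end
    if isfield(newS, target)
        error('canonicalNamesMap:duplicate', 'Two fields map to "%s".', target);
    end
    newS.(target) = oldS.(f{k});
end
end
```

With flat lists such as oldNames = {'a1','a2','b1','b2'} and newNames = {'a','a','b','b'}, a call like data1 = canonicalNamesMap(data1, oldNames, newNames) would do one map lookup per field of the struct, so the cost per struct depends on its number of fields rather than on the length of the map.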
