Fast subsetting or indexing of data

Louise Wilson on 29 Sep 2020
Commented: Rik on 30 Sep 2020
I am working with large datasets which I am subsetting into various categories and saving as smaller files. What I am doing right now works, but it is quite time-consuming and error-prone, as it involves a lot of copy and paste.
For example, I have many files which I have split into those with boats and those without boats. I then split those by season. Would there be a faster way to do this, where I apply the same command to a prescribed set of variables?
%% Comparisons... Season using water temp
boatsAbsent_t=boatsAbsent.Var1; %time variables
[BA_spring, BA_summer, BA_autumn, BA_winter]=indexSeasons(boatsAbsent_t); %index times into seasons
boatsPresent_t=boatsPresent.Var1;
[BP_spring, BP_summer, BP_autumn, BP_winter]=indexSeasons(boatsPresent_t);
%Subset PSD outputs and write to file
S=withtol(BA_spring,seconds(1));
BA_spring=boatsAbsent(S,:);
writetable(timetable2table(BA_spring),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Spring.csv')));
S=withtol(BA_summer,seconds(1));
BA_summer=boatsAbsent(S,:);
writetable(timetable2table(BA_summer),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Summer.csv')));
S=withtol(BA_autumn,seconds(1));
BA_autumn=boatsAbsent(S,:);
writetable(timetable2table(BA_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Autumn.csv')));
S=withtol(BA_winter,seconds(1));
BA_winter=boatsAbsent(S,:);
writetable(timetable2table(BA_winter),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Winter.csv')));
S=withtol(BP_spring,seconds(1));
BP_spring=boatsPresent(S,:);
writetable(timetable2table(BP_spring),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Spring.csv')));
S=withtol(BP_summer,seconds(1));
BP_summer=boatsPresent(S,:);
writetable(timetable2table(BP_summer),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Summer.csv')));
S=withtol(BP_autumn,seconds(1));
BP_autumn=boatsPresent(S,:);
writetable(timetable2table(BP_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Autumn.csv')));
S=withtol(BP_winter,seconds(1));
BP_winter=boatsPresent(S,:);
writetable(timetable2table(BP_winter),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Winter.csv')));
3 comments
Stephen23 on 29 Sep 2020
Meta-data is data, and data does not belong in variable names! Sticking meta-data into variable names, e.g. the season names:
BA_spring, BA_summer, BA_autumn, BA_winter
means that you force yourself into writing slow, inefficient code or doing lots of copy-and-paste. Rik correctly recommends that you put all of your data in arrays, rather than splitting it into separate variables.
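One way to act on this advice is to keep the season as a column of the timetable, so that a single loop handles every season. The sketch below is only an illustration: the sample data, the PSD and Season column names, and the month-based season rule are all assumptions (the original indexSeasons function works from water temperature and is not shown in this thread).

```matlab
% Hypothetical sample data: one row per day of 2020 with a made-up PSD column.
t = (datetime(2020,1,1):caldays(1):datetime(2020,12,31))';
boatsAbsent = timetable(t, rand(numel(t),1), 'VariableNames', {'PSD'});

% Assumed season rule, purely for illustration: Dec-Feb is 'Summer',
% Mar-May 'Autumn', Jun-Aug 'Winter', Sep-Nov 'Spring'.
seasonNames = {'Summer','Autumn','Winter','Spring'};
m = month(boatsAbsent.Properties.RowTimes);
seasonIdx = floor(mod(m,12)/3) + 1;          % maps months to 1..4

% Store the season as data (a categorical column), not in a variable name.
boatsAbsent.Season = categorical(reshape(seasonNames(seasonIdx),[],1), seasonNames);

% One loop now covers every season.
for s = categories(boatsAbsent.Season)'
    part = boatsAbsent(boatsAbsent.Season == s{1}, :);
    fprintf('%s: %d rows\n', s{1}, height(part));
end
```

The fprintf call is a stand-in; in the original workflow it would be replaced by the writetable call, with s{1} used to build the file name.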
Louise Wilson on 29 Sep 2020
Awesome, this helps a lot, thank you Stephen! I am glad I asked.


Accepted Answer

Rik on 29 Sep 2020
Whenever you find yourself copy-pasting code in MATLAB, you should consider using an array.
seasons={'Spring','Summer','Autumn','Winter'};
boatsPresent_t=boatsPresent.Var1; %time variables
boatsAbsent_t=boatsAbsent.Var1; %time variables
BP=cell(1,4);BA=cell(1,4);
[BP{:}]=indexSeasons(boatsPresent_t); %index times into seasons
[BA{:}]=indexSeasons(boatsAbsent_t); %index times into seasons
for n=1:numel(seasons)
    S=withtol(BP{n},seconds(1));
    BP_part=boatsPresent(S,:);
    writetable(timetable2table(BP_part),...
        fullfile(folder,strcat(site,'_PSD_boatsPresent_',seasons{n},'.csv')));
    S=withtol(BA{n},seconds(1));
    BA_part=boatsAbsent(S,:);
    writetable(timetable2table(BA_part),...
        fullfile(folder,strcat(site,'_PSD_boatsAbsent_',seasons{n},'.csv')));
end
If you have more states than just present and absent you should consider putting those states in an array so you can use it to generate logical indices.
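As a rough sketch of that last suggestion, the states and the seasons can both live in columns of one table, and a logical index then picks out each combination. Everything here is made up for illustration: the table T, its State, Season and PSD columns, the third state 'Anchored', and the output file names.

```matlab
% Hypothetical sample data; column names and the 'Anchored' state are assumptions.
states  = {'Absent','Present','Anchored'};
seasons = {'Spring','Summer','Autumn','Winter'};
n = 24;
T = table(categorical(states(mod(0:n-1,3)+1)', states), ...
          categorical(seasons(mod(0:n-1,4)+1)', seasons), ...
          (1:n)', 'VariableNames', {'State','Season','PSD'});

outDir = tempname;            % scratch folder for the example output
mkdir(outDir);
counts = zeros(numel(states), numel(seasons));
for i = 1:numel(states)
    for j = 1:numel(seasons)
        % Logical index selecting one state/season combination.
        idx = T.State == states{i} & T.Season == seasons{j};
        counts(i,j) = nnz(idx);
        writetable(T(idx,:), fullfile(outDir, ...
            sprintf('PSD_boats%s_%s.csv', states{i}, seasons{j})));
    end
end
```

Adding a new state or season then only means extending the states or seasons list; the loop body stays the same.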
5 comments
Louise Wilson on 30 Sep 2020
I think my issue is just how to name the variable the table is being stored in? BP.seasons{n} doesn't work.
Rik on 30 Sep 2020
If you want to have a dynamic field name you need to use this syntax:
name='foo';
S.(name)='bar';
But what is wrong with the code you posted? You shouldn't be storing data (i.e. the season) in a variable name. If you do, that will cause the same issue every time you want to use the variables.
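For completeness, here is a small sketch of how the dynamic field name syntax would look inside a loop over the seasons, next to the cell-array alternative that keeps the season as data. The values cell array is a hypothetical stand-in for the per-season subsets.

```matlab
seasons = {'Spring','Summer','Autumn','Winter'};
values  = {1:3, 4:6, 7:9, 10:12};   % hypothetical stand-ins for the season subsets

% Dynamic field names: creates BP.Spring, BP.Summer, BP.Autumn, BP.Winter.
BP = struct();
for n = 1:numel(seasons)
    BP.(seasons{n}) = values{n};
end

% Alternative that keeps the season as data: a cell array indexed like seasons.
BPcell = values;
same = isequal(BP.(seasons{2}), BPcell{2});   % both hold the 'Summer' subset
```

Note that BP.seasons{n} fails because MATLAB reads seasons as a literal field name; the parentheses in BP.(seasons{n}) are what make the name dynamic.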


Release: R2020a
