For loop only working/filling cell array for half of data

Claudia
Claudia on 11 Nov 2022
Edited: Karim on 12 Nov 2022
I am trying to use a for loop to fill a cell array containing tables with various statistics (e.g. mean, median ...) for sites within a large dataset.
The aim is to end up with a cell array 1x42, with a table for each variable.
The loop seems to work only for the first 16 variables. The remaining tables are empty. However, if I run the same loop specifying a single variable (e.g. i = 20), the code works and the output is a filled table.
Code and input data are attached.
clear variables; clc; load x.mat;
for i = 1:(size(x,2))
x = x(~isnan(table2array(x(:,i))),:);
[site_num,ia,obs_count] = unique(x.site_num,'sorted');
ans_mean = accumarray(obs_count,table2array(x(:,i)),[],@(x)mean(x,'omitnan')); ans_mean = [array2table(ans_mean)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_mean.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_mean = renamevars(ans_mean,'ans_mean',header);
ans_median = accumarray(obs_count,table2array(x(:,i)),[],@(x)median(x,'omitnan')); ans_median = [array2table(ans_median)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_median.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_median = renamevars(ans_median,'ans_median',header);
ans_std = accumarray(obs_count,table2array(x(:,i)),[],@(x)std(x,'omitnan')); ans_std = [array2table(ans_std)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_std.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_std = renamevars(ans_std,'ans_std',header);
ans_lq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.25)); ans_lq = [array2table(ans_lq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_lq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_lq = renamevars(ans_lq,'ans_lq',header);
ans_uq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.75)); ans_uq = [array2table(ans_uq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_uq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_uq = renamevars(ans_uq,'ans_uq',header);
obs_count = array2table(accumarray(obs_count,1)); txt1 = x(:,i).Properties.VariableNames; header = strcat(txt1,{'_'},{'obs_count'}); obs_count = renamevars(obs_count,'Var1',header);
all{i} = [array2table(site_num) ans_mean ans_median ans_std ans_lq ans_uq obs_count];
end
Any thoughts/help/tips would be greatly appreciated! Thank you!
Apologies if my code is quite inefficient, I'm still in the learning process :)
  2 comments
Claudia
Claudia on 11 Nov 2022
Thanks so much for the tip Stephen! I will make sure to do that in the future :)


Accepted Answer

Karim
Karim on 11 Nov 2022
Edited: Karim on 12 Nov 2022
One issue is the reuse of the variable name "x": directly after entering the loop, you overwrite your original data by removing the rows that contain a NaN in the current column. Those rows are then missing for every later column, so after a few iterations no data is left.
It's better to create a temporary variable (I called it "currData") to hold the data you are working on in the current iteration. I shortened the code a bit and added a few comments.
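To make the effect concrete, here is a minimal sketch on a hypothetical two-column toy table (not the attached x.mat; the names T, S, rowsLeft and validCount are made up for illustration). It contrasts overwriting the table inside the loop with filtering a temporary copy of the column.

```matlab
% Toy table (hypothetical data): column a is NaN in rows 1-2,
% column b is NaN in rows 3-4.
T = table([NaN; NaN; 3; 4], [1; 2; NaN; NaN], 'VariableNames', {'a','b'});

% Pattern from the question: the table itself is overwritten each iteration.
S = T;
rowsLeft = zeros(1, width(S));
for k = 1:width(S)
    S = S(~isnan(S{:,k}), :);   % rows dropped here are gone for later columns
    rowsLeft(k) = height(S);
end
% rowsLeft is [2 0]: by the second column no rows remain.

% Non-destructive pattern: filter a temporary copy of the column instead.
validCount = zeros(1, width(T));
for k = 1:width(T)
    col = T{:,k};                        % temporary copy, T stays intact
    validCount(k) = nnz(~isnan(col));
end
% validCount is [2 2]: every column still sees all of its valid rows.
```

With the real data the same thing happens more gradually, which would explain why only the first columns produced filled tables.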
% load mat file
load(websave('myFile', "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1189013/x.mat"));
% allocate a cell array for the output data
AllData = cell(1,size(x,2));
for i = 1:size(x,2)
    % extract data for current loop, and convert to array
    % EDIT: included Stephen23's proposal to extract the data
    currData = x{:,i};
    % figure out which values are a number
    NumIdx = ~isnan( currData );
    % only keep the numbers for further processing
    currData = currData(NumIdx);
    % get the sorted unique "site num" values for the rows kept above,
    % plus a group index (obs_count) for each kept row
    [site_num,~,obs_count] = unique(x.site_num(NumIdx) ,'sorted');
    % get the name of the current variable
    currVarName = x(:,i).Properties.VariableNames + "_";
    % do the processing (v is the data of one site)
    ans_mean = accumarray(obs_count,currData,[],@(v)mean(v,'omitnan'));
    ans_median = accumarray(obs_count,currData,[],@(v)median(v,'omitnan'));
    ans_std = accumarray(obs_count,currData,[],@(v)std(v,'omitnan'));
    ans_lq = accumarray(obs_count,currData,[],@(v)quantile(v,0.25));
    ans_uq = accumarray(obs_count,currData,[],@(v)quantile(v,0.75));
    % create the table names for the current variable
    varNames = [ currVarName + "site_num";
                 currVarName + "ans_mean";
                 currVarName + "ans_median";
                 currVarName + "ans_std";
                 currVarName + "ans_lq";
                 currVarName + "ans_uq";
                 currVarName + "obs_count" ];
    % gather the data in a table
    currTable = table(site_num, ans_mean, ans_median, ans_std, ans_lq, ans_uq, accumarray(obs_count,1),...
        'VariableNames',varNames);
    % store the table in the output cell array
    AllData{i} = currTable;
end
% have a look at the data in the output cell
AllData
AllData = 1×42 cell array
{130×7 table} {130×7 table} {130×7 table} {130×7 table} {92×7 table} {76×7 table} {57×7 table} {104×7 table} {99×7 table} {53×7 table} {130×7 table} {104×7 table} {67×7 table} {102×7 table} {98×7 table} {45×7 table} {26×7 table} {26×7 table} {18×7 table} {68×7 table} {81×7 table} {69×7 table} {27×7 table} {62×7 table} {9×7 table} {0×7 table} {66×7 table} {66×7 table} {65×7 table} {65×7 table} {15×7 table} {8×7 table} {46×7 table} {27×7 table} {51×7 table} {29×7 table} {51×7 table} {29×7 table} {29×7 table} {17×7 table} {16×7 table} {9×7 table}
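The output above still contains one {0×7 table} (position 26), presumably because that column has no non-NaN values at all. A minimal sketch for locating such empty tables, assuming a hypothetical toy cell array named demo in place of the real AllData:

```matlab
% Hypothetical stand-in for AllData: two filled tables and one empty one.
demo = {table([1;2], [0.5;0.7], 'VariableNames', {'site_num','v_mean'}), ...
        table(zeros(0,1), zeros(0,1), 'VariableNames', {'site_num','w_mean'}), ...
        table(3, 0.9, 'VariableNames', {'site_num','u_mean'})};

% Logical mask of the cells whose table has zero rows.
isEmptyTbl = cellfun(@(t) height(t) == 0, demo);

% Indices of the empty tables; for this toy input the result is 2.
emptyIdx = find(isEmptyTbl);
```

Running the same two lines on AllData should point at the columns that need a closer look.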
  1 comment
Claudia
Claudia on 11 Nov 2022
Thank you soooooo much Karim! You have literally saved the day :)
Thank you for your detailed, thoughtful and super helpful answer! I really appreciate it!



Release

R2022a
