Using accumarray to organize large CSV
    3 visualizaciones (últimos 30 días)
  
       Mostrar comentarios más antiguos
    
I have a CSV with 4 columns, containing several thousands of rows of data. I have attached a sample data sheet with a small amount of data as an example. 
I'm trying to write a code to import the data in a way that will be easy to create scatter plots/visualize the data, as well as calculate some basic stats (mean, median, std. dev., etc).
I want to group the data based on TWO of the columns: SITE NUMBER and DEPTH
The other two columns, d18O and date are what I need to plot/visualize
Essentially, my goal is to have separate variables or columns in a table that show d18O and DATE for each SITE NUMBER and DEPTH
I started using accumarray to accomplish this but I am having issues with the arrays, they are not showing up with all of the unique rows of data
T = readtable("data1.csv", 'VariableNamingRule','preserve');
[G,ID] = findgroups(T.("Site number"));                                            
data = accumarray(G, T{:,4},[], @(x){T{x,:}});
0 comentarios
Respuesta aceptada
  dpb
      
      
 el 22 de Sept. de 2023
        
      Editada: dpb
      
      
 el 22 de Sept. de 2023
  
      tT=readtable("data1.csv");
tT.Properties.VariableNames(2)={'Site'};    % shorten to be more convenient to use
head(tT)
G=grpstats(tT,{'Site','Depth'},{'mean','median','std'},'DataVars',{'Date','d18O'})
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site','Depth'},'InputVariables',{'Date','d18O'},'OutputFormat','uniform');
function h=doit(x,y)
  figure
  h=scatter(x,y,'filled');
  xlabel('Date'),ylabel('d18O')
end
2 comentarios
  dpb
      
      
 el 22 de Sept. de 2023
				NOTA BENE that you can get much more creative with the user function...
tT=readtable("data1.csv");
tT.Properties.VariableNames(2)={'Site'};    % shorten to be more convenient to use
hSc=rowfun(@doit,tT,'GroupingVariables',{'Site','Depth'},'InputVariables',{'Date','d18O','Site','Depth'},'OutputFormat','uniform');
function h=doit(x,y,s,d)
  figure
  h=scatter(x,y,'filled');
  xlabel('Date'),ylabel('d18O')
  legend(compose('Site %3d Depth %3d',s(1),d(1)))
end
Más respuestas (1)
  Voss
      
      
 el 21 de Sept. de 2023
        
      Editada: Voss
      
      
 el 21 de Sept. de 2023
  
      "my goal is to have separate variables or columns in a table that show d18O and DATE for each SITE NUMBER and DEPTH"
Something like this?
T = readtable("data1.csv", 'VariableNamingRule','preserve');
disp(T);
gvars = ["Site number","Depth"];
vars = setdiff(T.Properties.VariableNames,gvars);
T_summary = groupsummary(T,gvars,@(x){x});
T_summary = removevars(T_summary,"GroupCount");
T_summary = renamevars(T_summary,"fun1_"+vars,vars)
T_summary.Date{1}
T_summary.d18O{1}
1 comentario
  Voss
      
      
 el 22 de Sept. de 2023
				Or maybe it's more convenient just to sort the table so that rows with the same Site number and Depth are together:
T = readtable("data1.csv", 'VariableNamingRule','preserve');
disp(T);
T_sorted = sortrows(T,["Site number","Depth"]);
disp(T_sorted);
Ver también
Categorías
				Más información sobre Tables en Help Center y File Exchange.
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!














