Restructure table and add missing values

7 visualizaciones (últimos 30 días)
Anna Marie Ingenrieth
Anna Marie Ingenrieth el 31 de Oct. de 2019
Comentada: Maadhav Akula el 8 de Nov. de 2019
I want to write an program, that changes the structure of a table from
T1 = {[name1], [year1], [value] ; [name1], [year2], [value] ; [name2], [year2],[value] ; [name3], [year1], [value]}
to
T2 ={[name], [year], [value],[year2], [value]; [name2], [year1],[value],[year2] [value]; [name3], [year1], [value], [year2], [value]}.
The not existing years of T1 should be filled with 0 in value.
For the first name I reach the structure, but acutally the program does not come to an end and gives an error when it reaches name2. Can anybody give me an hit? Thanks a lot!
function fin = buildbig()
%% load table
in = readtable('chemistryunordered.csv');
i = 4; %table starts with 4
new = [];
joined = [];
fin = [];
[row, column] = size(in);
while i<row-3 % table end at row-3
j = i;
l = 0;
%% Countes how many times a university is in
while strcmp(table2cell(in(i,1)), table2cell(in(i+1,1)))
l = l+1;
i = i+1;
end
i = j;
k = 0;
%% if uni does not have a ranking for under 20 years add all zeros
if ~(strcmp(table2cell(in(i,2)), '20 years and under'))
name = in(i,1);
joined = [name,{'20 years and under'},{'0'},{'0'},{'0'},{'0'},{'0'},{'0'},{'0'}];
joined.Properties.VariableNames = {'NumberOfAcademicStaffIn_11_ChemistryOr_113_ChemistryAtSpecified' 'Var2' 'Var3' 'Var4' 'Var5' 'Var6' 'Var7' 'Var8' 'Var9'}; % elseif k ==0
k = 0;
for k = k:l
joined = join(joined, in(i+k,:), 'Key', 'NumberOfAcademicStaffIn_11_ChemistryOr_113_ChemistryAtSpecified');
k = k+1
if (isempty(fin))
new = joined
else
fin = vertcat(new, joined);
end
end
else
joined = joined1;
for k = k:l
joined = join(joined, in(i+k+1,:), 'Key', 'NumberOfAcademicStaffIn_11_ChemistryOr_113_ChemistryAtSpecified');
if (isempty(fin))
new = joined
else
fin = vertcat(new, joined);
end
end
i = i+l+1;
end
end
  1 comentario
Maadhav Akula
Maadhav Akula el 8 de Nov. de 2019
It would be helpful if you could provide the error message or the data(.csv) file.

Iniciar sesión para comentar.

Respuestas (1)

Mohammad Sami
Mohammad Sami el 4 de Nov. de 2019
If I understand your question correctly, there are 2 very powerful matlab functions that you can use.
These are named "unstack" and "groupsummary".
Using groupsummary you can summarise the values in the table for each university using your specified grouping variables and statistical functions (min max mean e.t.c.)
Using the unstack function you can acheive your restructuring from T1 to T2. Essentially a cross tabulation of data. (Long table to wide table)
You can combine the two together based on your need
T1 = in(4:end-3,:);
% filter the table to remove the missing 20 years and under
% you can use the contains function to find rows containing 20 years and younger
% idx = contains(T1.(2),'20 years and under');
T2 = unstack(T1,'value','year'); % doc unstack for more info
% assume variable names are 'name' 'year' 'value' change as needed
% This will output a crosstabulated table where rows would university name and columns name would be years
% and each cell will contain the value in the corresponding value variable.
% Name x_1999 x_2000 ....
% Uni1 val1 val2 ....
% ....
If you want to count how many times a university appears in your data you can use groupsummary
T1Count = groupsummary(T1,'name'); % check doc groupsummary for more options

Categorías

Más información sobre Matrices and Arrays en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by