How do I select max columns values based on certain column values?

Dear experts,
I have a table that looks like the following:
Obs-name   var1   var2   var3   ......   varn   seg_id
ob1        0.12   0.14   0.17   ...      1.2    1
ob2        1.2    0.2    0.14   ...      0.0    1
ob3        1.5    0.3    1.5    ...      7.2    2
ob4        2.4    4.5    2.2    ...      0.0    3
... etc.
I am doing the following:
1- I remove the first variable (the observation names), using the following:
T = readtable('data.xls');
Tnew = T(:,2:end)
2- I calculate the averages of all variables, grouped by Seg_ID:
Tavg = varfun(@mean, Tnew, 'GroupingVariables', 'Seg_ID')
Tavg now contains the average of every variable for each Seg_ID.
Now I need to select the K variables with the highest average for every Seg_ID (with K = 5), then write a table containing the Seg_ID, the GroupCount, and the names of the top K variables, like the following:
seg_id(1)   group count (3)    var1 name   var4 name    var100 name
seg_id(2)   group count (5)    var2 name   var3 name    var15 name
seg_id(3)   group count (10)   var1 name   var12 name   varn name
... etc.
The attached file includes a portion of my data table.
I would be grateful to anyone who can suggest a solution. Thanks.
3 comments
ahmed obaid on 4 Apr 2017
The mean of a particular variable, taken over the values that belong to the same segment ID?
ahmed obaid on 4 Apr 2017
I have updated my code and it works up to the step where I need to select the top K variables and print their names.

Accepted Answer

Guillaume on 4 Apr 2017
Edited: Guillaume on 4 Apr 2017
Not really clear on your question. If you're wanting to calculate the mean of the variables, grouped by seg_ID, then:
varfun(@mean, T, 'InputVariables', 2:width(T)-1, 'GroupingVariables', 'seg_ID')
I'm not sure what you want to do after that
edit: maybe you're looking for this function
function varargout = filterrows(rowvalues, varnames)
%function to be used with rowfun, returns the 5 highest column means of the rows together with their names, interlaced.
%requires 'SeparateInputs', false in the rowfun options.
rowmeans = mean(rowvalues, 1);
[~, order] = sort(rowmeans, 'descend');
varargout = [num2cell(rowmeans(order(1:5))); num2cell(varnames(order(1:5)))]; %will be reshaped into a row vector by rowfun
end
which you can use with rowfun:
rowfun(@(rowvalues) filterrows(rowvalues, T.Properties.VariableNames(2:end-1)), ...
T, ...
'InputVariables', 2:width(T)-1, ...
'GroupingVariables', 'seg_ID', ...
'SeparateInputs', false, ...
'NumOutputs', 10)
Not sure it's a good idea, though.
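As an alternative sketch (not part of the original answer), the names of the top K variables can also be read off the varfun output directly by sorting each row of means. The toy values below are made up for illustration, K is set to 2 so the small table is enough, and the variable name seg_ID is assumed to match the real data:

```matlab
% Toy data with made-up values; the real table would come from readtable('data.xls').
obs    = {'ob1'; 'ob2'; 'ob3'; 'ob4'};
var1   = [0.12; 1.2; 1.5; 2.4];
var2   = [0.14; 0.2; 0.3; 4.5];
var3   = [0.17; 0.14; 1.4; 2.2];
var4   = [1.2; 0.0; 7.2; 0.0];
seg_ID = [1; 1; 2; 3];
T = table(obs, var1, var2, var3, var4, seg_ID);

K = 2;  % number of top variables per segment (the question uses K = 5)

% Per-segment means of every variable between the first and last column.
Tavg = varfun(@mean, T, 'InputVariables', 2:width(T)-1, 'GroupingVariables', 'seg_ID');

% Tavg holds seg_ID, GroupCount, then one mean_* column per input variable.
means    = Tavg{:, 3:end};                       % one row of means per segment
varnames = T.Properties.VariableNames(2:end-1);  % names matching the columns of means

% Sort each row in descending order and keep the first K column indices.
[~, order] = sort(means, 2, 'descend');
topnames   = varnames(order(:, 1:K));            % nSegments-by-K cell array of names

% Combine the segment ID and group count with the selected names.
result = [Tavg(:, {'seg_ID', 'GroupCount'}), cell2table(topnames)];
```

This avoids the custom function at the cost of one extra indexing step. If two means in a segment are equal, which of the tied names comes first depends on how sort orders ties.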
3 comments
Guillaume on 4 Apr 2017
Isn't the output of my rowfun code (all the bit after edit) exactly what you've described?
If it's just the variable names, without the values that you want, then changing it to
function varargout = filterrows(rowvalues, varnames)
%function to be used with rowfun, returns the names of the 5 columns with the highest means.
%requires 'SeparateInputs', false in the rowfun options.
rowmeans = mean(rowvalues, 1);
[~, order] = sort(rowmeans, 'descend');
varargout = num2cell(varnames(order(1:5)));
end
and calling it with
rowfun(@(rowvalues) filterrows(rowvalues, T.Properties.VariableNames(2:end-1)), ...
T, ...
'InputVariables', 2:width(T)-1, ...
'GroupingVariables', 'seg_ID', ...
'SeparateInputs', false, ...
'NumOutputs', 5)
will give you that.
ahmed obaid on 4 Apr 2017
Thank you a lot .... very very smart solution .... thanks.