Data Analysis advance filtering

2 visualizaciones (últimos 30 días)
Devansh Patel
Devansh Patel el 19 de Feb. de 2019
Editada: Sean de Wolski el 19 de Feb. de 2019
I have table which has 200 variables. 4 different types. A1,A2...A50,B1,B2...B50,C1,C2....C50,D1,D2.....D50. "THESE ARE THE COLUMN NAMES." I want to apply multiple filters to multiple columns.
Like
(( B1,B2... 50 > 3) & (A1, A2....A50 > 0.1) & (C1,C2....C50 < 1))
And I want to remove each row which does not fit the criteria from the table.
Please help.

Respuestas (2)

Bob Thompson
Bob Thompson el 19 de Feb. de 2019
You can do this with logic indexing.
data = ... % Your main data block
conditions = data(:,1)>0.1 & data(:,2)>3 & data(:,3)<1;
data2 = dat(conditions,:);
  4 comentarios
Bob Thompson
Bob Thompson el 19 de Feb. de 2019
Editada: Bob Thompson el 19 de Feb. de 2019
I'm a bit confused on what you mean by 'columns.' Do you have multiple T variables? You're T1 variable already contains multiple rows and columns which are being filtered, so I'm going to assume you're referring to different tables, rather than different columns within a table. If this is incorrect, please feel free to say so and explain where I am wrong.
If your data is in multiple variables, is it possible to put it into a larger matrix? Possibly using a cell matrix, where each cell is a variable? This will make it possible to index through the different variables, and then you are able to use a for loop to filter all of the data.
T = {T1,T3};
for i = 1:size(T,2)
T2{i} = T{i}(T{i}{:,2}>3 & T{i}{:,3}<0.1,:);
end
EDIT**
If you do actually mean 'columns' of a single variable, and all of your data is just doubles, I would strongly advise turning your cell array into a regular matrix.
T1 = cell2mat(T1);
This will make the logical operations much easier, and you will be less likely to run into issues with concatonating that may occur with cells.
I think that you will be better off keeping your logic checks as single rows, and using more checks for a single statement. This is my personal opinion, but I think it is better because while the statements themselves do get longer and more convoluted it is easier to make appropriate adjustments to columns or conditions to be met.
T2 = T1(T1(:,7)>3 & T1(:,8)>3 & T1(:,2)<0.1 & T1(:,3)<0.1,:);
Devansh Patel
Devansh Patel el 19 de Feb. de 2019
All the columns are in the same table. A1...A2... D49,D50 are all columns. What I'm trying to do is filter A1,A2...A50 with same conditions like
(( B1,B2... 50 > 3) & (A1, A2....A50 > 0.1) & (C1,C2....C50 < 1))
And filter out all row of a table. Even if single criteria does not match.

Iniciar sesión para comentar.


Sean de Wolski
Sean de Wolski el 19 de Feb. de 2019
Editada: Sean de Wolski el 19 de Feb. de 2019
Tested in R2018b. May need to convert strings to cellstrs in older releases.
%% Sample data
Data = array2table(rand(2,200));
names = ["A";"B";"C";"D"]+(1:50);
Data.Properties.VariableNames = names(:);
Thresh = array2table(rand(1,4), 'VariableNames',["A"; "B"; "C"; "D"]);
%% engine
rix = false(height(Data), 1); % Remove rows
for ii = 1:width(Thresh)
% Corresponding variables
vars = startsWith(Data.Properties.VariableNames, Thresh.Properties.VariableNames{ii});
% previously set to remove or matches criteria here.
rix = rix | any(Data{:, vars} > Data{1, ii}, 2);
end
Data(rix,:) = [] %Remove them
Here I use > for every operation. You could use an operation function handle as well if necessary. Store the operations in a cell array {@le, @ge, @lt} and then use the corresponding one on that iteration of the loop.

Categorías

Más información sobre Matrices and Arrays en Help Center y File Exchange.

Productos


Versión

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by