creating a string multiple string filter on multiple columns

Question

Michael Angeles on 7 Feb 2022

Direct link to this question:

https://es.mathworks.com/matlabcentral/answers/1644940-creating-a-string-multiple-string-filter-on-multiple-columns

Edited: DGM on 9 Feb 2022

Hello, I have an n x m array (rows by columns) on which I was previously able to do some basic analysis.

How can I create a multiple-string filter for each column and remove the unwanted strings? After filtering, I then need to concatenate what remains in each column.

data = randn(n,m);
results = cell(1,m);
for jj = 1:m
    results{jj} = perform_analysis(data(:,jj));
end

Example:

The first filter is AA, BB, CC, DD (each independent of the others); then concatenate "some data" on the column x.

Continue this type of filtering until the unwanted strings have been removed from every column and the remaining data has been concatenated for all columns.

Thanks...

3 comments

Michael Angeles on 7 Feb 2022

Hi DGM,

The example array would be something like the one below, but with the whole filtered data set stored in a new n x m array variable. I was thinking of a nested for loop, but I couldn't get it to work...

Jan on 7 Feb 2022

I do not understand what you are asking for. What does this mean: concatenate "some data" on the column x?

What is the shown table? A string array? Then setdiff should work.

Answer 1

dpb on 7 Feb 2022

Direct link to this answer:

https://es.mathworks.com/matlabcentral/answers/1644940-creating-a-string-multiple-string-filter-on-multiple-columns#answer_890800

Without knowing the real application or how the data are obtained, I presume the data are already of character type:

>> n=16;m=2;data =cellstr(char(randi([65 70],n,m)))
data =
  16×1 cell array
    {'FC'}
    {'FE'}
    {'DD'}
    {'AD'}
    {'AF'}
    {'BB'}
    {'FE'}
    {'BE'}
    {'EC'}
    {'BD'}
    {'FA'}
    {'CA'}
    {'BD'}
    {'BE'}
    {'DF'}
    {'CA'}
>> result=data(~matches(data,{'AA','BB','CC','DD'}))
result =
  14×1 cell array
    {'FC'}
    {'FE'}
    {'AD'}
    {'AF'}
    {'FE'}
    {'BE'}
    {'EC'}
    {'BD'}
    {'FA'}
    {'CA'}
    {'BD'}
    {'BE'}
    {'DF'}
    {'CA'}
>> 

Since you pasted an image instead of data, the starting array above is randomly generated rather than yours; pasting the actual example data as text is much better for responders and is more likely to get a solution to your particular problem if it is more data-dependent than this one.

If, on the other hand, the data are really generated as numeric and then combined as above, then one can get there directly from the numerics:

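>> % assumes data here is the numeric n-by-2 array randi([65 70],n,m), before cellstr(char(...))
>> % note: this drops every doubled pair (e.g. 'EE' too), not only AA, BB, CC, DD
>> result=cellstr(char(data(data(:,1)~=data(:,2),:)))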
result =
  14×1 cell array
    {'FC'}
    {'FE'}
    {'AD'}
    {'AF'}
    {'FE'}
    {'BE'}
    {'EC'}
    {'BD'}
    {'FA'}
    {'CA'}
    {'BD'}
    {'BE'}
    {'DF'}
    {'CA'}
>> 
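If only the pairs AA through DD should go, the numeric route can be narrowed a little. The following is a minimal sketch under stated assumptions: the variable names (codes, paired, inlist) and the small fixed input are made up for illustration and are not from the original data.

```matlab
% Hypothetical numeric input: each row holds two character codes.
codes = [70 67; 68 68; 65 68; 66 66; 69 69];   % 'FC','DD','AD','BB','EE'

% A row is a doubled pair when both codes are equal.
paired = codes(:,1) == codes(:,2);

% Restrict the removal to the letters A..D (codes 65..68).
inlist = ismember(codes(:,1), double('ABCD'));

% Keep everything except doubled pairs of A..D, then convert to text.
result = cellstr(char(codes(~(paired & inlist), :)));
```

Here 'EE' survives because E is not in the removal list, whereas the plain inequality test would have dropped it.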

3 comments

Michael Angeles on 9 Feb 2022

I need something that iterates over multiple columns, removes the unwanted strings, and then concatenates the remaining data for each column independently.

DGM on 9 Feb 2022

How would you reshape this array into 2D after removing the matches?

A = {'AA' 'AB' 'AC'; 'BB' 'BA' 'BC'; 'CA' 'CB' 'CC'}
A = 3×3 cell array
    {'AA'}    {'AB'}    {'AC'}
    {'BB'}    {'BA'}    {'BC'}
    {'CA'}    {'CB'}    {'CC'}

Arrays must be rectangular, so what is an acceptable workaround? Padding the columns with empty cells?
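To make the problem concrete, here is a quick sketch that counts how many entries survive in each column of that array. The removal list is taken from the question as an assumption, and the names keep and nkept are only illustrative.

```matlab
A = {'AA' 'AB' 'AC'; 'BB' 'BA' 'BC'; 'CA' 'CB' 'CC'};
toremove = {'AA','BB','CC','DD'};   % assumed filter list, as in the question

% ismember on a 2D cell array of char vectors returns a mask of the same size.
keep = ~ismember(A, toremove);

% Number of surviving entries per column; unequal counts mean the
% filtered columns cannot be stacked side by side without padding.
nkept = sum(keep, 1);
```

For this array the three columns keep 1, 3 and 2 entries respectively, so a plain rectangular result is not possible.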

Answer 2

DGM on 7 Feb 2022

Direct link to this answer:

https://es.mathworks.com/matlabcentral/answers/1644940-creating-a-string-multiple-string-filter-on-multiple-columns#answer_890795

Assuming you're dealing with a cell array of character vectors or a string array:

A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
toremove = {'AA','BB','CC'};
% you could do it with ismember()
B = A(~ismember(A,toremove))
B = 6×1 cell array
    {'AB'}
    {'BA'}
    {'AC'}
    {'CA'}
    {'BC'}
    {'CB'}
% or you could use setdiff()
C = setdiff(A,toremove,'stable')
C = 6×1 cell array
    {'AB'}
    {'BA'}
    {'AC'}
    {'CA'}
    {'BC'}
    {'CB'}

2 comments

Michael Angeles on 9 Feb 2022

Will this work on 24 columns? I seem to be getting an error...

DGM on 9 Feb 2022

Edited: DGM on 9 Feb 2022

It should work fine on 2D arrays, but be aware that the result will no longer be 2D: logical indexing with a mask returns a column vector.

A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
A = [A A(randperm(numel(A))) A(randperm(numel(A)))]; %replicate to 3 columns
toremove = {'AA','BB','CC'};
% you could do it with ismember()
B = A(~ismember(A,toremove))
B = 18×1 cell array
    {'AB'}
    {'BA'}
    {'AC'}
    {'CA'}
    {'BC'}
    {'CB'}
    {'AB'}
    {'AC'}
    {'CA'}
    {'BA'}
    {'BC'}
    {'CB'}
    {'BA'}
    {'BC'}
    {'CB'}
    {'AB'}
    {'AC'}
    {'CA'}
% or you could use setdiff()
C = setdiff(A,toremove,'stable')
C = 6×1 cell array
    {'AB'}
    {'BA'}
    {'AC'}
    {'CA'}
    {'BC'}
    {'CB'}

Note that setdiff() returns only the unique values, whereas indexing with ismember() keeps every occurrence. Since A in this case is three randomly permuted copies of the same column, B is three times as long as C, as it contains three copies of each retained element.

If you are getting errors, you'll have to describe exactly what you're doing and what error you're getting.

EDIT:

Regarding columnwise filtering and padding:

A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
A = repmat(A,[1 3]);
A(:) = A(randperm(numel(A))) % 9x3, but matches aren't uniformly distributed
A = 9×3 cell array
    {'BA'}    {'CC'}    {'BC'}
    {'CB'}    {'BB'}    {'AA'}
    {'CA'}    {'BB'}    {'AB'}
    {'AB'}    {'AA'}    {'AC'}
    {'CA'}    {'AC'}    {'CB'}
    {'BC'}    {'CA'}    {'CC'}
    {'CC'}    {'BB'}    {'AB'}
    {'CB'}    {'BA'}    {'AA'}
    {'BA'}    {'BC'}    {'AC'}
toremove = {'AA','BB','CC'};
B = cell(size(A));
maxr = 0;
for c = 1:size(A,2)
    thisb = A(~ismember(A(:,c),toremove),c);
    B(1:numel(thisb),c) = thisb;
    maxr = max(maxr,numel(thisb));
end
B = B(1:maxr,:)
B = 8×3 cell array
    {'BA'}    {'AC'      }    {'BC'      }
    {'CB'}    {'CA'      }    {'AB'      }
    {'CA'}    {'BA'      }    {'AC'      }
    {'AB'}    {'BC'      }    {'CB'      }
    {'CA'}    {0×0 double}    {'AB'      }
    {'BC'}    {0×0 double}    {'AC'      }
    {'CB'}    {0×0 double}    {0×0 double}
    {'BA'}    {0×0 double}    {0×0 double}
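If the 0×0 double padding gets in the way later (for example, iscellstr() is false for such an array), one option is to pad with empty char vectors instead. This is only a sketch of that variation on a small made-up array; preallocating with repmat({''},...) is my own substitution, not part of the loop above.

```matlab
A = {'BA' 'CC' 'BC'; 'CB' 'BB' 'AA'; 'CA' 'AC' 'AB'};   % small illustrative input
toremove = {'AA','BB','CC'};

keep = ~ismember(A, toremove);      % mask of entries to keep
maxr = max(sum(keep, 1));           % tallest filtered column

% Preallocate with empty char vectors so the result stays a cellstr.
B = repmat({''}, maxr, size(A,2));
for c = 1:size(A,2)
    thisb = A(keep(:,c), c);        % survivors of column c, in order
    B(1:numel(thisb), c) = thisb;   % shorter columns keep '' padding at the bottom
end
```

With this padding, functions that expect a cell array of char vectors, such as ismember() or strcmp(), can still be applied to B directly.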

Alternatively, you could put each column in a nested cell array:

B = cell([1 size(A,2)]);
for c = 1:size(A,2)
    B{c} = A(~ismember(A(:,c),toremove),c);
end
B
B = 1×3 cell array
    {8×1 cell}    {4×1 cell}    {6×1 cell}

Again, something similar can be done with setdiff() if you only want the unique results.
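A rough sketch of that setdiff() variant follows, using a small made-up array. The last step assumes that "concatenate" means joining each column's remaining strings into one comma-separated char vector; that is a guess at the intent, and the names U and joined are only illustrative.

```matlab
A = {'BA' 'CC'; 'CB' 'BB'; 'BA' 'AC'; 'AA' 'AC'};   % illustrative input with duplicates
toremove = {'AA','BB','CC'};

U = cell(1, size(A,2));        % unique survivors per column
joined = cell(1, size(A,2));   % one joined char vector per column (assumed meaning)
for c = 1:size(A,2)
    % 'stable' keeps the order of first appearance; duplicates are dropped.
    U{c} = setdiff(A(:,c), toremove, 'stable');
    % strjoin expects a row, so reshape before joining.
    joined{c} = strjoin(reshape(U{c}, 1, []), ',');
end
```

If duplicates should be kept in the joined text, replace the setdiff() line with the ismember() indexing from the loops above.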
