Split and count unique strings in cell array

I have a cell array in the form of:
A =
B25
A35
L35 J23
K32 I25
B25 ...
where certain elements repeat. I need to count how many unique elements there are and then list the number of occurrences of each element. For the above example it would be something like:
B25 ... 2
L35 ... 1
K32 ... 1 etc.
I tried different combinations of strsplit, regexp and unique. Some returned errors; others treated each whole row as a single element, so for the example above they reported 4 unique elements instead of 6, because L35 J23 was counted as 1 element, not 2. I was given a hint that converting to categorical might help, but I am not sure how to use its functions to get the desired result.

Accepted Answer

A = {'B25';'A35';'L35 J23';'K32 I25';'B25'};
B = regexp(A,'\S+','match');
T = cell2table([B{:}].');
S = groupsummary(T,'Var1')
S = 6×2 table
    Var1       GroupCount
    _______    __________
    {'A35'}    1
    {'B25'}    2
    {'I25'}    1
    {'J23'}    1
    {'K32'}    1
    {'L35'}    1

1 Comment

Josipe Jurcic on 25 Mar 2022
Thanks for your reply.
This one seems to do it. Thanks again.
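For readers who want to see what each step of the accepted answer produces, the following is a minimal sketch of the same split-and-flatten idea. Note that it swaps in unique and accumarray for groupsummary, so it is a variant and not the answerer's method; the variable names tokens, names, idx and counts are illustrative assumptions.

```matlab
A = {'B25';'A35';'L35 J23';'K32 I25';'B25'};

% Split every row on whitespace. B has the same size as A, and each
% cell of B holds a 1-by-k cell array of the tokens found in that row.
B = regexp(A, '\S+', 'match');

% Concatenate the nested row cells and transpose into one column of tokens.
tokens = [B{:}].';

% unique returns the sorted distinct tokens plus, in idx, the group
% number of every token; accumarray sums a 1 per token for each group.
[names, ~, idx] = unique(tokens);
counts = accumarray(idx, 1);

% Print one "token count" pair per line.
for k = 1:numel(names)
    fprintf('%s %d\n', names{k}, counts(k));
end
```

This route avoids building a table, which may be convenient when only the counts are needed for further computation.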

More Answers (2)

Mohammed Hamaidi on 25 Mar 2022
A loop solution:
C = unique(A); nc = length(C);
B = char(A); nb = length(B);
D = zeros(nc,1);
for i = 1:nc
    for j = 1:nb
        if strcmp(B(j,:), char(C{i}))
            D(i) = D(i) + 1;
        end
    end
end
for i = 1:nc
    disp([char(C{i}) ' ' num2str(D(i))])
end

1 Comment

Josipe Jurcic on 25 Mar 2022
Thanks for your reply.
It throws this error:
Index in position 1 exceeds array bounds. Index must not exceed 6.
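The error is consistent with how length behaves on a character matrix: char(A) pads the rows to equal width, and length returns the largest dimension, so with 6 rows and a widest row of 7 characters ('L35 J23') the inner loop would try to read a 7th row. The padding also means strcmp on a padded row would not match the unpadded entry in C. Below is a hedged sketch of a loop that avoids both issues and splits each row first; it assumes the six-row example from this thread, and the names parts, tokens and nt are illustrative.

```matlab
A = {'B25';'A35';'L35 J23';'K32 I25';'B25';'L35 J10'};

% Split each row on whitespace so 'L35 J23' contributes two tokens,
% then flatten the per-row results into a single column.
parts  = cellfun(@strsplit, A, 'UniformOutput', false);
tokens = [parts{:}].';

C  = unique(tokens);     % sorted distinct tokens
nc = numel(C);
nt = numel(tokens);      % number of tokens, not the widest dimension
D  = zeros(nc, 1);

% Compare every token against every distinct value, working on the
% cell contents directly so no space padding is involved.
for i = 1:nc
    for j = 1:nt
        if strcmp(tokens{j}, C{i})
            D(i) = D(i) + 1;
        end
    end
end

for i = 1:nc
    disp([C{i} ' ' num2str(D(i))])
end
```

Keeping the tokens in a cell array, instead of converting to a char matrix, is what lets strings of different lengths be compared without padding.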

Use the groupsummary function:
A = {'B25';'A35';'L35 J23';'K32 I25';'B25'};
T = table(A);
groupsummary(T,'A')
ans = 4×2 table
    A              GroupCount
    ___________    __________
    {'A35'    }    1
    {'B25'    }    2
    {'K32 I25'}    1
    {'L35 J23'}    1

3 Comments

Josipe Jurcic on 25 Mar 2022
First of all, thanks for your reply.
This won't do, as it's not counting the occurrences of each 'XXX' string. Thinking back, I don't think my question was clear enough: K32 and I25 should be counted separately, which is why I mentioned strsplit. I was trying to split that row into two rows, the first holding K32 and the second I25. I tried that in a for loop but got a dimension mismatch error. If you do:
A = {'B25';'A35';'L35 J23';'K32 I25';'B25'; 'L35 J10'};
then it will just add L35 J10 in its own group but not tell me that there are two L35 in this array. Hopefully it's more clear now.
Just add a few things as follows:
A = {'B25';'A35';'L35 J23';'K32 I25';'B25'; 'L35 J10'};
B = cellfun(@strsplit,A,'UniformOutput',false); % Split each row on whitespace
C = cat(2,B{:})'; % Combine as a column
T = table(C);
groupsummary(T,'C')
ans = 7×2 table
    C          GroupCount
    _______    __________
    {'A35'}    1
    {'B25'}    2
    {'I25'}    1
    {'J10'}    1
    {'J23'}    1
    {'K32'}    1
    {'L35'}    2
Josipe Jurcic on 25 Mar 2022
Thanks for your reply.
This works as well. Thanks again.
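The question also mentions a hint about converting to categorical. Neither answer uses it, so the following is only a sketch of how that route might look, assuming the tokens are split first as in the answers above. It relies on the categorical functions categories and countcats; the names tokens, c, names and counts are illustrative.

```matlab
A = {'B25';'A35';'L35 J23';'K32 I25';'B25';'L35 J10'};

% Split each row on whitespace and flatten into one column of tokens.
tokens = regexp(A, '\S+', 'match');
tokens = [tokens{:}].';

% Convert to categorical; each distinct token becomes one category.
c = categorical(tokens);

names  = categories(c);   % distinct tokens, as a cell column
counts = countcats(c);    % occurrences per category, in the same order

% Collect the result in a table for display.
T = table(names, counts);
disp(T)
```

The number of unique elements is then numel(names), and the per-element counts line up row by row with names.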

Asked: 25 Mar 2022

Commented: 25 Mar 2022
