strcmp and rows of dataset table

Bina on 27 Dec 2011
I have a text dataset with 4 columns and n rows: cl1 cl2 cl3 cl4. I want to know how I can use strcmp() to show which rows share the same cl2 and cl3 values with each other (not rows where cl2 equals cl3). For example, for the dataset below I want to show row 1 and row 4, because they have the same cl2 and cl3:
cl1 cl2 cl3 cl4
---------------------------
a b c d
d j h n
s b v y
q b c g
As I said, the dataset has n rows, so several groups of rows share the same cl2-cl3 pair. I want to build domains, for example Domain1 = {rows with one shared cl2-cl3 pair}, Domain2 = {rows with another shared cl2-cl3 pair}, ...
Please check the code below and give me an idea of what I should do. How do I use strcmp() in this case, and how do I show the target rows?
fid = fopen('Input2.txt','r')
data = textscan(fid,'%s %s %s %s')
fclose(fid)
indices = strcmp(data{2}{1},data{2})&&(data{1})
sum(indices)

Accepted Answer

Matt Tearle on 27 Dec 2011
Sounds like a job for categorical arrays! Huzzah! (Assuming you have Statistics Toolbox.) BTW, you said "dataset" but you're using cell arrays, so I assume you don't mean the dataset array in Stats TB. However, they may be a useful way to package your data. Anyway... why not make a new variable that is the combination of columns 2 and 3, and look for the unique values of that array:
twoandthree = nominal(strcat(data{2},'-',data{3}))
data = [data{:}];
domains = getlabels(twoandthree)
for k=1:length(domains)
foo = data(twoandthree==domains{k},:)
end
If you don't have Stats TB, you can achieve the same result with unique and strcmp:
twoandthree = strcat(data{2},'-',data{3})
data = [data{:}];
domains = unique(twoandthree)
for k=1:length(domains)
foo = data(strcmp(twoandthree,domains{k}),:)
end
Also, note I'm using [data{:}] to extract the four columns (each being a cell array) and concatenate them together into a single four-column table (i.e., a single n-by-4 cell array containing strings). If you're going to be accessing by rows, that's a nicer arrangement of data.
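To see the unique/strcat idea and the [data{:}] concatenation on the four sample rows from the question, here is a minimal sketch. It assumes the data is typed in by hand in the same shape textscan returns (a 1-by-4 cell whose elements are n-by-1 cell arrays of strings); the variable names key, idx, counts and rows are mine, not from the original code. It uses the third output of unique as a group index, which is a variation on the strcmp loop above rather than the same code.

```matlab
% Sample data from the question, shaped like the output of textscan:
% a 1-by-4 cell, each element an n-by-1 cell array of strings.
data = {{'a';'d';'s';'q'}, {'b';'j';'b';'b'}, {'c';'h';'v';'c'}, {'d';'n';'y';'g'}};

% One combined key per row: 'b-c', 'j-h', 'b-v', 'b-c'
key = strcat(data{2}, '-', data{3});

% domains: sorted distinct keys; idx(i): which domain row i belongs to
[domains, ~, idx] = unique(key);
idx = idx(:);                      % force a column for accumarray
counts = accumarray(idx, 1);       % number of rows in each domain

% n-by-4 cell array of strings, convenient for row access
rows = [data{:}];

% Show only the domains shared by two or more rows
for k = find(counts > 1)'
    fprintf('Domain %s: rows %s\n', domains{k}, mat2str(find(idx == k)'));
    disp(rows(idx == k, :));
end
```

For the sample data the only domain with more than one row is 'b-c', which covers rows 1 and 4, matching the expectation stated in the question.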
But, as I mentioned, dataset arrays may also make life nice, depending on what you're going to do with the subsets.
data = dataset(data{:},'VarNames',strcat('cl',cellstr(num2str((1:4)'))))
twoandthree = nominal(strcat(data.cl2,'-',data.cl3))
domains = getlabels(twoandthree)
for k=1:length(domains)
foo = data(twoandthree==domains{k},:)
end
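As for calling strcmp directly, as the question attempts: comparing a cell array of strings against a single string returns a logical vector with one entry per row, and two such vectors have to be combined with the element-wise & operator, because && only accepts scalar operands. A minimal sketch on the sample rows, assuming you want every row whose cl2 and cl3 match those of one reference row (ref, match and matchingRows are illustrative names of mine):

```matlab
% Sample data from the question, shaped like the output of textscan
data = {{'a';'d';'s';'q'}, {'b';'j';'b';'b'}, {'c';'h';'v';'c'}, {'d';'n';'y';'g'}};

ref = 1;   % hypothetical choice: compare every row against row 1

% Each strcmp call returns an n-by-1 logical vector; combine them with
% element-wise &, not the scalar-only short-circuit &&.
match = strcmp(data{2}, data{2}{ref}) & strcmp(data{3}, data{3}{ref});

matchingRows = find(match)';       % row numbers that share cl2 and cl3 with ref
rows = [data{:}];                  % n-by-4 cell array of strings
disp(rows(match, :));              % the matching rows themselves
```

This handles one reference row at a time; to get all domains at once, the loop over unique keys shown earlier is the more direct route.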
Comments
Matt Tearle on 27 Dec 2011
[Strikes heroic pose] Don't thank me. Thank logical indexing. [Rides off into sunset]
Walter Roberson on 27 Dec 2011
Indexing! Indexing! Get your red-hot Logical Indexing here!
Authorized! Signed! Get your red-hot Logical Indexing!
Vectorized! Multidimensional! Endorsed by "Shane" Tearle!
Get your red-hot Logical Indexing!
