Sort cell strings according to specific subsets of those cell strings

4 visualizaciones (últimos 30 días)
Let's say I have a cell string with values:
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...
'2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...
'2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
'2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';...
'2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
'2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
'2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
'2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC';...
'2009.272.17.57.29.3695.AZ.SMER..BHZ.R.SAC';...
'2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC';...
'2009.272.17.57.30.0000.AZ.RDM..BHN.R.SAC';...
'2009.272.17.57.30.8000.AZ.RDM..BHZ.R.SAC';...
'2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC';...
'2009.272.17.57.31.8500.AZ.LVA2..BHE.R.SAC';...
'2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC';...
'2009.272.17.57.32.0000.AZ.WMC..BHZ.R.SAC';...
'2009.272.17.57.32.6750.AZ.WMC..BHN.R.SAC';...
'2009.272.17.57.33.3195.AZ.KNW..BHZ.R.SAC';...
'2009.272.17.57.33.4750.AZ.TRO..BHN.R.SAC';...
'2009.272.17.57.33.7750.AZ.PFO..BHN.R.SAC';...
'2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC';...
'2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC';...
'2009.272.17.57.34.8000.AZ.TRO..BHZ.R.SAC';...
'2009.272.17.57.35.0000.AZ.WMC..BHE.R.SAC';...
'2009.272.17.57.35.0750.AZ.RDM..BHE.R.SAC';...
'2009.272.17.57.35.8945.AZ.KNW..BHE.R.SAC';...
'2009.272.17.57.36.0250.AZ.FRD..BHE.R.SAC';...
'2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC';...
'2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC';...
'2009.272.17.57.36.4500.AZ.SND..BHE.R.SAC';...
'2009.272.17.57.36.5000.AZ.TRO..BHE.R.SAC';...
'2009.272.17.57.36.5195.AZ.KNW..BHN.R.SAC';...
'2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC'};
I want to be able to assume that I do not know what character the station name (e.g., CRY) or component name (e.g., BHE) starts and ends on. Though, the number of periods (".") will be consistent.
I have something fairly clunky to do this, but I am wondering if anyone can suggest a quick one/two-liner that would assume a string format of the general form:
YYYY.DDD.HH.MM.SS.ssss.$1.$2..$3.R.SAC
where:
$1 = Array name $2 = Station name $3 = Component name
And then sort the list with the primary and secondary sort order according to $2 and $3, respectively, so that the first 6 rows in the cell string would be:
2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC
2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC
2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC
2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC
2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC
2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC
...
  4 comentarios
Jan
Jan el 22 de En. de 2012
It looks like the parts do *not* have the same length:
'2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC'
'2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC'
Dr. Seis
Dr. Seis el 22 de En. de 2012
Oh, his question was related to the "component" name, which are all the same number of characters (i.e., 3). The "station" names are not the same - they range from 3 to 4 characters.

Iniciar sesión para comentar.

Respuesta aceptada

Oleg Komarov
Oleg Komarov el 22 de En. de 2012
% Split using |'.'| as the delimiter
splt = regexpi(filename,'\.','split');
% Sort according to the 8th and 10th column
[sorted,idx] = sortrows(cat(1,splt{:}),[8,10])
Now you can use the sorted split array or apply idx to filename

Más respuestas (2)

Jan
Jan el 22 de En. de 2012
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...
'2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...
'2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
'2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';...
'2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
'2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
'2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
'2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC'};
n = numel(filename);
C2 = cell(1, n);
C3 = cell(1, n);
for iC = 1:n
D = textscan(filename{iC}(27:end), '%s', 'Delimiter', '.');
C2{iC} = D{1}{1};
C3{iC} = D{1}{3};
end
% A kind of SORTROWS:
[dummy, ind3] = sort(C3);
[dummy, ind2] = sort(C2(ind3));
index = ind3(ind2);
filename = filename(index);
  3 comentarios
Dr. Seis
Dr. Seis el 22 de En. de 2012
Thanks for the updated code... +1!
Jan
Jan el 23 de En. de 2012
While Oleg's REGEXP is much nicer than calling TEXTSCAN in a loop, SORTROWS does exactly the same as my sorting method, but with a lot of overhead.

Iniciar sesión para comentar.


Dr. Seis
Dr. Seis el 22 de En. de 2012
Here is the clunky version I have been using:
numFiles = numel(filename);
sortcell = {''};
sortind = zeros(numFiles,4);
for i = 1 : numFiles
sortind(i,2)=strfind(filename{i},'..')-1;
for j = sortind(i,2):-1:1
if isequal(filename{i}(j),'.')
break;
end
sortind(i,1)=j;
end
sortind(i,3)=sortind(i,2)+3;
for j = sortind(i,3):length(filename{i})
if isequal(filename{i}(j),'.')
break;
end
sortind(i,4)=j;
end
sortcell(i,1)=cellstr(filename{i}(sortind(i,1):sortind(i,2)));
sortcell(i,2)=cellstr(filename{i}(sortind(i,3):sortind(i,4)));
end
[tempcell,tempind1]=sort(sortcell(:,2));
[tempcell,tempind2]=sort(sortcell(tempind1,1));
filename = filename(tempind1(tempind2));

Categorías

Más información sobre Shifting and Sorting Matrices en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by