Sort cell strings according to specific subsets of those cell strings

Question

0 votes

Let's say I have a cell string with values:

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
              '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
              '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
              '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
              '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
              '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
              '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
              '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC';... 
              '2009.272.17.57.29.3695.AZ.SMER..BHZ.R.SAC';...
              '2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC';...
              '2009.272.17.57.30.0000.AZ.RDM..BHN.R.SAC';...
              '2009.272.17.57.30.8000.AZ.RDM..BHZ.R.SAC';...
              '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC';...
              '2009.272.17.57.31.8500.AZ.LVA2..BHE.R.SAC';...
              '2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC';... 
              '2009.272.17.57.32.0000.AZ.WMC..BHZ.R.SAC';...   
              '2009.272.17.57.32.6750.AZ.WMC..BHN.R.SAC';...   
              '2009.272.17.57.33.3195.AZ.KNW..BHZ.R.SAC';...   
              '2009.272.17.57.33.4750.AZ.TRO..BHN.R.SAC';...   
              '2009.272.17.57.33.7750.AZ.PFO..BHN.R.SAC';...   
              '2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC';...   
              '2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC';...  
              '2009.272.17.57.34.8000.AZ.TRO..BHZ.R.SAC';...   
              '2009.272.17.57.35.0000.AZ.WMC..BHE.R.SAC';...   
              '2009.272.17.57.35.0750.AZ.RDM..BHE.R.SAC';...   
              '2009.272.17.57.35.8945.AZ.KNW..BHE.R.SAC';...   
              '2009.272.17.57.36.0250.AZ.FRD..BHE.R.SAC';...   
              '2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC';...  
              '2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC';...   
              '2009.272.17.57.36.4500.AZ.SND..BHE.R.SAC';...   
              '2009.272.17.57.36.5000.AZ.TRO..BHE.R.SAC';...   
              '2009.272.17.57.36.5195.AZ.KNW..BHN.R.SAC';...   
              '2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC'};

I want to assume that I do not know at which character positions the station name (e.g., CRY) or the component name (e.g., BHE) starts and ends. The number of periods ("."), however, will be consistent.

I have something fairly clunky to do this, but I am wondering if anyone can suggest a quick one/two-liner that would assume a string format of the general form:

YYYY.DDD.HH.MM.SS.ssss.$1.$2..$3.R.SAC

where:

$1 = Array name
$2 = Station name
$3 = Component name

And then sort the list with $2 as the primary and $3 as the secondary sort key, so that the first 6 rows in the cell string would be:

2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC
2009.272.17.57.29.9445.AZ.BZN..BHN.R.SAC
2009.272.17.57.31.9195.AZ.BZN..BHZ.R.SAC
2009.272.17.57.36.5750.AZ.CRY..BHE.R.SAC
2009.272.17.57.36.3500.AZ.CRY..BHN.R.SAC
2009.272.17.57.36.2250.AZ.CRY..BHZ.R.SAC
...
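
To make the format above concrete, here is a minimal sketch of pulling $1, $2 and $3 out of a single name by counting periods rather than character positions. The pattern and the variable names (pattern, tok, arrayName, station, component) are illustrative assumptions, not part of the original post; the sketch only assumes that the six time fields always come first and that the field lengths may vary.

```matlab
% One file name in the format from the question.
name = '2009.272.17.57.31.8250.AZ.LVA2..BHZ.R.SAC';

% Skip the six period-terminated time fields, then capture $1, $2 and $3.
% [^.]* matches one field of any length; \. matches a literal period.
pattern = '^(?:[^.]*\.){6}([^.]*)\.([^.]*)\.\.([^.]*)\.';
tok = regexp(name, pattern, 'tokens', 'once');

arrayName = tok{1};   % $1, array name
station   = tok{2};   % $2, station name
component = tok{3};   % $3, component name
```

Because the pattern keys on the periods only, a 3-character station such as PFO and a 4-character one such as LVA2 should be handled the same way.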

4 comments (2 older comments hidden)

Jan on 22 Jan 2012

It looks like the parts do *not* have the same length:

'2009.272.17.57.33.9000.AZ.PFO..BHZ.R.SAC'

'2009.272.17.57.34.1750.AZ.LVA2..BHN.R.SAC'

Dr. Seis on 22 Jan 2012

Oh, his question was about the "component" names, which all have the same number of characters (i.e., 3). The "station" names do not: they range from 3 to 4 characters.

Answer 1

Oleg Komarov on 22 Jan 2012

2 votes

% Split using |'.'| as the delimiter
splt = regexpi(filename,'\.','split');
% Sort according to the 8th and 10th column
[sorted,idx] = sortrows(cat(1,splt{:}),[8,10])

Now you can use the sorted split array, or apply idx to filename (i.e., filename(idx)) to get the sorted list of names.
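
As a hedged illustration of that last step, the sketch below runs the same split-and-sortrows idea on four of the names from the question and then reorders the original strings. It uses regexp rather than regexpi (case does not matter for a literal period), and the names parts, sortedNames and sortedStations are assumptions introduced here. It relies on every name containing the same number of periods, so that all rows split into the same number of fields.

```matlab
% Four of the file names from the question (column cell array).
filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC'; ...
            '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC'; ...
            '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC'; ...
            '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC'};

% Split every name at the periods; each result is a 1-by-12 cell row.
splt = regexp(filename, '\.', 'split');

% Stack the rows into an n-by-12 cell array of fields.
parts = vertcat(splt{:});

% Column 8 holds the station, column 10 the component
% (column 9 is the empty field between the two adjacent periods).
[~, idx] = sortrows(parts, [8 10]);

% Reorder the original strings and, for inspection, the station column.
sortedNames    = filename(idx);
sortedStations = parts(idx, 8);
```

For these four names the FRD entries should come first (BHN before BHZ), followed by the SMER entries (BHE before BHN).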

2 comments

Dr. Seis on 22 Jan 2012

Just what I was looking for. Thanks, Oleg!

Jan on 23 Jan 2012

+1 for the compact REGEXP call.

Answer 2

Jan on 22 Jan 2012

1 vote

filename = {'2009.272.17.57.23.8445.AZ.SMER..BHE.R.SAC';...  
            '2009.272.17.57.24.5500.AZ.FRD..BHN.R.SAC';...   
            '2009.272.17.57.27.5445.AZ.SMER..BHN.R.SAC';...
            '2009.272.17.57.27.8000.AZ.SND..BHZ.R.SAC';... 
            '2009.272.17.57.27.9445.AZ.BZN..BHE.R.SAC';...
            '2009.272.17.57.28.7000.AZ.SND..BHN.R.SAC';...
            '2009.272.17.57.29.1250.AZ.FRD..BHZ.R.SAC';...
            '2009.272.17.57.29.2250.AZ.PFO..BHE.R.SAC'};
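n = numel(filename);
C2 = cell(1, n);   % station names ($2)
C3 = cell(1, n);   % component names ($3)
for iC = 1:n
  % Skip the first 26 characters (fixed-width time fields plus 'AZ.'),
  % then split the remainder at the periods:
  D      = textscan(filename{iC}(27:end), '%s', 'Delimiter', '.');
  C2{iC} = D{1}{1};   % station
  C3{iC} = D{1}{3};   % component (field 2 is the empty field between the two periods)
end
% A kind of SORTROWS: sort by the secondary key first, then by the primary key
[dummy, ind3] = sort(C3);
[dummy, ind2] = sort(C2(ind3));
index         = ind3(ind2);
filename      = filename(index);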

3 comments (1 older comment hidden)

Dr. Seis on 22 Jan 2012

Thanks for the updated code... +1!

Jan on 23 Jan 2012

While Oleg's REGEXP is much nicer than calling TEXTSCAN in a loop, SORTROWS does exactly the same as my sorting method, but with a lot of overhead.
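
The equivalence mentioned in this comment can be checked on a small case. The sketch below is an illustration under stated assumptions, not code from the thread: the key lists station and component are made up from names in the question, and the two-pass approach relies on sort keeping equal elements in their original order.

```matlab
% Hypothetical key columns taken from four of the file names.
station   = {'SMER'; 'FRD'; 'SMER'; 'FRD'};
component = {'BHN';  'BHZ'; 'BHE';  'BHN'};

% Pass 1: order by the secondary key (component).
[~, indC] = sort(component);

% Pass 2: order by the primary key (station); entries with the same
% station keep the component order established in pass 1.
[~, indS] = sort(station(indC));
index = indC(indS);

% Reference: let sortrows handle both keys at once.
[~, idx] = sortrows([station, component], [1 2]);

% True if both approaches give the same permutation.
sameOrder = isequal(index(:), idx(:));
```

If sameOrder is true, the only difference between the two approaches is how the permutation is computed, which is the point of the comment above.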

Answer 3

Dr. Seis on 22 Jan 2012

0 votes

Here is the clunky version I have been using:

     numFiles = numel(filename);
     sortcell = {''};
     sortind = zeros(numFiles,4);
     for i = 1 : numFiles
         sortind(i,2)=strfind(filename{i},'..')-1;
         for j = sortind(i,2):-1:1
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,1)=j;
         end
         sortind(i,3)=sortind(i,2)+3;
         for j = sortind(i,3):length(filename{i})
             if isequal(filename{i}(j),'.')
                 break;
             end
             sortind(i,4)=j;
         end
         sortcell(i,1)=cellstr(filename{i}(sortind(i,1):sortind(i,2)));
         sortcell(i,2)=cellstr(filename{i}(sortind(i,3):sortind(i,4)));
     end
     [tempcell,tempind1]=sort(sortcell(:,2));
     [tempcell,tempind2]=sort(sortcell(tempind1,1));
     filename = filename(tempind1(tempind2));

0 comments
