Extracting specific repeating lines of text after a heading using fgetl and textscan

Question

Vincent Scalfani on 19 Jul 2016

Direct link to this question:

https://es.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan

Commented: Vincent Scalfani on 21 Jul 2016

Here is an example of the data I am working with. I would like to extract the line directly following each KEY tag. The files have many thousands of these, so I need to create a loop with textscan or something similar.

> <NAME>
mary
> <AGE>
30
> <KEY>
RDHQFKQIGNG
> <NAME>
john
> <AGE>
56
> <KEY>
JFJNNFNFKFNN

Desired result:

RDHQFKQIGNG
JFJNNFNFKFNN

Here is where I am at (adapted from a similar earlier question). The code does not seem to advance through the file: it works for the first KEY tag, but then grabs all of the data after it instead of just the line following each KEY line.

f = fopen('data.txt', 'rt'); 
tline = fgetl(f);
while isempty(strfind(tline, '> <KEY>'))
    if tline == -1 
        break;
    end
    tline = fgetl(f);
end
if tline ~= -1
    data = textscan(f,'%s','Delimiter','\r\n');
else
    disp('not found');
end
fclose(f);

Thanks!
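
Editorial sketch: the behavior described above is consistent with the while loop stopping at the first KEY line and textscan then reading everything that remains in the file. One possible line-by-line alternative, assuming each KEY tag is followed by exactly one value line, keeps reading to the end of the file and stores only the line after each tag. The temporary file, the variable names (fname, keys, nextLine) and the use of strtrim are assumptions made for illustration; they are not from the original post.

```matlab
% Write the sample from the question to a temporary file so the sketch runs as-is.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', '> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
                     '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN');
fclose(fid);

fid = fopen(fname, 'rt');
if fid == -1
    error('Could not open %s', fname);
end

keys = {};                      % collected lines that follow a KEY tag
tline = fgetl(fid);
while ischar(tline)             % fgetl returns -1 (not char) at end of file
    if ~isempty(strfind(tline, '> <KEY>'))
        nextLine = fgetl(fid);  % the line directly after the tag
        if ischar(nextLine)
            keys{end+1} = strtrim(nextLine); %#ok<AGROW>
        end
    end
    tline = fgetl(fid);         % always advance, whether or not a tag was found
end
fclose(fid);
delete(fname);                  % remove the temporary file

disp(keys)
```

For files with millions of lines, growing a cell array inside a loop is likely to be much slower than reading the whole file at once and using a single regexp call.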


Answer 1

Stephen23 on 19 Jul 2016

Direct link to this answer:

https://es.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan#answer_229053

>> str = fileread('temp1.txt');
>> C = regexp(str,'(?<=> <KEY>\s+)\S+','match')
C = 
  'RDHQFKQIGNG'    'JFJNNFNFKFNN'

Tested on this file: temp1.txt
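
Editorial sketch: for readers without the attached file, the same expression can be tried on the sample from the question built in memory. The variable name sampleLines is an assumption for illustration. Note that \S+ stops at the first whitespace character, so this pattern returns the whole line only when the value contains no spaces, as in the sample.

```matlab
% Sample lines from the question, joined with newlines into one char vector.
sampleLines = {'> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
               '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN'};
str = sprintf('%s\n', sampleLines{:});

% (?<=> <KEY>\s+) is a lookbehind: the match must be preceded by "> <KEY>"
% plus at least one whitespace character (here, the newline), but that
% prefix is not part of the returned text.
% \S+ matches the run of non-whitespace characters that follows.
C = regexp(str, '(?<=> <KEY>\s+)\S+', 'match');

disp(C)
```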


Stephen23 on 20 Jul 2016

Try this:

  E = regexp(str,'^> <KEY>\s+\S+','match','lineanchors');
  E = strtrim(strrep(E,'> <KEY>',''));

And have a play with this script: temp1.m
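
Editorial sketch: the two lines above match the tag together with its value, anchored to the start of a line, and then strip the tag. A capture group with the 'tokens' option is a possible alternative (a swapped-in technique, not part of the original comment) that returns the value directly. The names sampleLines and T are assumptions for illustration.

```matlab
% Sample lines from the question, joined with newlines into one char vector.
sampleLines = {'> <NAME>', 'mary', '> <AGE>', '30', '> <KEY>', 'RDHQFKQIGNG', ...
               '> <NAME>', 'john', '> <AGE>', '56', '> <KEY>', 'JFJNNFNFKFNN'};
str = sprintf('%s\n', sampleLines{:});

% Approach from the comment: with 'lineanchors', ^ matches at the start of
% every line, so only tags that begin a line are matched. Each match holds
% the tag, the newline and the value; strrep and strtrim remove the rest.
E = regexp(str, '^> <KEY>\s+\S+', 'match', 'lineanchors');
E = strtrim(strrep(E, '> <KEY>', ''));

% Alternative: capture only the value in parentheses and ask for tokens.
% regexp returns one 1x1 cell per match, so flatten them into a single row.
T = regexp(str, '^> <KEY>\s+(\S+)', 'tokens', 'lineanchors');
T = [T{:}];

disp(E)
disp(isequal(E, T))
```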

Vincent Scalfani on 21 Jul 2016

Amazing!!! PERFECT. It took 1 second to process over 4 million lines of text. Thanks so much for your time.
