How can I extract line numbers of text data?

Question

Paschalis Garouniatis on 31 Jul 2016

https://es.mathworks.com/matlabcentral/answers/297881-how-can-i-extract-line-numbers-of-text-data

Edited: Paschalis Garouniatis on 3 Aug 2016

portion.txt

Hello everyone. I have attached a .txt file (portion.txt) which contains a portion of my data. I need a script that identifies the lines holding pairs of x-y coordinates and returns their line numbers. For instance, in the .txt file the first set of coordinates begins at line 3 and ends at line 138 (the number of pairs is written above each set of coordinates; in this case it is 136), so the script should return those two numbers. This should then be done for the whole file. I suppose the process can be repeated with a loop, since two lines separate each set of coordinates from the previous one. How can this be done? Thanks in advance.

Answer 1

Azzi Abdelmalek on 31 Jul 2016

https://es.mathworks.com/matlabcentral/answers/297881-how-can-i-extract-line-numbers-of-text-data#answer_230379

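str = {};                        % one cell per line of the file
fid = fopen('portion.txt');
l = fgetl(fid);
while ischar(l)                  % fgetl returns -1 at end of file
  str{end+1,1} = l;
  l = fgetl(fid);
end
fclose(fid);
% Keep the lines that contain exactly two numeric tokens (the x-y pairs).
% Note that this returns the matching lines themselves, not their line numbers.
idx = str(cellfun(@numel,regexp(str,'[\d\.]+'))==2)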

Azzi Abdelmalek on 1 Aug 2016

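str = {};                        % one cell per line of the file
fid = fopen('portion.txt');
l = fgetl(fid);
while ischar(l)                  % fgetl returns -1 at end of file
    str{end+1,1} = l;
    l = fgetl(fid);
end
fclose(fid);
f = regexpi(str,'[e\-\+\d\.]+'); % numeric tokens, including signs and exponents
idx = cellfun(@numel,f);         % number of tokens on each line
id = idx==2;                     % true for lines holding an x-y pair
ii1 = strfind([0 id'],[0 1])     % Begin: first line of each run of pairs
ii2 = strfind([id' 0],[1 0])     % End: last line of each run of pairs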
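
The two strfind lines above locate where a run of flagged lines starts and stops. As a minimal sketch of the same idea, the run boundaries can also be found with diff; this is a swapped-in alternative, not the code from the answer. The flag vector id below is made up for illustration and is a row vector (in the answer it is a column, hence the transpose there).

```matlab
% Hypothetical flags: true where a line holds an x-y pair
id = logical([0 0 1 1 1 0 0 1 1]);

% Pad with zeros so runs touching either end of the file are still detected
d = diff([0 id 0]);          % +1 where a run starts, -1 just past its end
first = find(d == 1);        % first line of each run
last  = find(d == -1) - 1;   % last line of each run
```

For these flags the runs cover lines 3 to 5 and 8 to 9.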

Paschalis Garouniatis on 1 Aug 2016

Edited: Paschalis Garouniatis on 2 Aug 2016

Thanks a lot Azzi for your response. It worked just fine.

Answer 2

dpb on 31 Jul 2016

https://es.mathworks.com/matlabcentral/answers/297881-how-can-i-extract-line-numbers-of-text-data#answer_230380

Edited: dpb on 1 Aug 2016

fid=fopen('portion.txt','r');
i=0;              % loop counter
n=[];
while ~feof(fid)  % until we run out of data
  i=i+1;                                                  % increment counter
  n(i)=cell2mat(textscan(fid,'%d %*[^\n]',1,'headerlines',1));
  d(i)=textscan(fid,'%f %f',n(i),'collectoutput',1);         % read the section
  fgetl(fid);  % straighten out file pointer end of record
end
fid=fclose(fid);    % done with file

You'll have a list of the section sizes and a cell array of M sets of n-by-2 coordinates to do with as you wish.

Running it on the file here, having named the m-file portion.m, I get:

>> portion
>> n
n =
 136   162
>> d
d = 
  [136x2 double]    [162x2 double]
>> cumsum([[3 2+n(1:end-1)].' [2+n].'])  % the start/stop positions from the lengths
ans =
   3   138
 141   302
>>
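
The cumsum expression above packs the line arithmetic into one call. A minimal sketch of the same calculation, spelled out step by step, is given here; it assumes, as in portion.txt, that exactly two non-data lines precede every block of pairs, and the variable names gap, stops and starts are only illustrative.

```matlab
n   = [136 162];             % pair counts, as reported in the output above
gap = 2;                     % non-data lines in front of every block (assumed)

stops  = cumsum(n + gap);    % last line of each block
starts = stops - n + 1;      % first line of each block
ranges = [starts.' stops.']  % one [first last] row per block
```

With these counts the rows are [3 138] and [141 302], matching the output shown above.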

dpb on 1 Aug 2016

Edited: dpb on 1 Aug 2016

Always possible, sure....I presumed the only point in knowing which line numbers contained the data was to later use those to read the data. Hence, I just read the data figuring that would be the end result desired. :)

If it really is just the section locations that is wanted/needed, simply save the N values as well and if you really don't want the other data, no need to save d; just don't bother to assign it.(*)

(*) NB: If the data aren't of interest at all but only the position, then you can use N to compute a new 'headerlines' argument for the textscan call that reads the next N, and never actually read the data itself at all. (I can't think of why the position alone would be of interest; oh, I guess one could be using an external editor and doing a macro for line replacement or somesuch. If that's the case, remember that line numbers will change if you begin from the beginning of the file and do anything that modifies the number of lines in a section, but that's getting rather far astray.) Internally this might revert to a for loop of N fgetl calls, similar to some other posters' solutions of counting lines, although it's possible the actual implementation is a search for that many \n characters and an fseek to that point. I'm not sure how much TMW has worked on optimizations inside textscan; it's fairly new, so it is undoubtedly still evolving from one release to the next.

I note that the "aircode" I typed is missing a couple details, the first being that one needs an fgetl to resynch the record location after the read for the section. I updated Answer to insert these mod's as an alternative solution.
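
The position-only idea from the note above can be sketched with plain fgetl calls, which is the fallback the note mentions rather than the 'headerlines' variant. This is a sketch under stated assumptions, not tested against the real portion.txt: it writes a small made-up file in which every block is preceded by one arbitrary line and one line that starts with the pair count, and the names fname, ranges and lineNo are illustrative.

```matlab
% Build a small synthetic file with the assumed layout
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'%s\n','Section A','3   pairs','0.0 1.0','0.5 1.5','1.0 2.0', ...
                   'Section B','2   pairs','2.0 3.0','2.5 3.5');
fclose(fid);

fid = fopen(fname,'r');
ranges = zeros(0,2);                 % one [first last] row per block
lineNo = 0;                          % number of lines consumed so far
while true
  fgetl(fid);                        % line in front of the count line
  tline = fgetl(fid);                % count line
  if ~ischar(tline), break, end      % fgetl returns -1 at end of file
  k = sscanf(tline,'%d',1);          % leading integer = number of pairs
  if isempty(k), break, end
  lineNo = lineNo + 2;
  ranges(end+1,:) = [lineNo+1, lineNo+k];
  for j = 1:k, fgetl(fid); end       % step over the data lines without parsing them
  lineNo = lineNo + k;
end
fclose(fid);
delete(fname);
```

For the synthetic file the blocks sit on lines 3 to 5 and 8 to 9.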

dpb on 2 Aug 2016

No need to create a new file; simply skip the odd header lines before getting to the portion of the file that is regular and go from there:

fid=fopen('portion.txt','r');
for i=1:7, fgetl(fid); end  % skip preliminary stuff
...

From this point everything's the same, except that for the real file you'll need to add 7 to all the line numbers obtained if you're going to use them with respect to that file.

Paschalis Garouniatis on 3 Aug 2016

Edited: Paschalis Garouniatis on 3 Aug 2016

Thank you very much for your help dpb.

Answer 3

Shameer Parmar on 1 Aug 2016

https://es.mathworks.com/matlabcentral/answers/297881-how-can-i-extract-line-numbers-of-text-data#answer_230411

Edited: Shameer Parmar on 1 Aug 2016

Data = textread('portion.txt', '%s', 'delimiter', '');   % intended: one cell per line
LineIndex = {};
count = 1;
for i=1:length(Data)
    % A count line is recognised by a long run of spaces; the text before it is taken as the pair count
    if ~isempty(strfind(Data{i},'           '))
        temp_line = regexp(Data{i},'           ','split');
        LineIndex{count,1} = ['Begin at ',num2str(i+1)];
        LineIndex{count+1,1} = ['End at ',num2str(i + str2num(temp_line{1}))];
        count=count+2;
    end
end

Make sure that your file "portion.txt" is in the current directory.

To check the output, just type "LineIndex".

Output:

 LineIndex = 
    'Begin at 3'
    'End at 138'
    'Begin at 141'
    'End at 302'

Paschalis Garouniatis on 1 Aug 2016

Thanks a lot for your answer Shameer.

Paschalis Garouniatis on 1 Aug 2016

I ran your code, and two of the cells in LineIndex show the 'End at' entry with two numbers instead of one.
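
One possible explanation for an 'End at' entry with two numbers, offered only as a guess since the full file is not shown in the thread: str2num returns a vector whenever the text it receives contains more than one number, and num2str then prints every element. The header text and line number below are hypothetical and serve only to reproduce the symptom.

```matlab
headerText = '1 136';               % hypothetical text in front of the run of spaces
i = 140;                            % hypothetical line number of that header line

v = str2num(headerText);            % 1-by-2 vector [1 136], not a scalar
label = ['End at ', num2str(i + v)] % the label now carries two numbers

% A simple guard that flags such lines for inspection
if ~isscalar(v)
    fprintf('Line %d: expected one count, found %d values\n', i, numel(v));
end
```

If the real file does contain header lines of this kind, looking at the lines reported by the guard should show which of the values is the actual pair count.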
