A problem while splitting a text input with regexp

Samyukta Ramnath on 14 Aug 2013
I have a text file with the input as
sammy yo yo
yoyo with you
Samyukta
and I tried the following code to put each word into an element of an array.
fid = fopen('test4.txt');
table = fscanf(fid,'%c');              % read the whole file as one char array
fclose(fid);                           % release the file handle
table2 = regexp(table,'\n','split');   % one cell per line
This means that table2{1} returns 'sammy yo yo'. I then split every line individually with strsplit, using ' ' (a space) as the delimiter, so that table2{1}{2} returns 'yo'. However, the last word of every line contains more characters than are displayed, i.e. size(table2{1}{2},2) = 3 rather than 2. When I strcmp it with '\n', ' ' or anything else, it returns 0, so now I don't know what to do.
  2 comments
Walter Roberson on 15 Aug 2013
What shows up for the following?
table2{1}{2}(end) + 0
I suspect you will find it is 13 (carriage return).
Samyukta Ramnath on 15 Aug 2013
It gives an error: "Cell contents reference from a non-cell array object."

Accepted Answer

Cedric on 15 Aug 2013
>> fprintf('%d,', table) ; fprintf('\n') ;
115,97,109,109,121,32,121,111,32,121,111,13,10,121,111,121,111,32,119,105,
116,104,32,121,111,117,13,10,83,97,109,121,117,107,116,97,13,10,
As you can see, at the end of each line, there are 13 (carriage return: '\r') and 10 (new line: '\n').
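If you prefer to keep the line-by-line structure from the question, one option is to treat the carriage return as part of the line break. This is only a minimal sketch: the sample text is built inline with sprintf instead of being read from test4.txt, the variable name rows is made up for illustration, and strsplit is assumed to be available (R2013a or later).

```matlab
% Sample text with Windows-style line endings (CR LF), built inline so the
% sketch does not depend on test4.txt being present (assumption).
buffer = sprintf('sammy yo yo\r\nyoyo with you\r\nSamyukta\r\n');

% Split on an optional carriage return followed by a newline, so that no
% stray char(13) stays attached to the last word of a line.
rows = regexp(buffer, '\r?\n', 'split');

% Drop the empty piece produced by the final line break.
rows = rows(~cellfun(@isempty, rows));

% Split every line on spaces; the result is a cell array of cell arrays.
table2 = cellfun(@(s) strsplit(s, ' '), rows, 'UniformOutput', false);

lastWord = table2{1}{3};    % last word of the first line
disp(double(lastWord))      % character codes, to check for a trailing 13
```

With the carriage return consumed by the pattern, the last word of each line should contain only its visible letters.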
If you just want to split words, why not split with REGEXP alone, using a pattern that matches whitespace? For example:
>> buffer = fileread('test4.txt') ;
>> words = regexp(buffer, '\s+', 'split')
words =
'sammy' 'yo' 'yo' 'yoyo' 'with' 'you' 'Samyukta' ''
With this, you would just have to delete the last cell when it is empty (which happens when your file ends with a line break such as '\r\n'), and you would be done.
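Removing the empty cell could look like the following. It is a sketch under the assumption that the buffer is built inline with sprintf to mimic the file contents shown above; it removes every empty cell, not only the last one.

```matlab
% Stand-in for fileread('test4.txt'), built inline (assumption) so the
% example runs without the file.
buffer = sprintf('sammy yo yo\r\nyoyo with you\r\nSamyukta\r\n');

% Split on runs of whitespace; a trailing line break leaves one empty cell.
words = regexp(buffer, '\s+', 'split');

% Delete any empty cells (here, the one at the end).
words(cellfun(@isempty, words)) = [];

disp(words)
```

Filtering on isempty rather than deleting words(end) unconditionally also covers files that do not end with a line break, where there is no empty cell to remove.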
  2 comments
Walter Roberson on 15 Aug 2013
Only if the file was created with an older MS Windows editor. More modern MS Windows editors only put in \n (newline) without \r (carriage return). Linux and OS X have never used \r. (Mac OS before OS X might have used \r.)
Cedric on 15 Aug 2013
Edited: Cedric on 15 Aug 2013
Or the MATLAB editor, actually (I used it to generate this file on R2012b, Win7/64).
The pattern '\s+' works in all cases though.
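A quick way to check this is to run the same split on text that uses each of the three line-ending styles mentioned in this thread. The sketch below builds the sample text inline, which is an assumption (it does not read test4.txt), and the names endings, expected and allMatch are made up for illustration.

```matlab
% Line-ending styles to compare: CR LF, LF only, CR only.
endings  = {sprintf('\r\n'), sprintf('\n'), sprintf('\r')};
expected = {'sammy','yo','yo','yoyo','with','you','Samyukta'};

allMatch = true;
for k = 1:numel(endings)
    eol = endings{k};
    % Same three lines as in the question, joined with the current ending.
    buffer = ['sammy yo yo' eol 'yoyo with you' eol 'Samyukta' eol];
    % Trim the trailing line break first so no empty cell is produced.
    words = regexp(strtrim(buffer), '\s+', 'split');
    allMatch = allMatch && isequal(words, expected);
end

disp(allMatch)   % true only if every variant yields the same word list
```

Because \s matches spaces, tabs, carriage returns and newlines alike, the split does not need to know which convention the file was saved with.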
