textscan won't read dates with spaces

textscan fails when the date elements are separated by spaces, although the same format works with the datetime function. Consider the following code:
textscan('1959 05 21','%{yyyy MM dd}D') % Doesn't work when there are spaces
datetime('now','Format','yyyy MM dd') % same format works with datetime function
textscan('1959-05-21','%{yyyy-MM-dd}D') % Works when the space is replaced with a non-letter character
textscan('1959 05 21','%{yyyy MM dd}D','whitespace','') % Works when whitespace is set to none
textscan('1959 05 21 567','%{yyyy MM dd}D%d','whitespace','') % Doesn't work
How can I make the last line of the code work?
Thanks
PS: See Walter Roberson's and per isakson's comments on the accepted answer.

1 Comment

per isakson on 15 Aug 2017
Edited: per isakson on 17 Aug 2017
Regarding making the last line work: I don't think it's possible. It seems that textscan cannot handle the space playing two roles at once, as part of the date format and as the field separator before the integer.
I assume your goal is not merely to parse this string but to find the limits of textscan.


Accepted Answer

Jeremy Hughes on 16 Aug 2017
Hi Per,
The issue is that textscan treats white space as the field separator by default. Splitting into fields happens first, then datatype conversion. In this case, you're getting
"1959 05 21" -> "1959","05","21"
and textscan then tries to convert each of these into its own datetime. This is a pretty common confusion.
The trick to parsing this correctly is to supply 'Delimiter' to textscan, so that the space no longer separates fields. Try:
textscan('1959 05 21','%{yyyy MM dd}D','Delimiter',',')
Hope this helps, Jeremy
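Because splitting into fields happens before conversion, one way around the date-plus-integer case is to let textscan split on spaces as usual and assemble the date afterwards. The following is only a minimal sketch, assuming the date always arrives as three numeric fields in year, month, day order; the variable names are illustrative:

```matlab
% Read the four space-separated fields as numbers; the default
% white-space handling does the splitting.
str = '1959 05 21 567';
c = textscan(str, '%f %f %f %d');

% Build the datetime from the numeric year, month and day
% (assumption: the fields are always in this order).
dt = datetime(c{1}, c{2}, c{3}, 'Format', 'yyyy MM dd');
n  = c{4};   % trailing integer, returned as int32
```

This sidesteps the %{...}D specifier entirely, so it does not help when the date must be parsed by textscan itself.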

2 Comments

Note that the attempts
textscan('1959 05 21 567','%{yyyy MM dd}D%d','whitespace','')
or
textscan('1959 05 21 567','%{yyyy MM dd}D%d','Delimiter',',')
will not work, and
textscan('1959 05 21 567','%{yyyy MM dd}D%*[ ]%d','whitespace','')
will not work either. In each of those cases, the entire group 1959 05 21 567 gets grabbed and passed to datetime for parsing. textscan parsing is greedy that way, just as it is for numeric fields:
>> textscan('123456', '%d4%d')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123456', '%d%*[4]%d')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123456', '%d%d','delimiter','4')
ans =
1×2 cell array
{[123456]} {0×1 int32}
>> textscan('123e56', '%de%d')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%*[e]%d')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%d','delimiter','e')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
>> textscan('123e56', '%d%d','whitespace','e')
ans =
1×2 cell array
{[2147483647]} {0×1 int32}
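One way to keep a numeric field from consuming every digit is an explicit field width. The sketch below uses sscanf, whose width handling follows C's scanf, to keep the behaviour easy to check; the three-digit split is an assumption chosen purely for illustration. textscan documents a similar field-width option for numeric specifiers.

```matlab
% Each %3d reads at most three digits, so the six digits
% are split into two separate fields.
v = sscanf('123456', '%3d%3d');
first  = v(1);
second = v(2);
```

A fixed width only helps when the field lengths are known in advance, which is the same assumption the %10c approach relies on.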
per isakson on 16 Aug 2017
Edited: per isakson on 22 Aug 2017
Had the task been to read and parse the string, '1959 05 21 567', I would have tried
cac = textscan('1959 05 21 567','%10c%d');
dt = datetime( cac{1},'Format','yyyy MM dd');
Note: this doesn't work with (a varying number of) leading spaces.
>> textscan(' 1959 05 21 567','%12c%d')
ans =
'1959 05 21 5' [67]
The conversion specifier %c lets me decide how many characters to read (leading white space aside), and the two lines with %10c and datetime are surprisingly efficient, since %c is cheap.
However, with 'Whitespace','' it works:
>> textscan(' 1959 05 21 567','%12c%d', 'Whitespace','')
ans =
' 1959 05 21' [567]
>> textscan(' 1959 05 21 567 890','%12c%d%d', 'whitespace','')
ans =
' 1959 05 21' [567] [890]
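If the number of leading spaces varies, a fixed-width %c read becomes fragile. As a hedged alternative that swaps textscan for regexp, the sketch below isolates the date text first and converts it afterwards; the pattern assumes a four-digit year and two-digit month and day, and the variable names are illustrative:

```matlab
str = '   1959 05 21 567';   % any number of leading spaces

% Capture the 10-character date and the integer that follows it.
tok = regexp(str, '^\s*(\d{4} \d{2} \d{2})\s+(\d+)', 'tokens', 'once');

% Convert each captured piece separately.
dt = datetime(tok{1}, 'InputFormat', 'yyyy MM dd', 'Format', 'yyyy MM dd');
n  = str2double(tok{2});
```

This is likely slower than %c on large inputs, so it is a convenience for irregular text rather than a replacement for the approach above.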


Asked: D on 15 Aug 2017
Edited: D on 23 Aug 2017
