Reading a text file with the correct date format

Hi,
I have a text file like the one below. How can I read the 2nd column in MATLAB? I am having difficulty reading it with the correct format for yyyy/mm/dd hh:mm. Any help is appreciated.
Thanks in advance.
filePattern = fullfile(myFolder, '*.txt');
csvFiles = dir(filePattern);
fmt='%d %4d/%2d/%2d %4d %*[^\n]';
for i=1:length(csvFiles)
fid = fopen(fullfile(myFolder,csvFiles(i).name));
c=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1,'delimiter','\t'));
fid=fclose(fid);
end

4 comments

dpb on 31 Dec 2015
Attach a short section of the file so somebody can test, but I'd venture you don't need (or want) the 'delimiter','\t' parameter. Setting it removes the other default whitespace characters as delimiters and also causes repeated delimiters to be treated as separate delimiters, which returns empty values. Just use the default whitespace delimiter until it's shown that something else is mandatory.
If you have a recent release, you can also use the '%D' date format to convert directly.
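To illustrate the delimiter point, here is a minimal sketch on made-up in-memory strings (not the poster's file). With an explicit tab delimiter, two tabs in a row should produce an empty field, which textscan returns as NaN for %f, while the default whitespace handling treats a run of blanks as a single separator.

```matlab
% Made-up sample text, not the poster's data.
tabbed = sprintf('1\t\t3');                            % two consecutive tabs
a = textscan(tabbed, '%f %f %f', 'Delimiter', '\t');   % explicit tab delimiter
% a{2} comes back as NaN: the empty field between the two tabs.

spaced = '1     3';                                    % a run of blanks
b = textscan(spaced, '%f %f');                         % default whitespace delimiter
% b{1} is 1 and b{2} is 3: the blanks act as one separator.
```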
Damith on 31 Dec 2015
Edited: Damith on 31 Dec 2015
I have MATLAB R2014a and R2011b.
I actually realized that I do not need to read the second column; I only need the 1st and 3rd. (As you can see, there are some missing values, but that's fine.) I need to obtain the maximum value of the 3rd column from whatever is available.
I still need to specify the correct format string, though, and neither of the attempts below reads all the rows in each column. Any idea?
I tried:
fmt='%d %4d/%2d/%2d %2d:%2d %4d %*[^\n]';
c=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1,'delimiter','\t'));
[Image: output]
Then, I tried with whitespace,
c=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1,'delimiter',' '));
[Image: output]
dpb on 1 Jan 2016
Again, we can't test your file unless you attach a section of it. As I said earlier, forget setting the 'delimiter' parameter entirely; let it default. Failing that, again, give us the actual data, not a picture of it.
Damith on 2 Jan 2016
I let it default, but it failed again. The results are still the same. See the attached data.


Accepted Answer

dpb on 2 Jan 2016
OK, as I suspected, the file is NOT tab-delimited. It uses fixed-width columns, but the records are not all the same length: each record is terminated right after its last data field. So, the lines are as follows:
'010802 2015/01/01 00:00 AR'
'010802 2015/01/01 00:15 AR'
...
'010802 2015/04/20 00:00 15.20 '
'010802 2015/04/20 00:15 15.20 '
...
where I've enclosed the two record types in single quotes to show what's actually in the file. Because C (and hence MATLAB) formatted input ignores whitespace unless you read counted characters (as in '%Nc'), you can't write a single format string that parses both record types for the whole file in one call.
It is possible, however, to read one record at a time with the same format string using textscan, because it picks up again after the failed conversion of an empty field. I created a very short version of your file, with no preamble other than the header line and a few records of each type, for demonstration purposes:
>> fmt=['%d %4d/%2d/%2d %2d:%2d %f%*[^\n]'];
>> fid=fopen('test.txt');
>> fgetl(fid); % throw away the header line
>> while ~feof(fid) % read record at a time, echo to terminal
textscan(fid,fmt,1,'headerlines',1)
end
ans =
[10802] [2015] [4] [19] [22] [45] [0x1 double]
ans =
[10802] [2015] [4] [19] [23] [0] [0x1 double]
ans =
[10802] [2015] [4] [19] [23] [15] [0x1 double]
...
[10802] [2015] [4] [20] [0] [0] [15.2000]
ans =
[10802] [2015] [4] [20] [0] [15] [15.2000]
ans =
[10802] [2015] [4] [20] [0] [30] [15.2000]
ans =
...
ans =
[10802] [2015] [5] [1] [1] [30] [180.3000]
ans =
[10802] [2015] [11] [1] [12] [15] [35.7300]
ans =
[10802] [2015] [11] [1] [12] [30] [35.7300]
ans =
[10802] [2015] [11] [1] [12] [45] [35.7300]
...
ans =
[10802] [2015] [11] [29] [23] [45] [67.1200]
ans =
[10802] [2015] [11] [30] [0] [0] [0x1 double]
ans =
[10802] [2015] [11] [30] [0] [15] [0x1 double]
ans =
[10802] [2015] [11] [30] [0] [30] [0x1 double]
...
ans =
[10802] [2015] [12] [11] [10] [45] [0x1 double]
ans =
Columns 1 through 6
[0x1 int32] [0x1 int32] [0x1 int32] [0x1 int32] [0x1 int32] [0x1 int32]
Column 7
[0x1 double]
>> fid=fclose(fid);
The last record is the EOF case...
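A related line-at-a-time variant, swapping textscan for sscanf (my substitution, not what the transcript above uses): sscanf stops at the first field it cannot convert, so a record with a missing value returns six numbers instead of seven. The sample records below are made up in the layout quoted above; in a real script each line would come from fgetl(fid) after the header lines are skipped.

```matlab
% Made-up records in the quoted layout; stand-ins for lines read with fgetl.
recs = {'010802 2015/04/19 23:45 AR', ...
        '010802 2015/04/20 00:00 15.20', ...
        '010802 2015/05/01 01:30 180.30'};

vals = nan(numel(recs), 1);            % NaN marks a missing value
for k = 1:numel(recs)
    a = sscanf(recs{k}, '%d %d/%d/%d %d:%d %f');
    if numel(a) == 7                   % all seven fields converted
        vals(k) = a(7);
    end
end
maxVal = max(vals);                    % max skips NaN entries by default
```

Since the goal stated earlier in the thread is the maximum of the value column, keeping NaN for the missing entries lets max do the filtering.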
Alternatively, you can
1. read the file into a cellstring array,
2. convert to character array which will pad the short records,
3. convert the fixed width substring fields in memory.
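The three steps above can be sketched as follows. The column positions (1-6 for the station id, 8-23 for the timestamp, 25 onward for the value) and the sample records are assumptions for illustration only; they are not taken from the real file, whose widths would need to be checked first.

```matlab
% Step 1: a cellstring array of records (made-up data). From a file, something
% like textscan(fid,'%s','Delimiter','\n','Whitespace','') would produce it.
recs = {'010802 2015/04/19 23:45'; ...            % short record, value missing
        '010802 2015/04/20 00:00   15.20'; ...
        '010802 2015/04/20 00:15  180.30'};

% Step 2: char() pads the short records with trailing blanks.
C = char(recs);

% Step 3: convert the fixed-width substrings in memory.
id  = str2double(cellstr(C(:, 1:6)));
sdn = datenum(C(:, 8:23), 'yyyy/mm/dd HH:MM');
val = str2double(cellstr(C(:, 25:end)));          % blank field becomes NaN

maxVal = max(val);                                % max skips NaN by default
```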
Or, an even better choice: avoid all this hassle and create a regular, delimited file format that can be parsed easily.
Here's another case where Fortran FORMAT wins hands down: it would read the empty data field, although it would need a file with fixed record length.

More Answers (1)

per isakson on 2 Jan 2016
Edited: per isakson on 2 Jan 2016
It's a challenge to read files like yours with Matlab.
>> clear
>> [c1,sdn,c3,c4] = cssm( '010802_Q_1997.txt' );
>> whos
  Name        Size         Bytes  Class     Attributes
  c1      20256x1          81024  int32
  c3      20256x1         162048  double
  c4      20256x1        2286052  cell
  sdn     20256x1         162048  double
where
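function [c1,sdn,c3,c4] = cssm( filespec )
    % Read every record as fixed-width character fields; 'Whitespace','' keeps the blanks
    fmt = '%6c%25c%9c%[^\n]';
    fid = fopen( filespec );
    cac = textscan( fid, fmt, 'Headerlines',18, 'Whitespace','' );
    fclose( fid );
    % Column 1: 6-character integer field
    c1 = textscan( cac{1}', '%6d' );
    c1 = c1{:};
    % Column 2: timestamp converted to a serial date number
    sdn = datenum( cac{2}, 'yyyy/mm/dd HH:MM' );
    % Column 3: write 'nan' into all-blank value fields, then convert to double
    str = permute( cac{3}, [2,1] );
    ise = arrayfun( @(ix) all(isspace(str(:,ix))), (1:length(str)) );
    str( 7:9, ise ) = repmat( permute( 'nan', [2,1] ), 1, sum(ise) );
    c3 = textscan( str, '%9f' );
    c3 = c3{:};
    % Column 4: the rest of each line, trimmed
    c4 = strtrim(cac{4});
end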

1 comment

Apologies for the late reply. This seems to serve the purpose. Can you briefly explain what these 3 lines do? Thanks a lot.
str = permute( cac{3}, [2,1] );
ise = arrayfun( @(ix) all(isspace(str(:,ix))), (1:length(str)) );
str( 7:9, ise ) = repmat( permute( 'nan', [2,1] ), 1, sum(ise) );
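For readers with the same question, here is a small sketch of what those three lines appear to do, run on a made-up stand-in for cac{3} (three 9-character value fields, the middle one blank). The first line transposes the array so each record becomes a column, the second flags the columns that contain only whitespace, and the third overwrites the last three characters of those columns with 'nan' so that the numeric conversion yields NaN there. Two details are my own adjustments for a tiny example: size(str,2) instead of length(str), and str(:)' to hand textscan a plain row vector.

```matlab
% Made-up stand-in for cac{3}: one 9-character value field per row.
fld = ['    15.20'; ...
       '         '; ...                 % blank field, i.e. a missing value
       '   180.30'];

str = permute(fld, [2,1]);              % 9-by-N: one record per column
ise = arrayfun(@(ix) all(isspace(str(:,ix))), 1:size(str,2));  % all-blank columns
str(7:9, ise) = repmat(permute('nan', [2,1]), 1, sum(ise));    % write 'nan' there

% Column-major order puts the records back to back in one row vector.
c3 = textscan(str(:)', '%9f');
c3 = c3{:};
```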


Asked: 31 Dec 2015

Commented: 5 Jan 2016
