How to import some large data please

Hi all, I have a file called DJ.csv which has 5 columns: 1) dates (01/02/2007), 2) times (30.42.0), 3) prices (12553, 12442), 4) codes (DJ123) and 5) trade size.
I want to read columns 3 and 5 (price and trade size) into MATLAB. I am having some trouble as the CSV is quite big.
I tried this:
fileID = fopen('K:\test\test\DJ.csv');
A = fread(fileID,'double');
fclose(fileID);
But it only gives me a vector of values which are not the same as my data. Any help would be very much appreciated.
Thanks.

1 comment

Mate 2u on 24 Dec 2013
As a note, importdata works, but it is not suitable for very large files.


Accepted Answer

dpb on 25 Dec 2013
Edited: dpb on 26 Dec 2013
fread is for unformatted (binary) stream files; you have a formatted, delimited text file:
doc textscan % and friends
If you really only want/need the two columns, then something like this (air code, untested)
[p,s]=textread('K:\test\test\DJ.csv','%*s%*s,%f%*f%f','delimiter',',');
ought to do, unless the third column really does use a comma as the decimal point in a file that is also comma-delimited. In that case you've got a problem: you'll have to read three values instead of just two, preprocess the file, or otherwise handle the decimal separator, because MATLAB can't (and you can't expect it to) know the difference between comma delimiters and decimal commas.
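
To illustrate the "read three values instead of two" option, here is a minimal sketch. It assumes every row really does contain exactly one decimal comma in the price, so each row splits into six fields rather than five; the sample rows are made up for the illustration and are not from the actual file.

```matlab
% Hypothetical rows: the price 12553,12442 uses a comma as the decimal mark,
% so each row has six comma-separated fields instead of five.
txt = sprintf('%s\n', ...
    '01/02/2007,30.42.0,12553,12442,DJ123,7', ...
    '01/02/2007,30.43.0,12554,5,DJ123,2');

% Skip date and time, read the whole part as a number and the fractional
% digits as text (so leading zeros survive), skip the code, read the size.
C = textscan(txt, '%*s%*s%f%s%*s%f', 'Delimiter', ',');

wholePart = C{1};
fracPart  = str2double(strcat('0.', C{2}));  % '12442' -> 0.12442
price     = wholePart + fracPart;
tradesize = C{3};
```

If only some rows carry a decimal comma, the field count varies from row to row and this approach no longer applies; preprocessing the file is then the safer route.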

7 comments

Mate 2u on 2 Jan 2014
Edited: Mate 2u on 2 Jan 2014
Hi, unfortunately this does not work.
My data is in this form:
01/02/2007 21:58.0 12541 DJH07 1
01/02/2007 22:50.0 12541 DJH07 1
01/02/2007 30:42.0 12545 DJH07 1
01/02/2007 11:31.0 12553 DJH07 2
01/02/2007 51:48.0 12554 DJH07 2
01/02/2007 13:30.0 12554 DJH07 1
01/02/2007 16:14.0 12554 DJH07 3
Could somebody help me please?
fid = fopen('K:\test\test\DJ.csv', 'r');
datacell = textscan(fid,'%*s%*s,%f%*s%f','delimiter',',');
fclose(fid);
prices = datacell{1};
tradesize = datacell{2};
Mate 2u on 2 Jan 2014
Hi there,
Still not working.
When I run the code above, I get fid = 3 and datacell as a [1x2 cell] whose cells are empty, so prices and tradesize are empty too.
Note that the data above was pasted from Excel. Thanks for all your help.
dpb on 2 Jan 2014
The question is what the actual file looks like. Is there perhaps a header row ahead of the data, so that you also need 'headerlines',1 as an argument pair in the textscan call?
At least fid=3 indicates that the file was opened successfully.
Remember when you're testing to always either
frewind(fid)
or
fid= fclose(fid);
and then reopen between attempts or you'll leave the file pointer somewhere besides the beginning which will be bound to cause confusion at best.
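
A minimal sketch of why the rewind matters, using a temporary file with two of the rows quoted in this thread as a stand-in for the real DJ.csv. The format string is an assumption for the demo: a plain comma-delimited format with no literal commas in it.

```matlab
% Build a small stand-in file (the temporary path is hypothetical).
tmpfile = [tempname '.csv'];
fid = fopen(tmpfile, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12540,DJH07,1', ...
    '01/02/2007,00:21:58.000,12541,DJH07,1');
fclose(fid);

fmt = '%*s%*s%f%*s%f';
fid = fopen(tmpfile, 'r');

first  = textscan(fid, fmt, 'Delimiter', ',');  % reads to end of file
second = textscan(fid, fmt, 'Delimiter', ',');  % pointer is at EOF: nothing left to read

frewind(fid);                                   % move the pointer back to the start
third  = textscan(fid, fmt, 'Delimiter', ',');  % reads the data again

fclose(fid);
delete(tmpfile);
```

Here second{1} comes back empty while first{1} and third{1} both hold the two prices, which is the kind of confusion a forgotten rewind causes.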
Mate 2u on 2 Jan 2014
Edited: Walter Roberson on 3 Jan 2014
Hi there. There is no header row, just data. The data as opened in Notepad is the following:
01/02/2007,00:15:00.000,12540,DJH07,1
01/02/2007,00:21:58.000,12541,DJH07,1
01/02/2007,00:22:50.000,12541,DJH07,1
01/02/2007,00:30:42.000,12545,DJH07,1
01/02/2007,01:11:31.000,12553,DJH07,2
01/02/2007,01:51:48.000,12554,DJH07,2
01/02/2007,02:13:30.000,12554,DJH07,1
01/02/2007,02:16:14.000,12554,DJH07,3
01/02/2007,02:21:40.000,12554,DJH07,1
01/02/2007,02:26:48.000,12558,DJH07,1
01/02/2007,02:50:44.000,12555,DJH07,1
01/02/2007,03:14:57.000,12557,DJH07,1
01/02/2007,03:22:41.000,12559,DJH07,1
Each data entry is on a separate line in the Notepad file. Thanks so much for your help.
datacell = textscan(fid,'%*s%*s%f%*s%f','delimiter',',');
The previous version had a stray comma in the format.
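
For reference, a self-contained sketch of the whole read with the corrected format. It writes three of the rows quoted above to a temporary file, which stands in for K:\test\test\DJ.csv; the temporary path and the error check are additions for the illustration.

```matlab
% Write a few sample rows from this thread to a temporary stand-in file.
tmpfile = [tempname '.csv'];
fid = fopen(tmpfile, 'w');
fprintf(fid, '%s\n', ...
    '01/02/2007,00:15:00.000,12540,DJH07,1', ...
    '01/02/2007,01:11:31.000,12553,DJH07,2', ...
    '01/02/2007,02:16:14.000,12554,DJH07,3');
fclose(fid);

fid = fopen(tmpfile, 'r');
if fid == -1
    error('Could not open %s', tmpfile);
end

% %*s skips a field, %f reads a number: skip date and time, read price,
% skip the code, read trade size. The delimiter is given separately.
datacell = textscan(fid, '%*s%*s%f%*s%f', 'Delimiter', ',');
fclose(fid);
delete(tmpfile);

prices    = datacell{1};
tradesize = datacell{2};
```

Because the skipped fields are never stored, only the two numeric columns are held in memory, which helps when the CSV is large.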
Mate 2u on 5 Jan 2014
Thank you, worked well.



Asked: 24 Dec 2013

Commented: 5 Jan 2014
