use xlread to read csv file with mixed type data is very slow

5 visualizaciones (últimos 30 días)
roudan
roudan el 27 de Ag. de 2020
Editada: roudan el 29 de Ag. de 2020
Hi
I have a csv file with mixed data type like below, first column is string, 2nd column is numerical, 3rd column tothe end columns is date string. Total row number is 25000.
I used xlread() shown below to read the two csv files. it tooks me 10 minutes to read two csv file. The first csv file has only 10 columns while 2nd file has 400 columns. I am wondering if it is possible to speed it up using different approaches? Thanks
[num_steamdate,txt_steamdate,raw_steamdate] =xlsread(filename_WellDate);
[num_fracdate,txt_fracdate,raw_fracdate] =xlsread(filename_WellFracDate);

Respuestas (1)

Cris LaPierre
Cris LaPierre el 27 de Ag. de 2020
I would suggest using readtable instead. It creates a table, which supports mutiple data types. If you need help, consider sharing your data files. I'm not feeling motivated enough to transcribe your screenshot for testing.
  16 comentarios
Cris LaPierre
Cris LaPierre el 29 de Ag. de 2020
Ok, it appears you cannot use implicit expansion on datetime arrays. You must make sure both arrays have the same number of rows AND columns. You need to repeat the column of wellfracdate_array to match the width of wellsteamdate_array (391 columns). You can do this with repmat.
filter_bfFrac_c1=and(wellsteamdate_array >= repmat(wellfracdate_array(:,1),[1,size(wellsteamdate_array,2)]),...
wellsteamdate_array < repmat(wellfracdate_array(:,2),[1,size(wellsteamdate_array,2)]));
filter_bfFrac_c2=and(wellsteamdate_array >= repmat(wellfracdate_array(:,3),[1,size(wellsteamdate_array,2)]),...
wellsteamdate_array < repmat(wellfracdate_array(:,4),[1,size(wellsteamdate_array,2)]));
filter_bfFrac_c3=and(wellsteamdate_array >= repmat(wellfracdate_array(:,5),[1,size(wellsteamdate_array,2)]),...
wellsteamdate_array < repmat(wellfracdate_array(:,6),[1,size(wellsteamdate_array,2)]));
filter_bfFrac_c4=and(wellsteamdate_array >= repmat(wellfracdate_array(:,7),[1,size(wellsteamdate_array,2)]),...
wellsteamdate_array < repmat(wellfracdate_array(:,8),[1,size(wellsteamdate_array,2)]));
filter_bfFrac_c5=and(wellsteamdate_array >= repmat(wellfracdate_array(:,9),[1,size(wellsteamdate_array,2)]),...
wellsteamdate_array < repmat(wellfracdate_array(:,10),[1,size(wellsteamdate_array,2)]));
filter_onFrac_c1=(wellsteamdate_array == repmat(wellfracdate_array(:,2),[1,size(wellsteamdate_array,2)]));
filter_onFrac_c2=(wellsteamdate_array == repmat(wellfracdate_array(:,4),[1,size(wellsteamdate_array,2)]));
filter_onFrac_c3=(wellsteamdate_array == repmat(wellfracdate_array(:,6),[1,size(wellsteamdate_array,2)]));
filter_onFrac_c4=(wellsteamdate_array == repmat(wellfracdate_array(:,8),[1,size(wellsteamdate_array,2)]));
filter_onFrac_c5=(wellsteamdate_array == repmat(wellfracdate_array(:,10),[1,size(wellsteamdate_array,2)]));
roudan
roudan el 29 de Ag. de 2020
Editada: roudan el 29 de Ag. de 2020
Wow, Cris, Yes it works finally. Thank you so much, I really appreciate it!! i l got very goof learning on readtable(). thank you!
But going back to use xlsread(), the speed in my computer is the same, but using xlsread() actually leads to cleaner code. i have the other file with same number of row and column but with numberic data, reading it using readcsv is much faster.
Here is my code, you can give it a try for timing check.
[num_steamdate,txt_steamdate,raw_steamdate] =xlsread(filename_WellDate);
[num_fracdate,txt_fracdate,raw_fracdate] =xlsread(filename_WellFracDate);
wellsteamdate_str=raw_steamdate(:,4:end);
wellsteamdate=datenum(wellsteamdate_str(:),'mm/dd/yyyy'); % convert it to 1D column since datenum is expecting column or row not array
wellsteamdate_array=reshape(wellsteamdate,size(wellsteamdate_str)); % convert 1D back to original shape
wellfracdate_str=raw_fracdate(:,4:end);
wellfracdate=datenum(wellfracdate_str(:),'mm/dd/yyyy'); % convert it to 1D column since datenum is expecting column or row not array
wellfracdate_array=reshape(wellfracdate,size(wellfracdate_str));
filter_bfFrac_c1=and(wellsteamdate_array>=wellfracdate_array(:,1), wellsteamdate_array<wellfracdate_array(:,2));
filter_bfFrac_c2=and(wellsteamdate_array>=wellfracdate_array(:,3), wellsteamdate_array<wellfracdate_array(:,4));
filter_bfFrac_c3=and(wellsteamdate_array>=wellfracdate_array(:,5), wellsteamdate_array<wellfracdate_array(:,6));
filter_bfFrac_c4=and(wellsteamdate_array>=wellfracdate_array(:,7), wellsteamdate_array<wellfracdate_array(:,8));
filter_bfFrac_c5=and(wellsteamdate_array>=wellfracdate_array(:,9), wellsteamdate_array<wellfracdate_array(:,10));
filter_onFrac_c1=(wellsteamdate_array==wellfracdate_array(:,2));
filter_onFrac_c2=(wellsteamdate_array==wellfracdate_array(:,4));
filter_onFrac_c3=(wellsteamdate_array==wellfracdate_array(:,6));
filter_onFrac_c4=(wellsteamdate_array==wellfracdate_array(:,8));
filter_onFrac_c5=(wellsteamdate_array==wellfracdate_array(:,10));

Iniciar sesión para comentar.

Categorías

Más información sobre Time Series Objects en Help Center y File Exchange.

Etiquetas

Productos


Versión

R2017a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by