Performance issues with readtable vs. xlsread

14 visualizaciones (últimos 30 días)
Mike
Mike el 2 de Dic. de 2020
Editada: Mike el 3 de Dic. de 2020
I am working in R2020a and have converted the Excel file read in one of my MATLAB scripts from xlsread to readtable, since xlsread is now deprecated.
However, I am having performance issues with readtable compared to xlsread.
The Excel file is a 15MB .xlsb file with six sheets, and I am using named ranges to locate the data on the approrpriate sheet.
tic; [perforations, perforations_header] = xlsread(xls_file,sheet,'PERFORATIONS'); toc
Elapsed time is 4.354259 seconds.
tic; T=readtable(xls_file,'Sheet',sheet,'Range','PERFORATIONS'); toc
Elapsed time is 15.777517 seconds.
perforations = table2array(T); perforations_header=T.Properties.VariableNames; toc
Elapsed time is 15.803216 seconds.
So, two questions:
(1) Any suggestions on how to improve performance for readtable?
(2) Despite xlsread being deprecated, it clearly performs better than readtable. Is there any downside with staying with xlsread?
My thanks!
Mike
P.s. Attached is a cut down version of the Excel .xlsb file. It contains two named ranges: 'PERFORATIONS' and 'CELLS'. The full file was too large to attach.
  2 comentarios
Walter Roberson
Walter Roberson el 3 de Dic. de 2020
Consider experimenting with readtable() with the 'UseExcel', true flag.
Mike
Mike el 3 de Dic. de 2020
Very interesting!
Rough figures for the 'PERFORATIONS' range:
xlsread => 4 seconds
readtable => 16 seconds
readtable('UseExcel',true) => 14 seconds
So, some improvement but still hasn't reached the performance of xlsread.
I'm starting with a very small example despite the use of an .xlsb file. Most of my "in anger" files will be x100 larger...
Thanks for the suggestion!
Mike

Iniciar sesión para comentar.

Respuesta aceptada

Star Strider
Star Strider el 2 de Dic. de 2020
If you need to use xlsread, use it as long as you have it available.
Since you obviously want the header information as well, one option could be to use readtable once to read everything in once (likely using 'VariableNamingRule','preserve' in your call to it), perhaps extracting the variable names separately as <tablename>.Properties.VariableNames to a cell, then using save to put all the information you want in a .mat file. Then load the .mat file as necessary to retrieve the necessary information into your workspace. Either save the headers and data separately as different variables, or save it as a table.
I have no idea how efficient this would be, since I have no idea what your Excel file contains, or how large it is. It is simply an option to consider.
  2 comentarios
Mike
Mike el 2 de Dic. de 2020
Editada: Mike el 2 de Dic. de 2020
Thank you for the rapid reply. I have some control over the workflow for the project I'm working on, so I'll follow your lead and look into replacing the Excel files with a .mat save/load instead.
I don't have much experience with MATHWORKS/MATLAB. What does it mean for a function to be deprecated? Is there a management of change process to review possible transition issues before a function is removed?
Mike
Star Strider
Star Strider el 3 de Dic. de 2020
My pleasure!
My idea is to read them in once with readtable, save them to .mat files, then load them as necessary when you need them later.
If a function is deprecated, it will not be supported (and will likely not appear) beginning in some future release, although code using it may still work. (It’s complicated!)

Iniciar sesión para comentar.

Más respuestas (0)

Etiquetas

Productos


Versión

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by