Should table Indexing be Faster?

Paul on 15 Sep 2024
Commented: Paul on 18 Sep 2024
Example code.
t = combinations(0:10,0:10,0:10,0:10);
tic
for ii = 1:10
    for jj = 1:height(t)
        u = t{jj,:};   % brace indexing directly into the table
    end
end
toc
Elapsed time is 18.111169 seconds.
tic
tcell = table2cell(t);   % convert once, outside the loops
for ii = 1:10
    for jj = 1:height(tcell)
        u = [tcell{jj,:}];
    end
end
toc
Elapsed time is 0.179943 seconds.
tic
tarr = table2array(t);   % convert once, outside the loops
for ii = 1:10
    for jj = 1:height(tarr)
        u = tarr(jj,:);
    end
end
toc
Elapsed time is 0.017899 seconds.
Any ideas why indexing into a table to extract data is so slow?
  1 Comment
Walter Roberson on 15 Sep 2024
By the way: we held a discussion of table indexing across rows, about two-ish years ago. I know that I contributed, and I seem to remember that Steven Lord contributed. The thread revived briefly a couple of months ago. Unfortunately I do not seem to be able to locate it at the moment.


Accepted Answer

Matt J on 16 Sep 2024
Edited: Matt J on 16 Sep 2024
I haven't profiled it, but I would bet that the following line, from @tabular/braceReference
b = t.extractData(varIndices);
is bottlenecking the row extraction operations. Effectively, this runs table2array(t) on the entire table t every time a braceReferencing operation is done.
That should probably be done differently: as a result, the time for even the smallest row extraction depends strongly on the size of the table, as the example below shows:
T = combinations(1:100,1:100,1:40,1:40);
t=T(1,:);
timeit(@() t{1,:})
ans = 1.4134e-04
timeit(@() T{1,:})
ans = 0.2923
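To see how the cost of one brace-indexed row grows with table height, a sketch along the following lines could be used. The sizes in n and the variable names are arbitrary choices for illustration, and absolute timings will vary by machine and release.

```matlab
% Sketch: time T{1,:} for tables of increasing height.
% The sizes in n are arbitrary, chosen only for illustration.
n = [10 20 40];
tBrace = zeros(size(n));
for k = 1:numel(n)
    T = combinations(1:n(k), 1:n(k), 1:n(k));   % n(k)^3 rows, 3 variables
    tBrace(k) = timeit(@() T{1,:});             % extract a single row with braces
end
% Show the number of rows next to the measured time per extraction
disp(table(n(:).^3, tBrace(:), 'VariableNames', {'Rows','BraceSeconds'}))
```

If the time per extraction rises with the number of rows, that is consistent with the whole table being processed on every brace reference.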
  6 Comments
Matt J on 18 Sep 2024
Edited: Matt J on 18 Sep 2024
From Tech Support:
Thank you for identifying a performance slowdown when extracting rows from a table. My colleagues are aware of the issue and are working on a fix. In the meantime, a workaround is to use parenthesis subscripting.
For example, currently you are extracting rows using the following syntax. Instead, extract rows using parentheses.
T = combinations(1:100,1:100,1:40,1:40);
t = T(1,:);
When I compared the elapsed time for both these syntaxes, the suggested workaround of parentheses was significantly faster than the original.
%Original example
tic, for i = 1:100, t1 = t{1,:}; end, toc
Elapsed time is 0.005140 seconds.
tic, for i = 1:100, t1 = T{1,:}; end; toc
Elapsed time is 4.458811 seconds.
% Workaround
tic, for i = 1:100, t1 = t(1,:); t1 = t1.Variables; end, toc
Elapsed time is 0.006460 seconds.
tic, for i = 1:100, t1 = T(1,:); t1 = t1.Variables; end; toc
Elapsed time is 0.005319 seconds.
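As a rough sketch, the same workaround could be applied to the row loop from the original question. The variable name row is a placeholder, and no timing is quoted here because it has not been measured for this version.

```matlab
% Sketch: the question's row loop, using parenthesis subscripting
% followed by .Variables instead of brace indexing.
t = combinations(0:10,0:10,0:10,0:10);
tic
for jj = 1:height(t)
    row = t(jj,:);        % 1-row table (parenthesis subscripting)
    u = row.Variables;    % numeric row vector, same values as t{jj,:}
end
toc
% Sanity check: the last extracted row matches brace indexing
assert(isequal(u, t{end,:}))
```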
Paul on 18 Sep 2024
Just want to point out that creating the temporary variable (t1 in this case) isn't necessary if the result is to be used directly, for example as an input to a function:
T = combinations(1:100,1:100,1:40,1:40);
T(1,:).Variables
ans = 1×4
1 1 1 1
Also, perhaps MathWorks should consider a near-term update to use this exact workaround in rowfun when it's called with SeparateInputs=false.
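For illustration only, here is a sketch of what a manual loop using the workaround might look like next to the equivalent rowfun call. It makes no claim about how rowfun is implemented internally; the small table, the use of sum, and the names viaRowfun and viaLoop are assumptions chosen for the example.

```matlab
% Sketch: rowfun with SeparateInputs=false vs. a manual loop that uses
% parenthesis subscripting followed by .Variables.
T = combinations(1:3, 1:3);   % small example table, 9 rows

% rowfun passes each row to sum as a single concatenated vector
viaRowfun = rowfun(@sum, T, 'SeparateInputs', false, 'OutputFormat', 'uniform');

% Manual equivalent using the workaround
viaLoop = zeros(height(T), 1);
for r = 1:height(T)
    viaLoop(r) = sum(T(r,:).Variables);
end

assert(isequal(viaRowfun, viaLoop))
```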


Release

R2024b
