Implicit expansion with arrayfun (CPU vs GPU)
I find it very convenient that MATLAB has supported implicit expansion since R2016b (for an explanation, see this nice article: https://blogs.mathworks.com/loren/2016/10/24/matlab-arithmetic-expands-in-r2016b/?s_tid=blogs_rc_1).
I was then puzzled to discover that arrayfun on the CPU does not allow it, while arrayfun called with gpuArrays does. Below is an MWE demonstrating this behavior.
Let me explain briefly: if I have two vectors, x of size [2,1] and y of size [1,2], I can compute the sum x+y and get a 2×2 matrix, as intended. This is better than the uglier and more memory-intensive
repmat(x,1,2)+repmat(y,2,1)
Unfortunately, this does not work with arrayfun on the CPU!
Since I write code using both normal arrays and gpuArrays, I find this inconsistent behavior of arrayfun quite misleading. It would be great if MATLAB could allow implicit expansion for arrayfun on the CPU as well. When I have large arrays, duplicating dimensions with repmat takes a lot of memory.
%% Demonstration of implicit expansion support in MATLAB and arrayfun
% This script shows that:
% 1) MATLAB supports implicit expansion for standard array operations.
% 2) arrayfun on the GPU supports implicit expansion.
% 3) arrayfun on the CPU does NOT support implicit expansion!!
%
% Implicit expansion allows a 2×1 vector to be added to a 1×2 vector,
% producing a 2×2 matrix.
clear; clc; close all;
% Define test vectors
x = [1; 2]; % Column vector (2×1)
y = [1, 2]; % Row vector (1×2)
%% Implicit expansion using standard MATLAB operations
F1 = myadd(x, y);
%% Implicit expansion using arrayfun on the GPU
F2 = arrayfun(@myadd, gpuArray(x), gpuArray(y));
%% Attempt implicit expansion using arrayfun on the CPU (expected to fail)
try
    F3 = arrayfun(@myadd, x, y);
catch ME
    fprintf('CPU arrayfun error (expected):\n%s\n\n', ME.message);
end
%% Function myadd
function F = myadd(x, y)
    % Element-wise addition
    F = x + y;
end
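For comparison, implicit expansion does already work on the CPU outside arrayfun, via the arithmetic operators (R2016b+) or bsxfun. A minimal sketch, reusing the vectors defined above:

```matlab
% Implicit expansion on the CPU without arrayfun (sketch).
x = [1; 2];                 % 2x1 column vector
y = [1, 2];                 % 1x2 row vector
F1 = x + y;                 % implicit expansion (R2016b+): [2 3; 3 4]
F2 = bsxfun(@plus, x, y);   % pre-R2016b equivalent
assert(isequal(F1, F2) && isequal(F1, [2 3; 3 4]))
```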
10 comments
Catalytic
on 4 Jan 2026
It would be great if Matlab could allow implicit expansion also on arrayfun cpu. When I have large arrays, duplicating dimensions with repmat takes a lot of memory.
It's hard to imagine why you would ever be using arrayfun on the CPU when you have large arrays. It would be super-slow.
I can understand that sometimes you want to run CPU and GPU versions of the same code side by side to demonstrate speed-up, but in the case of arrayfun on large arrays, it's so obvious in advance that the CPU version will be vastly slower. How informative could such a comparison ever be?
It is not difficult to imagine wanting to arrayfun() between (for example) a 10000 x 1 array, and a 1 x 15000 array, and hoping that it is not necessary to explicitly form 10000 x 15000 input arrays to make it work.
10000*8 + 15000*8   % bytes for the two input vectors
ans / 10^9          % ≈ 0.0002 GB
10000*15000*8*2     % bytes for the two expanded input arrays
ans / 10^9          % 2.4 GB
Big difference.
It is not difficult to imagine caring about that on the GPU, but even attempting a computation of that size on the CPU, knowing that it will run with the speed of a for-loop over ~10^9 elements, is difficult to imagine.
Also, the extra memory consumption in your example isn't really all that dramatic: only twice what would be consumed by an implicit-expansion implementation. You've basically saved a 1 GB temp array. What's 1 GB by today's standards of RAM?
Paul
on 4 Jan 2026
Assume that the underlying function cannot be executed on the GPU.
In this case, are you suggesting that if the arrays are large but all have the same size that arrayfun should not be used?
What if the arrays are not large, but are compatible? As it stands now, arrayfun cannot be used even though it might be a very good approach if it supported implicit expansion.
If no vectorization or parallelization approaches are available, then I think the benefits of arrayfun over a plain old for-loop are pretty narrow in the scenarios you describe -- basically just syntactic sugar. I also speculate that they are pretty rare use cases. With such rare and marginal benefits, I can see why TMW would question whether it warrants developer time.
Frankly, I've never thought about using arrayfun with implicit expansion and so don't really know how useful it would or wouldn't be. But I can definitely envision use cases where it might come handy and save lines of code. Whether or not eliminating such lines of code is worth it is in the eye of the beholder.
Consider a situation where I have discrete time filter coefficients stored in two cell arrays: b for the numerator, and a for the denominator. Sometimes I want unique pairs of numerators and denominators (numel(b) == numel(a)), sometimes I want to analyze multiple sets of denominators with one numerator, and sometimes I want to analyze multiple numerators with one denominator, as shown below.
Define three sets of numerator coefficients and one set of denominator coefficients
rng(100);
w = linspace(0,pi,1000);
b = mat2cell(rand(3,2),[1,1,1]);
a = {[1 2]};
myfreqz = @(b,a) freqz(b{1},a{1},w);
Call @Matt J's function, as defined in this Answer, to compute all three responses with implicit expansion of the denominator. I changed two lines of code in Matt's function to return a cell array rather than a double, as Matt's function does not (yet?) support the UniformOutput option.
h = arrayfunImplExp(myfreqz,b,a);
Verify
isequal(vertcat(h{:}),[freqz(b{1},a{1},w);freqz(b{2},a{1},w);freqz(b{3},a{1},w)])
To be sure, I could have written a loop with a couple of if statements to check isscalar on b or a to know how to call freqz inside the loop, but I rather like the one-liner.
function out = arrayfunImplExp(fun, varargin)
%ARRAYFUNIMPLEXP arrayfun with implicit expansion (CPU)
%
% NOTE:
%   Useful for CPU-only execution. On the GPU, use arrayfun instead,
%   which implements its own implicit expansion.

% Number of inputs and maximum dimensionality
numArgs = numel(varargin);
numDims = max(cellfun(@ndims, varargin));

% Collect per-input sizes (row-wise)
sizes = cellfun(@(c) size(c, 1:numDims), varargin, 'UniformOutput', false);
sizes = vertcat(sizes{:}); % [numArgs x numDims]

% Output size is max along each dimension
outSize = max(sizes, [], 1);

% Precompute row-wise strides for linear indexing
strides = [ones(numArgs, 1), cumprod(sizes(:, 1:end-1), 2)];

% Convert sizes to zero-based limits
sizes = sizes - 1;

% Allocate output and per-input caches
% out = nan(outSize); % original uniform-output version
out = cell(outSize);
argsIdx = nan(numArgs, 1);
args = cell(1, numArgs);

% Main loop over output elements
for linIdx = 1:numel(out)
    % Convert linear index to zero-based subscripts
    idx = linIdx - 1;
    subs = zeros(1, numDims);
    for d = 1:numDims
        sd = outSize(d);
        subs(d) = mod(idx, sd);
        idx = floor(idx / sd);
    end

    % Apply implicit expansion masking (clamp singleton dimensions)
    Subs = min(subs, sizes);

    % Row-wise sub2ind
    argsIdxNew = sum(Subs .* strides, 2) + 1;

    % Re-gather only the arguments whose index changed
    map = (argsIdxNew ~= argsIdx);
    for j = 1:numArgs
        if map(j)
            args{j} = varargin{j}(argsIdxNew(j));
        end
    end
    argsIdx = argsIdxNew;

    % Evaluate function
    out{linIdx} = fun(args{:});
end
end
That's fair, but the use case you describe does not demand high speed or RAM efficiency. If syntactic sugar with implicit expansion is all you want, then it is trivial to make a one-liner function of your own (as @Matt J has done with his proposals).
Conversely, the OP has called out the built-in CPU arrayfun for not offering RAM-efficient scalar expansion. My real point was that, if your data is so big that RAM consumption is a concern, then arrayfun on the CPU never had a hope of being high-performing to begin with. It's the difference between a computation that takes 100 years and one that takes 300 years.
I changed two lines of code in Matt's function to return a cell array rather than a double as Matt's function does not (yet?) support the UniformOutput option.
I was never a big fan of the UniformOutput flag. If the user wants cell-valued output, then I think the iterated function should assume responsibility for that. In your case, that would look like:
rng(100);
w = linspace(0,pi,1000);
b = num2cell( rand(3,2) , 2);
a = {[1 2]};
myfreqz = @(b,a) {freqz(b{1},a{1},w)};
h = arrayfunImplExp(myfreqz,b,a)
Joss Knight
on 5 Jan 2026
Moved by Matt J on 5 Jan 2026
arrayfun with expansion, particularly for expanding scalars, is certainly very convenient syntactic sugar for a for-loop, making code more compact and readable: for instance, setting a number of properties on an object array, where the for-loop is not going to be optimized, so there is no advantage to it.
layers = arrayfun(@setHyperParams, layers, 0, [layers.L2Factor]); % Freeze learn rate
It's just easier to read, isn't it? Obviously there are other ways to do this particular operation on one line but I certainly see your point. However, the others have good points; arrayfun is almost always slower than a for loop or alternative approach, so taking action to encourage its use is something to do with caution.
Accepted Answer
More Answers (3)
F3 = cellfun(@myadd, {x}, {y}, 'UniformOutput', false)
will work.
arrayfun tries to apply "myadd" to [first element of x, first element of y] and [second element of x, second element of y]. Thus x and y must be the same size, and, even if it worked, the result would be [2, 4] or [2; 4].
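A quick sketch of the element-by-element pairing described here: when the inputs are the same size, CPU arrayfun works as expected.

```matlab
% Sketch: CPU arrayfun pairs inputs element-by-element, so sizes must match.
x = [1; 2];
y = [3; 4];                          % same size as x
F = arrayfun(@(a, b) a + b, x, y);   % elementwise: [4; 6]
assert(isequal(F, [4; 6]))
```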
I don't understand why it works for gpuArray and gives the result you want. Maybe input arrays of different size are automatically interpreted here as two separate single objects to which "myadd" is to be applied - as it is done if you use cell arrays together with "cellfun".
4 comments
Paul
on 3 Jan 2026
The gpuArray version of gpuArray.arrayfun states: "The sizes of A1,...,An must match or be compatible." which I gather means the inputs A1 .. An are implicitly expanded.
So I guess the question is why the gpu version offers that expansion and the cpu version does not. I wonder if that could lead to problems for people who develop on the cpu version and then transition later to the gpu version, or vice versa.
Alessandro
on 3 Jan 2026
If your post was meant as a complaint rather than a question, consider making a feature request:
I agree it is confusing, but the gpuArray version of arrayfun was never intended as a direct analogue of the CPU version. Additionally, there are all kinds of other instances where the gpuArray support for a function differs in capabilities from its CPU counterpart. Usually, though, the GPU version is less flexible, not more so as appears to be the case with arrayfun.
A more appropriate comparison of implicit expansion between CPU vs GPU (for binary functions) would probably be to use bsxfun instead:
% Define test vectors
x = rand(10000,1); % Column vector
y = rand(1,10000); % Row vector
xg = gpuArray(x);
yg = gpuArray(y);
myadd = @(a,b) a + b;
timeit( @() bsxfun(myadd, x, y) ) %CPU
ans =
0.3355
gputimeit( @() bsxfun(myadd, xg, yg) ) %GPU
ans =
0.0078
Joss Knight
on 5 Jan 2026
Moved by Matt J on 5 Jan 2026
0 votes
Well, there are a couple of answers to that. Firstly and probably most importantly, GPU arrayfun is just an extension of gpuArray's existing compiler for element-wise operations, all of which support dimension expansion; it's also a natural strided memory access pattern to support for a GPU kernel, where each input has its own stride, and there is native support for this kind of memory access in GPU hardware. Secondly, it makes design sense and only isn't implemented for CPU for the reasons given. The historical explanation is that the idea of broadcast operations didn't even exist back when arrayfun was first created for the CPU, but it did exist by the time GPU arrayfun came along.
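A minimal CPU-side sketch of the strided-access idea described above, assuming the simple 2×1 / 1×2 case from the question: a singleton dimension is read with stride 0 (emulated here by clamping the subscript), so every output element along that dimension reuses the same input element. This is illustrative only, not how the actual GPU kernel is written.

```matlab
% Emulate stride-0 reads along singleton dimensions (sketch).
x = [1; 2];  y = [10, 20];
out = zeros(2, 2);
for j = 1:2          % output columns
    for i = 1:2      % output rows
        xi = x(min(i, size(x,1)), min(j, size(x,2)));  % clamped (stride-0) read
        yi = y(min(i, size(y,1)), min(j, size(y,2)));
        out(i, j) = xi + yi;
    end
end
assert(isequal(out, [11 21; 12 22]))
```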
2 comments
Secondly, it makes design sense and only isn't implemented for CPU for the reasons given.
And are those reasons still an argument for not pursuing it now, or is it just because of historical legacy?
Joss Knight
on 6 Jan 2026
The reasons given by various people. Should you really enhance something you want to discourage people from using?