Map a function over specific dimensions of matrix

Consider an N dimensional matrix M. Consider a function f that eats k-dimensional matrices.
I want to build a function F such that the following code :
B=F(M,@f,k);
Returns a matrix B having N-k dimensions. The coefficient B(i_1,...,i_{N-k}) is equal to f(squeeze(M(i_1,...,i_{N-k},:))
To do so I created the following function :
function B = mapping(M, f, k)
% MAPPING map a function f that must have a k dimension matrix as input to
% the last k dimensions of a matrix M
% B = mapping(M,@f,k) returns a matrix M of N-k dimensions where N is the
% dimension of M such that B(i1,..,i_{N-k})=f(M(i1...i_{N-k},:))
size_M=size(M);
% We start by creating an intermediate matrix A that will be required in
% the function arrayfun. This matrix has the same size as the returned B
% matrix but verifies A(k)=k where k is the 1 dimensional indice
if(numel(size_M)-k>1)
% general case where A (and B) will have at least 2 dimensions
A=[1:prod(size_M(1:end-k))];
A=reshape(A, size_M(1:end-k));
elseif(numel(size_M)-k==1)
A=[1:size_M(1)];
else
error("The function to map maps all the dimensions of the matrix or more.");
end
function fct = F(Ai)
% We first take the subscript associated to the number Ai
b_dim=numel(size_M(1:end-k));
subscripts=cell(b_dim,1);
% We take the indices corresponding to Ai
[subscripts{:}]=ind2sub(size_M(1:end-k),Ai);
subscripts=subscripts.';
% We then apply the function f to the good element of matrix M
fct=f(squeeze(M(subscripts{:},:)));
end
B=arrayfun(@F, A);
end
My question are :
Above all : does it actually already exist a matlab function doing the purpose I want ? Or am I forced to create it ?
It it doesn't already exist such a function I would like to know if there are tricks to speed up what I want, because here it is too slow for my purpose.
If you want to make a try :
M=rand(8,10,2)
% The matrix B will have 2 dimensions (B is a 8x10 matrix) and B(i,j)=norm(squeeze(M(i,j,:)))
B=mapping(M,@norm,1);

2 comentarios

Guillaume
Guillaume el 1 de Nov. de 2018
I'm not sure I completely understand what you're trying to do. It does look like it's more complicated than it should be.
One thing that is not consistent with your explanation and code: you say that f works on matrices with k dimensions, yet squeeze(M(i_1,...,i_{N-k},:) implemented in your code as squeeze(M(subscripts{:},:)| is always going to be a column vector.
M(subscripts{:},:) is going to be 1 x 1 x 1 ... x prod(size_M(end-k+1:end), that is a vector in the (N-k)th dimension where dimension N-k+1 to N are all concatenated in dimension N-k.
It may be a minor issue since you get the result you want, but the input to f is rarely going to have k dimensions.
StarBucK
StarBucK el 1 de Nov. de 2018
Actually you are right i just figured out by posting the message that it can only work with a function working with 1D matrix (vectors).
So two questions actually, how optimize the code (so for f working with vectors) and how can i modify to let f working on more than 1d matrices ? I need to replace M(subscripts{:},:) by another expression but i dont know how to make it.

Iniciar sesión para comentar.

 Respuesta aceptada

Guillaume
Guillaume el 2 de Nov. de 2018
You would have to modify your function F like this to achieve what I think you want:
function fct = F(Ai)
% We first take the subscript associated to the number Ai
b_dim=numel(size_M(1:end-k));
subscripts=cell(1, b_dim);
% We take the indices corresponding to Ai
[subscripts{:}]=ind2sub(size_M(1:end-k),Ai);
% replicate individual indices to the size of the k-dimensional submatrices
size_k = size_M(end-k+1:end);
subscripts = cellfun(@(idx) repmat(idx, size_k), subscripts, 'UniformOutput', false);
% emulate colon for the remaining dimension by explicitly listing all indices
k_subscripts = arrayfun(@(sz) 1:sz, size_k, 'UniformOutput', false);
[k_subscripts{:}] = ndgrid(k_subscripts{:});
% convert subscripts to linear indices and index
M_k = M(sub2ind(size(M), subscripts{:}, k_subscripts{:}));
% We then apply the function f to the good element of matrix M
fct=f(M_k);
end
That is you have to translate your subscripts back to linear indices and explicitly list all the indices for the remaining dimensions instead of an unknown number of colons.
However, yes the whole thing is way overly complicated. You only need to do:
function B = mapping(M, f, k)
B = cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k+1:ndims(M)));
end
Alternatively, you could permute the dimensions which would avoid the squeeze in the cellfun (and hence the cost of the anonymous function) but require a squeeze at the end. As permute is expensive, you may not gain anything:
function B = mapping(M, f, k)
B = squeeze(cellfun(f, num2cell(permute(M, [k+1:ndims(M), 1:k]), 1:k)));
end

4 comentarios

StarBucK
StarBucK el 2 de Nov. de 2018
Editada: StarBucK el 2 de Nov. de 2018
Thank you a lot. Actually I think the exact function following your idea would be this one :
function B = mapping(M, f, k)
% MAPPING applies a function f along the dimensions k of a matrix M, where
% k is a vector indicating on which dimensions f should be applied on.
% f is a function taking a matrix of numel(k) dimensions and outputs a
% number.
% For example : if M is a 2x3x4 array, k=2, f=norm, the matrix B will be
% a 2x4 matrix containing the norm of the k=2 dimension vectors :
% B(i,k)=norm(squeeze(M(i,:,k)))
B=squeeze(cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k)));
end
(I slightly changed the definition of the parameter k) And this seems to work fine, at least 2x faster than the code i wrote before ! However it only works for function that eats a k-dimensional matrix to return a number, is the function takes a k-dimensional matrix to return a p-dimensional one it would be i guess more complicated to write but I tortured myself enough for now on :)
Thank you a lot i would never have found your method by myself (and the code is in addition way more simple now).
Guillaume
Guillaume el 2 de Nov. de 2018
Editada: Guillaume el 2 de Nov. de 2018
I'm not sure how you'd concatenate the p-dimensional matrices into B. However, it's easy enough to let the cellfun cope with non-scalar return values from f:
cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k)), 'UniformOutput', false)
This will create a cell array instead of a matrix.
By the way, I realised after writing my answer that I overcomplicated the modification to your overcomplicated code. You could have simply used:
fct=f(reshape(M(subscripts{:},:), size_M(end-k+1:end)));
as the last line of your F function. But anyway, that's moot. num2cell eliminates all that complexity.
edit: Oh! and I completely forgot to say. If you want to speed the code up, the best thing to do would be to ensure that your k-dimensions to keep together are always the first k dimensions of the matrix, not the last k ones. This way the elements of each submatrix are always contiguous in memory. Anything else, and matlab has to skip around fetching the elements and that's going to cost you.
StarBucK
StarBucK el 2 de Nov. de 2018
Editada: StarBucK el 2 de Nov. de 2018
For the first part of your message : so actually the only thing I have to decide is to take a convention, the job is already done. Like if my function f returns me a matrix of p dimension the only thing I have to consider is how to organize my new dimensions right ?
For your edit : I am not very aware of those considerations. Imagine I have a matrix A, the distance between A(k) and A(p) is something like Abs(k-p) I assume. But if I do A(k)+A(p), matlab doesn't know if they are either far or close right ? (it would be the same time of execution in both case) So just to understand, why is it important that the data are close in memory in this example ?
First part: yes. The simplest would be to concatenate the p-dimensioned matrices in the p+1 dimension:
B = cellfun(@(M_k) f(squeeze(M_k)), num2cell(M, k)), 'UniformOutput', false);
B = cat(ndims(B{1})+1, b{:});
2nd part: In theory it will have an impact on num2cell although I haven't investigated how much. Let's take the example
>>A = cat(3, [1 3 5; 2 4 6], [7 9 11; 8 10 12])
A(:,:,1) =
1 3 5
2 4 6
A(:,:,2) =
7 9 11
8 10 12
It is stored in memory as
[1 2 3 4 5 6 7 8 9 10 11 12]
If you issue num2cell(A, 1) matlab just has to split the matrix along contiguous blocks of memory:
{[1;2], [3;4], [5;6], etc..}
If you issue |num2cell(A, [1 2]), same:
{[[1;2] [3;4] [5;6]], [[7;8] [9;10] [11;12]]}
Copying continuous blocks of memory is fast, particularly with modern processors design.
However, if you use num2cell(A, 2) matlab has to skip memory blocks to fetch the elements of each matrix:
{[1 3 5], [2 4 6], [7 9 11], [8 10 12]}
More importantly, if the submatrices to pass to f are always in the first k dimensions, you don't even need the num2cell. A simple loop would work:
function B = mapping(M, f, k)
%k: integer < ndims(B) specifying which first k dimensions to pass to f
size_M = size(M);
size_k = size_M(1:k);
B = zeros(size_M(k+1:end));
stride = prod(size_k);
startidx = 1;
for idx = 1:numel(B);
B(idx) = f(reshape(M(startidx:startidx+stride-1), size_k));
startidx = startidx + stride;
end
end
Again, you're iterating over continuous blocks of memory (of length stride so it will be fast.

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Data Type Identification en Centro de ayuda y File Exchange.

Productos

Preguntada:

el 1 de Nov. de 2018

Comentada:

el 2 de Nov. de 2018

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by