Finding the most orthogonal set of n vectors from dataset of unit vectors

11 visualizaciones (últimos 30 días)
I am attempting to find the set of determine a subset of vectors in a matrix that are the most dissimilar. Each row in this matrix is a unit vector which is unique.
idx = nchoosek(1:size(dataset, 1), 5); % Produce all possible combinations of indices
score = zeros(size(idx, 1), 1); % Store similarity score of unit vectors into the corresponding row of score
for I = 1 : size(idx, 1)
score(I) = sum(prod(dataset(idx(I, :), :), 1)); % Element-wise multiply this combination's vectors and add
end
[best_scores, best_idx] = mink(score, 10)
  1 comentario
David Goodmanson
David Goodmanson el 8 de Ag. de 2020
Hi Alex
cosider
a = [ 1 1 1 1 1 1 1 1 -2
2 2 2 2 2 2 2 2 -4
3 3 3 3 3 3 3 3 -6]
Since these three vectors are scalar multiples of each other, they are parallel and as mutually nonorthogonal as you can get. But sum(prod (a)) = 0. So that test is not goiing to work.
[the use of 1 in prod(a,1) is superfluous since 1 is the default]

Iniciar sesión para comentar.

Respuesta aceptada

Bruno Luong
Bruno Luong el 8 de Ag. de 2020
Editada: Bruno Luong el 8 de Ag. de 2020
I propose this
% random test A, which represents 100 normalized vector in R^5
n = 5;
m = 100;
A = randn(n,m);
A = A ./ sqrt(sum(A.^2,1));
% Want to select k columns of A, most mutually "orthogonal"
k = 3; % requirement: k <= rank(A) <= n
n = size(A,2);
c = nchoosek(1:n,k);
d = arrayfun(@(r) detsub(A(:,c(r,:))), 1:size(c,1));
[~,imax] = max(d);
% selection of best k columns of A
col = c(imax,:);
B = A(:,col)
% check scalar products, it must have small off-diagonal terms
B'*B
function d = detsub(A)
[~,R] = qr(A,0);
d =prod(abs(diag(R)));
end
  8 comentarios
Bruno Luong
Bruno Luong el 25 de Ag. de 2020
Yes
Let me explain to you more. When you do qr on A where each column of A is unit vector:
[Q,R] = qr(A)
you have the following properties
  • span {Q(:,1:i)} contains span {A(:,1:i)} for all i.
  • the coefficient R(i,j) is dot(Q(:,i),A(:,j)) for i <= j.
  • Since norm(A(:,j))=1 and R is triangular we have norm(R(1:j,j)) = 1.
Now doing some math. For i < j.
abs(dot(A(:,i),A(:j)) = norm Projection A(:,j) on span{A(:,i)}
<= norm Projection A(:,j) on span{Q(:,1:i)}
= norm(R(1:i;j))
= sqrt(R(1,j)^2 + R(2,j)^2 + ... R(i,j)^2)
You want abs(dot(A(:,i),A(:j)) small (most orthogonal) meanning that you want
R(1,j)^2 + R(2,j)^2 + ... R(i,j)^2
to be small, for all i < j.
Since from the third property
R(1,j)^2 + R(2,j)^2 + ... R(j,j)^2 = 1
you just want the last term R(j,j)^2 is as large as possible (therefore the partial sum is small).
And that you want for all j.
I chose det(A) = prod(R(j,j)) as metric, since it has some geometrical interpretation as the volume of the parallelepipied formed by columns of A.
Argually you could use other metric such as "Trace"(A) = sum (abs (R(j,j)) as metric.
My own intuition tell me the det choice is somewhat better.
Z
Z el 25 de Ag. de 2020
That is a valuable explanation. Thank you so much for your input!

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Spline Postprocessing en Help Center y File Exchange.

Etiquetas

Productos


Versión

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by