Row & Column Wise Normalisation

Question

1 vote

Objective: Normalise a matrix such that all rows and columns sum to 1.

The code below normalises each column, then each row, and repeats until the row and column totals equal one another.

This seems to work for randomly generated arrays.

However, the data I wish to use it on has some zeros, and that generates lots of NaNs and Infs, which makes things quite messy. Sometimes the while loop does not execute at all (no error message, it just skips over it).

I've tried rounding both sides of the while condition to 3 decimal places (because that's good enough), but still no success.

a = rand(7)
rows = sum(a,2) % original row totals
cols = sum(a,1) % original col totals
b = a;
i = 1; % for counting how many iterations
while sum(b,1,"omitnan") ~= sum(b,2,"omitnan")' %when column totals == row totals, stop.
    b = b ./ sum(b,1,"omitnan"); %divide by col totals
    b = b ./ sum(b,2,"omitnan"); %divide by row totals
    i = i + 1;
end
i %how many loops
b % normalised output
brows = sum(b,2,"omitnan") %check that all rows sum 1
bcols = sum(b,1,"omitnan") %check that all cols sum 1.

Attached are two 7 x 7 matrices. These are the desired input for a.

Suggestions welcome.

edit:

The margfit function (lines 345-376) in the link below is (I think) what I am trying to implement. My Python is non-existent.

https://github.com/GoricaB/Land-cover-validation/blob/master/pts_lcval.py

9 comments

John D'Errico on 13 Feb 2020

Edited: John D'Errico on 13 Feb 2020


Anyway, assume the matrices shown are indicative of what we should expect, thus entirely non-negative. Any zero rows or columns can be extracted, and then returned to the array later on, which leaves us with a possibly rectangular array that has no fully zero rows or columns.

In that context, what can we say about the solution? That is, consider an array A0 of size NxM. Do there exist vectors L and R, of length N and M respectively, such that

A = diag(L)*A0*diag(R)

where the matrix A has all unit row and column sums?

First, if a solution does exist, can it be unique? NO. If any such solution with vectors L and R exists, then L*k and R/k are an equally valid solution for any non-zero scalar k. We might decide to require that norm(L) == norm(R), or some similar condition, to remove that scaling ambiguity.
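The scaling argument above can be checked numerically. This is only a sketch: the matrix A0, the vectors L and R, and the value of k are arbitrary illustrative choices, not data from this thread.

```matlab
A0 = rand(4,3);                  % arbitrary example matrix (N = 4, M = 3)
L  = rand(4,1);                  % arbitrary left (row) scaling
R  = rand(3,1);                  % arbitrary right (column) scaling
k  = 2.5;                        % any non-zero scalar

A1 = diag(L)   * A0 * diag(R);   % scaling with (L, R)
A2 = diag(L*k) * A0 * diag(R/k); % scaling with (L*k, R/k)

maxDiff = max(abs(A1(:) - A2(:)))  % zero up to rounding error
```

Because the factor k cancels entry by entry, the two products agree whatever L and R are, so the pair (L, R) can only ever be determined up to this scalar.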

Personally, I always like to play around and get my hands dirty before I think more seriously about a problem.

A0 = rand(7);      % keep the original for comparison
A = A0;
for i = 1:100
  A = A./sum(A,1); % normalise columns (requires R2016b or later)
  A = A./sum(A,2); % normalise rows (requires R2016b or later)
end
[sum(A0,1);sum(A,1)]
ans =
        3.828       2.2711       3.9473       4.6529       2.5008       3.7227         3.23
            1            1            1            1            1            1            1
[sum(A0,2),sum(A,2)]
ans =
       3.4035            1
       3.0916            1
       4.5867            1
       3.0196            1
       4.0064            1
       1.8467            1
       4.1985            1

As we see, a simple iterative scheme works sufficiently well. Better code would of course test for convergence, remove and later replace all zero rows or columns, etc., but you get the drift. Randomly interspersed zeros are not a problem, as long as no row or column is identically zero. We cannot have a row or column with zero sum, however.
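A minimal sketch of those refinements follows. The test matrix, the tolerance tol, and the cap maxIter are assumptions made for illustration; a strictly positive block is used so that the iteration is known to converge.

```matlab
% Illustrative input: a positive 5x5 matrix with row 3 and column 3 zeroed out.
A0 = rand(5);
A0(3,:) = 0;
A0(:,3) = 0;

keepRows = any(A0 ~= 0, 2);      % rows that are not identically zero
keepCols = any(A0 ~= 0, 1);      % columns that are not identically zero
A = A0(keepRows, keepCols);      % iterate on the reduced block only

tol = 1e-10;                     % assumed convergence tolerance
maxIter = 1000;                  % assumed iteration cap
for iter = 1:maxIter
    A = A ./ sum(A,1);           % normalise columns (R2016b or later)
    A = A ./ sum(A,2);           % normalise rows (R2016b or later)
    % Rows were normalised last, so only the column sums need testing.
    if max(abs(sum(A,1) - 1)) < tol
        break
    end
end

B = zeros(size(A0));             % put the zero row and column back
B(keepRows, keepCols) = A;
```

Reducing the test to a scalar with max(abs(...)) is deliberate: an if or while condition that evaluates to a vector is treated as true only when every element is non-zero, so an element-wise ~= comparison of two vectors can end (or skip) a loop as soon as a single pair of totals happens to match.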

But despite my success in the simple example above, it still raises the question: does a solution always exist? (Probably, but a proof would need to be slightly more rigorous than my assertion. Some time is now necessary...)

John D'Errico on 13 Feb 2020

Thanks to Matt for providing the (now obvious) counterexample.

edward holt on 13 Feb 2020

Matt, thanks for the Sinkhorn-Knopp information (a fair chunk of that is beyond my skill set).

And John, thank you for making me realise something that now seems glaringly obvious. Removing the columns / rows that are entirely comprised of zeros is certainly the first step.

Furthermore, a solution doesn't seem possible in the data I attached, as there were a few instances of a column containing only one non-zero element, with the corresponding row containing multiple non-zero elements.

Thank you for your efforts.


Answer 1

Matt J on 13 Feb 2020

Edited: Matt J on 13 Feb 2020

2 votes

Sinkhorn-Knopp.pdf

For a non-negative square matrix, the attached article gives a necessary and sufficient condition (p. 3, Theorem 1) both for the normalization you are trying to achieve to be possible and for the alternating row/column normalization approach (the Sinkhorn-Knopp algorithm) to work. The required condition is the same for both. So basically, if you are seeing Infs and NaNs in your iterations, the normalization is known to be impossible from the get-go.

The condition is:

"A necessary and sufficient condition ... is that A has total support"

The matrix A having total support means that for every non-zero element A(i,j)>0, there exists a column permutation Ap of A such that Ap has only strictly positive elements on its diagonal, one of which is A(i,j).
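For small matrices, the definition above can be checked by brute force. The following is only a sketch under stated assumptions: the 2x2 matrix A is a hypothetical example, the variable names are illustrative, and enumerating all n! permutations with perms is practical only for small n (5040 permutations for a 7 x 7 matrix).

```matlab
% Hypothetical test matrix: column 1 has a single non-zero entry, A(1,1),
% while row 1 has two non-zero entries.
A = [1 1; 0 1];

n = size(A,1);
P = perms(1:n);                          % each row of P is one column permutation
covered = false(n);                      % covered(i,j): A(i,j) lies on a positive diagonal
for k = 1:size(P,1)
    idx = sub2ind([n n], 1:n, P(k,:));   % linear indices of A(i,p(i)) for this permutation
    if all(A(idx) > 0)
        covered(idx) = true;             % every entry on this diagonal is covered
    end
end

% Total support: at least one positive diagonal exists, and every
% positive entry of A lies on one of them.
hasTotalSupport = any(covered(:)) && all(covered(A > 0));
```

For this A, only the identity permutation yields a strictly positive diagonal, so A(1,2) is never covered and hasTotalSupport comes out false. For larger matrices a matching-based test would be needed instead of enumeration.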
