Improving the consistency of the NNMF function

Thomas on 17 Dec 2014
Commented: xiaoqian chang on 15 Sep 2020
I'm attempting to use non-negative matrix factorization on a matrix containing spectral information (A). Whenever I run the nnmf function, the output matrices W and H usually differ from those of previous runs. I have found that this is stated in the help documentation for the nnmf function:
"Because the root-mean-squared residual D may have local minima, repeated factorizations may yield different W and H."
However, as a result of this, I find it difficult to use this method to say anything scientifically meaningful, as it introduces considerable bias on my part (I can effectively run the function repeatedly until I arrive at a result that fits my narrative).
My question: how can I get the nnmf function to return W and H matrices with higher reproducibility thereby improving my confidence in the method? I've tried tweaking the input options by decreasing the tolerances, increasing the number of replicates in the initial run, and increasing the number of iterations, all with little effect.
My code is currently very similar to what is written in the help documentation and looks like this:
numcom = 2; % The rank. My datasets can typically be described by very low-rank approximations
% Stage 1: a few multiplicative-update iterations from 10 random starts to get starting values
opt = statset('MaxIter', 10, 'Display', 'final');
[W0,H0] = nnmf(A, numcom, 'replicates', 10, 'options', opt, 'algorithm', 'mult');
% Stage 2: refine the best starting values with alternating least squares
opt = statset('MaxIter', 1000, 'Display', 'final');
[W,H] = nnmf(A, numcom, 'w0', W0, 'h0', H0, 'options', opt, 'algorithm', 'als');
Of course, I can set the random number generator to default before running the function every time:
rng('default')
But that kind of defeats the purpose ;)
  1 Comment
xiaoqian chang on 15 Sep 2020
Is your problem solved now? I have the same problem. Thank you!


Answers (1)

Jakub on 19 Aug 2019
In my experience, using only the 'als' algorithm with many replicates usually gives me a better estimate. So something like this:
% 'als' is the default algorithm of nnmf, so it is not set explicitly here
opt = statset('MaxIter',100,'Display','final','UseParallel',true);
[coeff,score] = nnmf(A, numcom,'replicates',1e6,'options',opt);
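One way to check whether two runs really agree, rather than judging W and H by eye, is to compare the components directly. The following is only a minimal sketch under stated assumptions: A is random stand-in data, the seeds and replicate count are arbitrary, and the row-wise division relies on implicit expansion (R2016b or later). It rescales the rows of H from two independent runs and matches components by cosine similarity, so that a different ordering or scaling of the components is not mistaken for a different solution.

```matlab
% Hypothetical stand-in data; replace with your own nonnegative matrix A.
rng(0);
A = rand(40, 2) * rand(2, 25);
numcom = 2;
opt = statset('MaxIter', 100);

% Two independent factorizations started from different random seeds.
rng(1); [~, H1] = nnmf(A, numcom, 'replicates', 20, 'options', opt);
rng(2); [~, H2] = nnmf(A, numcom, 'replicates', 20, 'options', opt);

% Rescale each row of H to unit length so that scaling does not matter.
H1n = H1 ./ sqrt(sum(H1.^2, 2));
H2n = H2 ./ sqrt(sum(H2.^2, 2));

% S(i,j) is the cosine similarity between component i of run 1 and
% component j of run 2; values near 1 suggest the same component.
S = H1n * H2n';
[bestSim, match] = max(S, [], 2);   % best run-2 match for each run-1 component

% Columns: run-1 component, matched run-2 component, similarity.
disp([(1:numcom)', match, bestSim]);
```

If the similarities stay close to 1 across many pairs of runs, the factorization is reproducible in practice even though the raw W and H may not be identical.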
  1 Comment
Guy Reading on 12 Nov 2019
Agreed, with enough replicates hopefully the space will be adequately explored and the global minimum of the residual will be found each time and returned repeatably. How many is enough? Depends on how large your input space (m) is...
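A rough, empirical way to judge "how many is enough" is to look at how much the final residual D still varies between independent trials as the number of replicates grows. This is a sketch only, with assumptions: A is random stand-in data, and the replicate counts, seeds, and number of trials are illustrative values, not recommendations.

```matlab
% Hypothetical stand-in data; replace with your own nonnegative matrix A.
rng(0);
A = rand(50, 2) * rand(2, 30);
numcom = 2;
opt = statset('MaxIter', 100);

repCounts = [1 10 100];   % replicate counts to compare (illustrative)
numTrials = 5;            % independent trials per replicate count

% D(t,j) holds the final residual of trial t using repCounts(j) replicates.
D = zeros(numTrials, numel(repCounts));
for j = 1:numel(repCounts)
    for t = 1:numTrials
        rng(t);           % a different seed for each trial
        [~, ~, D(t, j)] = nnmf(A, numcom, ...
            'replicates', repCounts(j), 'options', opt);
    end
end

% Spread of the residual across trials for each replicate count.
spread = max(D, [], 1) - min(D, [], 1);
disp(table(repCounts(:), spread(:), ...
    'VariableNames', {'Replicates', 'ResidualSpread'}));
```

When the spread stops shrinking, adding more replicates is probably not buying much. A small spread in D does not by itself guarantee that W and H are the same across trials, so it is worth comparing the factors as well.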

