Inconsistent matrix multiplication results
I noticed that some code I was running returned different results between runs. I seed all random number generators with a constant, so that shouldn't be the issue. I finally tracked the problem (or at least one instance of it) down to a single matrix operation of the form y1/(y'*y), where y1 is a scalar and y is a column vector. For two seemingly identical values of y, y'*y produced answers that differed in the last decimal place. If I reassigned y (i.e. d = y), then d'*d gave the same answer as y'*y. But if I printed out y with full precision (20 decimal places) and assigned that value to d, then d'*d was different from y'*y, even though y == d was all 1.
If I saved the workspace and loaded it, y'*y was different (not equal to the old y'*y), but y itself was still the same.
I am using R2011b, and this issue is quite annoying, because these discrepancies add up and I get inconsistent results between runs.
Has anyone seen this before, and do you have any advice? Could this be a bug in R2011b?
EDIT: The y and d arrays are bitwise identical (verified with a hex printout), but y'*y and d'*d differ in the last bit; saving and loading the workspace makes y'*y identical to d'*d, and the error appears to be nondeterministic. In successive runs the last-bit error may or may not occur, and it appears somehow "tied" to the array y - i.e. if it occurs, successive calls to y'*y produce the same (incorrect) answer, but loading the workspace with a bitwise identical y produces the correct answer.
I suspect that MATLAB keeps some internal flags that describe the order in which the sum in y'*y is computed (resulting in different truncation error), and that these flags are somehow associated with the array. Reloading the workspace must reset them. However, for whatever reason, this is not consistent between successive runs of my function.
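The order dependence suspected above is real in IEEE 754 arithmetic: summing the same doubles in a different order can change the last bit. A minimal sketch in Python for illustration (MATLAB doubles behave the same way); the chunked sum mimics how a multi-threaded reduction might partition the work:

```python
# Sum ten copies of 0.1 strictly left-to-right, then in two
# chunks of five (as a two-thread reduction might partition it).
vals = [0.1] * 10

seq = 0.0
for v in vals:
    seq += v                                # sequential accumulation

chunked = sum(vals[:5]) + sum(vals[5:])     # each chunk happens to be exactly 0.5

print(seq)      # 0.9999999999999999
print(chunked)  # 1.0
```

Both results are correctly rounded for their respective summation orders; neither is "wrong", which is why the discrepancy only shows up in the last bit.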
EDIT 2: I can confirm that this does not happen in R2010b or R2009b. So this appears to be a new bug introduced in R2011b.
1 comment
Walter Roberson
on 2 Feb 2012
http://matlab.wikia.com/wiki/FAQ#Why_is_0.3_-_0.2_-_0.1_.28or_similar.29_not_equal_to_zero.3F
Answers (2)
John D'Errico
on 2 Feb 2012
Sigh. No. This is NOT a bug in R2011b. This is NOT a bug. It is merely a consequence of floating point arithmetic that you are not recognizing. And the fact is, even if you print out 20 decimal digits of y, you have not captured the true exact value of y!
For example, consider this number:
N = 1.23456789012346;
The decimal representation of that number looks simple. Yet MATLAB converts it to a binary form, because all double-precision numbers are stored that way internally. The value that MATLAB actually uses is:
hpf(N)
ans =
1.2345678901234560242983206990174949169158935546875
As you can see, they differ.
Welcome to the world of floating point arithmetic.
9 comments
Sergey
on 2 Feb 2012
Walter Roberson
on 2 Feb 2012
Different data types?
Sergey
on 2 Feb 2012
Walter Roberson
on 2 Feb 2012
Could you show us a minimal example, representing the values in hex?
Sergey
on 2 Feb 2012
James Tursa
on 2 Feb 2012
I am unaware of any flag attached to variables internally that would behave this way, and I am pretty familiar with the internal representation of a MATLAB numeric variable. If the two arrays are indeed hex identical down to the last bit, then I would suspect that the order of operations is different between the two calculations. The MATLAB JIT accelerator can sometimes do this for two seemingly identical expressions ... e.g. one as part of an assignment and the other as part of an if test could get different optimizations applied to them. Or if one is part of a function (JIT processing gets applied) and the other is typed in from the command line (JIT processing is not done). Or perhaps something is changing the floating point mode in the background. Or maybe something is going on in the BLAS library with regards to reduction variables in a multi-threaded operation. Hard to say unless we can reproduce the behavior. You might be interested in my MTIMESX submission on the FEX to do your dot products ... it can sometimes be faster than the MATLAB calculation if you have a good compiler, and because of loop unrolling it is usually more accurate.
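The effect of such a reordering is easy to demonstrate: floating point addition is not associative, so two evaluations of the same expression that merely group the additions differently can disagree in the last bit. Shown here in Python (doubles are IEEE 754, the same as MATLAB's):

```python
a, b, c = 0.1, 0.2, 0.3

left  = (a + b) + c     # left-to-right, as a naive loop would sum
right = a + (b + c)     # a different grouping, as a reordered reduction might

print(left)             # 0.6000000000000001
print(right)            # 0.6
print(left == right)    # False
```

Either grouping is a legitimate evaluation of a + b + c, which is why an optimizer that reassociates a reduction is not "wrong" even when the bits change.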
James Tursa
on 2 Feb 2012
Does all(y(:)==d(:)) evaluate to true?
Sergey
on 2 Feb 2012
James Tursa
on 2 Feb 2012
Maybe it is not doing y'*y at all ... the optimization may recognize that y hasn't changed and simply report the result of an earlier evaluation of the product (one that summed in a different order). Try tic;y'*y;toc vs tic;d'*d;toc to see if there is a difference in timing - a much smaller time for the first one would suggest a cached result. Also try y(1) = y(1) (which forces the optimization code to think the variable has changed) and then do y'*y again.
Jan
on 2 Feb 2012
It would be helpful if you posted a small piece of code that reproduces the problem. Then we could investigate what's going on, e.g. by checking the mxArray headers or calling the BLAS functions directly from a MEX function.
If you cannot reproduce the problem with manually created data, one of the following problems could be the source of the differing results:
- An external program modifies the floating point environment, e.g. the rounding direction, perhaps only temporarily.
- The floating point unit of your processor is about to fail. This happened to me once - after some hours the processor died completely...