Auction Algorithm is not converging due to round off error in dell 7820
4 views (last 30 days)
The Question is specifically related to Tracker and Sensor Fussion Toolbox
In trackerGNN, i am using auction algorithm for association. Due to a certain condition the code is stuck in lapAuctionFwdRev.
- I ran code in 3 dell 7820 servers and code gets stuck in lapAuctionFwdRev
- when i ran the code in dell 7910 server and dell m6800, it works fine. (i.e. Auction converges)
Upon investigation i found out that two cost values are approx equal with a difference of 1e-14 thus minRes is 1e-14. In dell 7820, when it tries to break the tie, a NaN stays in rowSoln and colSoln which is not letting the while loop break in lapAuctionFwdRev function thus stuck in infinite loop. when i ran the same condition in anyother dell model, it works fine. It most probably is the rounding off error occuring in findAllMin function (findAllMin function is used inside lapAuctionFwdRev function)
function [minValue,minIndicies] = findAllMin(values)
minValue = min(values);
minIndices = find(values==minValues)
the minIndices are different in dell 7820 as compared to anyother dell model due to round off error.
- How to solve this problem in dell 7820?
- Is this a known issue (auction infinite loop stuck )?
- How to set maxAuction limit from trackerGNN (maxAuction parameter is found inside the lapAuctionFwdRev function which can limit the while loop from runing infinitely)? It is set to inf by default. How to set it to a finite value from tracker Initialization
Kudos to tracker and Sensor fussion toolbox team for such a detailed documentation. Any help will be appreciated
Prashant Arora on 29 Jul 2021
The auction algorithm is indeed suspectible to "price wars" when epsilon is low and cost of assignments are roughly equal. This can result in the algorithm getting stuck for a long time and may be indefinitely stuck due to numerical precision.
You cannot set the maxAuction limit from the tracker. If you would like to continue using auction algorithm, I would suggest you to create a copy of the auction algorithm, make modifications (set maxAuction) and use your modified assignment algorithm with trackerGNN by setting 'Assignment' as 'Custom'.
You can also choose to use other assignment algorithms with trackerGNN such as "Matchpairs", "Munkres" and "Jonker-Volgenant".
More Answers (1)
Walter Roberson on 28 Jul 2021
min() never does rounding. min() of a single parameter uses bit-for-bit comparisons (since the entire array is always the same datatype.)
== between two values of the same datatype always uses bit-for-bit comparisons. The output of min() over a single numeric parameter is always going to be the same datatype as the parameter was, and then you are comparing that bit-for-bit identical copy of the minima that is stored in minValue against the values in the original array. Except in the case where the array is all nan, the == in that situation is guaranteed to match at least one value. (When the entire array is nan, the output of min() is not guaranteed to be bitwise identical to any of the input nan values.)
Therefore, the problem is not in findAllMin()
However, the challenge could be in the values that are passed to findAllMin(). Different processors have different instruction sets and different cache behaviors, so it is expected that the exact results of numeric computation might be different on different processors.
The 7820 appears to have "One or Two Intel Xeon Scalable Processor family CPUs (1st and 2nd generation)with up to 28 cores per processor". However, current online purchasing options for it include Intel Xeon Bronze 3204, Intel Xeon Silver 4208, Intel Xeon Silver 4214R, Intel Xeon Gold 5222
The 7910 has "Single or dual Intel® Xeon® processor E5-2600 v4" . Unfortunately the 7910 does not appear to be sold anymore so I cannot list specific CPUs.
The m6800 has "4th Generation Intel® Core™ i5 and i7 processor options up to Intel Core i7 Extreme Edition"
When I compare the Xenon processors, it looks to me as if there are differences in the number of cores, in the efficiency, and in the amount of cache available per core, but I do not notice any differences in available instruction sets (I might have overlooked something.) However, differences in the amount of cache can lead to differences in computed result, as the calculations can end up getting blocked differently.
At the moment I have no ideas as to why you might be getting nans; I am not familiar with the algorithms being used. But I do know that differences are to be expected. I would tend to only expect a difference to result in a nan if you were working with calculations that are numerically unstable.