'Default' random seed when parallel pool is created?
Hi, I came across an interesting behavior concerning parallel computing and random number generation.
When I run the following code several times
poolobj = gcp('nocreate');delete(poolobj)
parpool; parfor l=1:1, rand(1,10), end
(i.e. delete the existing parallel pool, start a new one, and generate 10 random numbers in a trivial parfor loop), I get exactly the same random numbers most of the time. This happens without setting a random seed. I don't understand why I get the same numbers in most cases (and if there is a reason for this behavior, why isn't it the case in EVERY run?). Note: if I replace 'parfor' with 'for' in the above code, I get different random numbers in different runs (as expected). Does anyone know whether some kind of random seed is set automatically by this procedure of shutting down a parallel pool and starting a new one? (One that only applies 'in most cases', and not for standard 'for' loops?) I am very confused.
Edric Ellis on 30 Mar 2021
Workers in a parallel pool have deterministic random number generation states set up each time they start up. This is described in detail here in the documentation. The reason that you don't see the same results in every run is that the parfor implementation itself does not guarantee to send the same iterations of the loop to the same workers. You would get reproducible results by using spmd, where you know for sure which worker runs each copy of the code:
spmd, x = rand(1,10), end
The code above will create the same x on any given worker each time you start a fresh parallel pool.
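One way to check this for yourself is to restart the pool twice and compare what the same worker produces. This is only a sketch: the pool size of 2 and the variable names (x1, x2, first, second) are my own choices for illustration, and it assumes at least two workers are available in your default profile.

```matlab
% Start from a clean slate, then open a fresh pool with 2 workers (assumed size).
delete(gcp('nocreate'));
parpool(2);
spmd
    x1 = rand(1,10);   % each worker draws from its own freshly initialized stream
end
first = x1{1};         % copy worker 1's result to the client before the pool goes away

% Restart the pool and repeat the same draw.
delete(gcp('nocreate'));
parpool(2);
spmd
    x2 = rand(1,10);
end
second = x2{1};

% Compare worker 1's numbers across the two fresh pools.
sameAcrossPools = isequal(first, second)
```

If the workers really are initialized deterministically as described above, sameAcrossPools should come out true. The values are copied out of the Composite before the pool is deleted because a Composite is no longer usable once its pool is gone.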
If you wish to get complete reproducibility of random number generation in a parfor loop, you can use the techniques described here, which basically ensure that the random number generation state is based on the loop iteration.
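As a rough illustration of tying the generator state to the loop iteration, here is a minimal sketch that builds a stream inside each iteration and selects a substream by the loop index. The generator type ('mrg32k3a', which supports substreams), the seed of 0, the loop length, and the variable names are all assumptions picked for the example, not necessarily what the linked documentation uses.

```matlab
n = 4;                 % number of iterations (arbitrary for this example)
r = zeros(n, 10);      % one row of random numbers per iteration

parfor i = 1:n
    % Same generator type and seed in every iteration (assumed values) ...
    s = RandStream('mrg32k3a', 'Seed', 0);
    % ... but a different substream per iteration, chosen by the loop index.
    s.Substream = i;
    % Draw from this explicit stream instead of the worker's global stream.
    r(i, :) = rand(s, 1, 10);
end
```

Because each row depends only on the seed and on i, it should not matter which worker ends up running which iteration, and the same code gives the same r in a plain for loop as well. Creating a stream in every iteration has some overhead, so for heavy loops you may prefer to create it once per worker and only change the Substream property inside the loop.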