How to efficiently generate a random integer within a range from an arbitrary probability distribution
1 visualización (últimos 30 días)
Mostrar comentarios más antiguos
Paolo Binetti
el 7 de En. de 2017
Comentada: the cyclist
el 8 de En. de 2017
I need to generate a random integer within a range from an arbitrary probability distribution, within a loop of 100000 iterations. My implementation works, but I am not sure it is mathematically clean, and it takes forever:
pdf = [ 0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08 ]; % discrete prob distrib function
cdf = cumsum(pdf); % cumulative distribution function
cdf = cdf / max(cdf); cdf(1) = 0; % normalization
index = ceil(interp1(cdf, [1:numel(pdf)], rand(1)))
Notice that the pdf above is just an example: my actual case is a vector of about 500 numbers.
Here is a different solution, which seems mathematically cleaner, but does not work for my overall problem, and is just as slow as above:
pdf = [ 0.9 0.3 0.003 0.1 0.07 0.0005 0.003 0.15 0.009 0.08 ]; % discrete prob distrib function
cdf = cumsum(pdf); % cumulative distribution function
cdf = cdf - min(cdf); cdf = cdf / max(cdf); % normalization
index = round(interp1(cdf, [1:numel(pdf)], rand(1)))
Is there a more efficient/correct way to do this?
0 comentarios
Respuesta aceptada
the cyclist
el 8 de En. de 2017
Editada: the cyclist
el 8 de En. de 2017
I think that
index = sum(rand()>cdf)+1;
will be much faster than using interp1 as you do, and will give the same result.
4 comentarios
the cyclist
el 8 de En. de 2017
I double-checked by generating 100,000 samples, and the results seem right.
the cyclist
el 8 de En. de 2017
Regarding speed ...
I get roughly equivalent results between my solution and randsample. Note that both of these solutions can be vectorized:
index = sum(rand(N,1)>cdf,2)';
and
index = randsample(1:numel(pdf),N,true,pdf);
These will both generate N = 100,000 samples in a few milliseconds.
I did not try to vectorize your solution, but as it currently stands, it took 5 seconds to generate that many.
Más respuestas (1)
the cyclist
el 8 de En. de 2017
Do you have the Statistics and Machine Learning Toolbox? If so, you can do this with the randsample command.
Ver también
Categorías
Más información sobre Creating and Concatenating Matrices en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!