I’m interested in parallelizing an LCG-based (linear congruential generator) random number generator.
I’m not interested in examining other types of RNGs at this time, such as the Mersenne Twister.
It’s readily apparent to me that the LCG is quite easily parallelizable by generating subsequences of the LCG sequence on multiple processors.
What is not apparent to me is how to implement such a parallelization so that the results are reproducible.
Consider the following hypothetical approach:
- Generate 100 billion random values
- CUDA determines the number of cores available to run subsequences and generates (100 billion / # of cores) values on each core.
NOTE: There would not necessarily be an equal number of values generated on each core because 100 billion/# cores may not be an integer.
If one applied the same random value generation approach on another computer with fewer cores, or if the number of available cores changed, then the number of values generated per core would be different.
I don’t think one would be guaranteed to have reproducible results under such a scenario.
Does anyone know of an approach to parallelizing LCG RNGs that gives reproducible results even when the number of available cores changes?
Barring that, is it possible to tell CUDA how many GPUs to use for a particular algorithm, e.g. to specify the number of subsequences to run for the LCG? While not optimal, I could at least implement an LCG RNG by fixing the number of GPUs one HAS to use.
p.s. I’m very new to the whole CUDA programming topic and mostly new to parallel processing, so please bear with me :)