What do people use as a random number generator on the GPU? We want each thread to call its own random number generator. If we use Mersenne Twister, each thread needs to store and access a state of 624 integers, which makes it slow. If we use something that needs almost no memory (like Park-Miller, which keeps a single integer of state), then we have a different problem, because it uses the modulo (%) operator, which is slow. Any suggestions?
By the way, we can't use the CUDA implementation of the Mersenne Twister, because we want one generator per thread.
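
For reference, here is a minimal sketch of the Park-Miller step I mean, written as plain C so it can be checked on the host (the function names `pm_next_mod` and `pm_next_nomod` are just made up for this post). The first version uses the % operator directly. The second is the usual reduction trick for the modulus 2^31 - 1, which replaces the % with a shift, a mask, and one conditional subtraction. I am assuming the multiplier 16807 and a state that stays in the range 1 to 2^31 - 2; I have not measured whether the second form is actually faster on the GPU.

```c
#include <stdint.h>

/* Park-Miller "minimal standard" step: x_next = 16807 * x mod (2^31 - 1).
   Assumption: the state x is always in [1, 2^31 - 2]. */
uint32_t pm_next_mod(uint32_t x)
{
    /* Straightforward form: 64-bit product followed by a modulo. */
    return (uint32_t)(((uint64_t)x * 16807u) % 2147483647u);
}

/* Same recurrence without the % operator.
   Because 2^31 is congruent to 1 modulo (2^31 - 1), the product
   p = hi * 2^31 + lo is congruent to hi + lo. */
uint32_t pm_next_nomod(uint32_t x)
{
    uint64_t p  = (uint64_t)x * 16807u;      /* fits in 46 bits        */
    uint32_t lo = (uint32_t)(p & 0x7FFFFFFFu); /* low 31 bits          */
    uint32_t hi = (uint32_t)(p >> 31);         /* remaining high bits  */
    uint32_t r  = lo + hi;                     /* below 2^32, no wrap  */
    if (r >= 0x7FFFFFFFu) {
        r -= 0x7FFFFFFFu;                      /* one subtraction is enough */
    }
    return r;
}
```

In a kernel, each thread would keep its own `x` in a register and seed it differently (for example from the thread index, which is my assumption, not something we have tested). Note that all threads would still walk the same cycle from different starting points, so their sequences can overlap.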