Random numbers inside the kernel

Can you please tell me how you generate random seeds for each thread? Do you use the thread number as the seed?

This paper, recently posted to arXiv, discusses several different RNG strategies:
http://arxiv4.library.cornell.edu/abs/1003.1123

The number of threads you launch (128*512) is half the number of randoms you want (128*1024).

(answer is a bit late…)

Using a Park-Miller RNG this way is not a good idea. It can produce low-quality random numbers, and people should avoid this technique in serious simulations. Linear congruential generators in general are not good either.
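
To make the concern concrete, here is a minimal sketch, assuming "this way" means seeding each thread's generator with its thread index (as the original question suggests). The names `park_miller_next` and `first_draw_naive` are made up for illustration; the constants are the usual "minimal standard" ones.

```cuda
#include <cstdint>

// Park-Miller "minimal standard" step: x_next = 16807 * x mod (2^31 - 1).
// The 64-bit intermediate avoids overflow in the multiplication.
__host__ __device__ inline uint32_t park_miller_next(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 16807ULL) % 2147483647ULL);
}

// Hypothetical kernel: every thread seeds with (thread index + 1) and stores
// its first draw. Shown to illustrate the problem, not as a recommendation.
__global__ void first_draw_naive(uint32_t *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n)
        out[tid] = park_miller_next((uint32_t)tid + 1u);
}
```

With seeds 1, 2, 3 the first draws are 16807, 33614, 50421, so neighbouring threads differ by a constant 16807. The per-thread streams start out strongly correlated, which is the kind of low quality to worry about.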

It is a good paper, though their best approach, one-RNG-for-all-threads, is not very fast.

We should be able to do this. For some reason, it does not seem to be clearly addressed in either the CURAND documentation or the Mersenne Twister documentation. I haven’t found the answer yet, but one would think it would be possible to set up one random number generator state vector for each of your CUDA cores. Then, if “rand()” accesses its local random number generator, we might be able to avoid races and get sufficiently independent random draws. Let us know if anybody has done this yet.
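
For what it is worth, the cuRAND device API (`curand_kernel.h`) seems to allow something close to this. The sketch below is only an illustration under assumptions: it keeps one state per thread rather than per CUDA core, it uses the default `curandState` generator rather than Mersenne Twister, and the kernel names `setup_states` and `draw_uniform` are hypothetical.

```cuda
#include <curand_kernel.h>

// One-time setup: each thread initialises its own generator state.
// Same seed for everyone, with the thread index as the subsequence number.
__global__ void setup_states(curandState *states, unsigned long long seed, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n)
        curand_init(seed, tid, 0, &states[tid]);
}

// Each thread draws from its own state only, so threads never share
// generator state and there is nothing to race on.
__global__ void draw_uniform(curandState *states, float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    curandState local = states[tid];    // work on a local copy
    out[tid] = curand_uniform(&local);  // uniform float in (0, 1]
    states[tid] = local;                // save the state for the next launch
}
```

The host side would allocate `n` states with `cudaMalloc`, launch `setup_states` once, and then launch `draw_uniform` as often as needed.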

It is NOT reasonable to generate a state vector for every thread if you have MILLIONS of threads, which is not uncommon.
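
One possible way around that, sketched here under the assumption that the work can be written as a loop over output elements: launch a fixed, modest grid, keep one state per launched thread, and let each thread produce several values in a grid-stride loop. The kernel names `setup_pool` and `fill_uniform` are hypothetical, and cuRAND's default `curandState` stands in for whatever generator you actually use.

```cuda
#include <curand_kernel.h>

// Hypothetical setup: one state per thread of a fixed-size grid.
// The pool must hold gridDim.x * blockDim.x states.
__global__ void setup_pool(curandState *pool, unsigned long long seed)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curand_init(seed, tid, 0, &pool[tid]);
}

// Launch with the same grid as setup_pool. The number of outputs n is
// independent of the number of states: each thread strides through the array.
__global__ void fill_uniform(curandState *pool, float *out, size_t n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    curandState local = pool[tid];          // work on a local copy
    for (size_t i = tid; i < n; i += stride)
        out[i] = curand_uniform(&local);    // uniform float in (0, 1]
    pool[tid] = local;                      // save the state for later launches
}
```

With 128 blocks of 512 threads and `n = 128*1024`, for example, each thread would write two values, so the state count is set by the launch configuration and not by the problem size.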