What's a good random number generator?

molecube · April 11, 2008, 7:38pm

Since rand() cannot be used in CUDA, what is a good random number generator that provides uniform distribution over a certain range of numbers that I could use?

seibert · April 11, 2008, 7:50pm

There’s a Mersenne Twister example in the SDK. That’s a good quality generator, though I’m not sure if the example shows how to use the results in device code.

jordyvaneijk · April 14, 2008, 1:48pm

I think that if you do a search on the forum you can find a lot of them. there has been a discussion about it a while ago. Can’t find it that quick but there is something…

Check it out.

x248 · April 14, 2008, 2:15pm

The mersenne twister19937 is the best, but if you want something simple

and possible without double:

take the Wichmann/Hill (the period is 10^12).

There are 3 componants (integer) in the “diffusion”.

x(0)=100,y(0)=20,z(0)=2500

x(i)=171*x(i-1) mod 30269

y(i)=172*y(i-1) mod 30307

z(i)=170*z(i-1) mod 30323

and u(i)=((x(i)/30269) + (y(i)/30307) + (z(i)/30323)) mod 1

u(i) is an uniform in [0,1]

molecube · April 15, 2008, 11:09pm

Thanks, I suppose the above could be put into a kernel function that can be run and used within device code? And would I basically increment ‘i’ every time I want a random number?

DenisR · April 16, 2008, 5:26am

You would not have any i in your code I think, I would think it looks something like this:

unsigned int x=100,y=20,z=2500;

repeat this part for every uniform randnr you want:

x *=171;

x%=30269;

y*=172;

y%=30307;

z*=170;

z%= 30323;

float u=((((float)x)/30269.0)  + (((float)y)/30307.0)  + (((float)z)/30323.0)) % 1.0;

But you have to make sure each thread gets its own set of initial x,y,z values, otherwise all threads

molecube · April 16, 2008, 6:36am

You would not have any i in your code I think, I would think it looks something like this:
unsigned int x=100,y=20,z=2500;

repeat this part for every uniform randnr you want:

x *=171;

x%=30269;

y*=172;

y%=30307;

z*=170;

z%= 30323;

float u=((((float)x)/30269.0)  + (((float)y)/30307.0)  + (((float)z)/30323.0)) % 1.0;
But you have to make sure each thread gets its own set of initial x,y,z values, otherwise all threads

[snapback]362994[/snapback]

Thanks for the clarification. So each thread would have the same initial values (int x=100,y=20,z=2500)? Otherwise…what?

jlehtone · April 16, 2008, 12:27pm

I think DenisR said: each thread must have different initial values.

If each thread starts with the same (x,y,z), then they all use the same serie of numbers. Not very random.

If the (x,y,z) is in global memory and the update of (x,y,z) is atomic in each call to “device_rand()” from threads, then threads get different values as they should. But atomic update of three values forces serial behaviour.

With “local rand”, if the period is 10^12 and you have N threads, then logically – to remain unique – each thread would start with local (x,y,z) that is 10^12/N steps forward from the start of the previous thread.

zakim · September 8, 2008, 6:58am

Thanks for the good info.

I’d just like to add that Wichmann has a new 4-sequence algorithm that is better than the 3-sequence mentioned here and the paper mentions parallelism.

The paper’s title is Generating good pseudo-random numbers 2006.
[url=“http://portal.acm.org/citation.cfm?id=1219278”]http://portal.acm.org/citation.cfm?id=1219278[/url]

alex_dubinsky · September 8, 2008, 4:18pm

Can you please post a code snippet? I can’t access the paper.

Also, does anyone know a simple generator that only needs one variable? I’d like to put it into smem. I have half a mind to use randu(), but there must exist something as simple but a bit better. I honestly don’t need something good.

zakim · September 8, 2008, 10:00pm

I consider this to be a simple generator with good properties that uses far less state variables as compared to Mersenne Twister.

Here is the 32-bit version

     // Assuming the width of integers are 32-bits

      ix = 11600 * ( ix % 185127 ) - 10379 * ( ix / 185127 );

      iy = 47003 * ( iy % 45688 ) - 10479 * ( iy / 45688 );

      iz = 23000 * ( iz % 93368 ) - 19423 * ( iz / 93368 );

      it = 33000 * ( it %  65075 ) - 8123 *  ( it / 65075 );

      

      if ( ix < 0 ) ix += 2147483579;

      if ( iy < 0 ) iy += 2147483543;

      if ( iz < 0 ) iz += 2147483423;

      if ( it < 0 )  it  += 2147483123;

      

      float W = ( ((float) ix)/2147483579.0 + ((float) iy)/ 2147483543.0

                              + ((float) iz)/2147483423.0 + ((float) it)/2147483123.0 );

       

      return fmod(W, 1.0);

Sarnath · September 9, 2008, 8:23am

Monte-carlo simulation example makes use of the MT generator - i believe (not yet checked… but dats what I used for my MC)

You can always launch a kernel to generate the random numbers in global memory required for your simulation (could be quite huge)

and

then launch your kernel that uses the random numbers (you can index them sequentialy…)

Reimar · September 9, 2008, 8:37am

If you originally wanted to use rand() why do you insist on having a good one to replace it?

Or to say it differently: If rand() is good enough for you, any of the previous ones are sever overkill.

See http://en.wikipedia.org/wiki/Linear_congruential_generator for the trivial ones that you get when using rand().

“man rand” also lists the one suggested by POSIX.

Either way, don’t forget to check that the random number generator you use is actually suitable (and also that it is not overkill) for what you do.

alex_dubinsky · September 9, 2008, 10:23pm

Thanks. A linear congruence generator is exactly what I needed.

kbarros · September 11, 2008, 3:01pm

Linear congruential generators are already implemented in CUDA

See for example
[url=“http://www.amolf.nl/~vanmeel/mdgpu/download.html”]http://www.amolf.nl/~vanmeel/mdgpu/download.html[/url]

Note that the generated random sequence is exactly that of the rand48 algorithm provided by the C standard library, but the implementation has been parallelized in a clever way.

I’ve modified this implementation so that random numbers can be generated, one at a time, within a CUDA device function. The code is attached.

alex_dubinsky · September 11, 2008, 5:55pm

Thanks. The defintion of ‘uint2’ is missing. P.S. is it ok to call cudaFree() right after a kernel call? Will cudaFree block until the kernel’s finished? Just wondering (I know you’d normally have a memcpy anyway).

kbarros · September 11, 2008, 7:51pm

The intention is that the file “rand48.cu” is included into your file “MyCudaCode.cu”, so that the functions it defines are accessible. The nvcc compiler has uint2 as a built in type.

The CUDA documentation does not list cudaFree as asynchronous (section 4.5.1.5 in my copy), so I expect that it blocks.

Kip

alex_dubinsky · September 12, 2008, 2:21am

oh ok, thanks

zakim · September 12, 2008, 6:29am

The one I posted is an excellence combination of 4 Linear congruential generators. The great thing is that it is still accurate after having each thread having its own seed.

And the best thing is that I doesn’t need to write to global mem, just registers.

MT’s algorithm is complicated.

And yes, I was using it for a CUDAfied poisson generator that has a different mean parameter for every 3D point in the data

Simon_Green · November 23, 2008, 12:17am

This code is very useful, thanks for posting.

There’s a small typo at the end:

device inline int rand48_nextFloat(Rand48 &r);

should be:

device inline float rand48_nextFloat(Rand48 &r);

Topic		Replies	Views
random number generation generating random numbers in CUDA CUDA Programming and Performance	64	109742	January 25, 2011
random numbers inside the Kernel CUDA Programming and Performance	31	60889	November 26, 2011
simple rand() a simple rand function on gpu exist? CUDA Programming and Performance	9	5834	June 1, 2009
random number generators CUDA Programming and Performance	12	7244	August 7, 2008
More on CUDA Random Numbers. CUDA Programming and Performance	2	4680	July 2, 2009
Using MersenneTwister to generate random numbers Can random sequence be continued? CUDA Programming and Performance	1	2345	September 11, 2008
MersenneTwister application CUDA Programming and Performance	0	4049	May 28, 2008
CUDA random CUDA Programming and Performance	4	5329	June 27, 2010
Mersenne Twister SDK: what's going on? CUDA Programming and Performance	10	16922	January 19, 2011
random number generator function? CUDA Programming and Performance	2	15564	November 23, 2009

What's a good random number generator?

Related topics