Random Number within Kernel


I’m trying to generate a random number within a @cuda.jit Kernel. I need each thread to generate ~ 5000 random numbers. Because there will be ~ 10**5 blocks of 1024 threads each, generating a single random number array in global memory is not feasible (524 billion total random numbers). So I can’t simply use the curand bindings on a device array.

Is there a way to generate a random number within a Kernel written using @cuda.jit?

As an example, I’m trying to do something like:

@cuda.jit
def loop(d):
    i = cuda.grid(1)
    out = 0.
    for _ in range(1024 * 5):  # don't reuse i here, it's the grid index
        t = np.random.uniform()  # generate a single random number -- this is the part that doesn't work in a kernel
        # do something with t
        out += t
    d[i] = out

Thank you!

You can use the cuRAND library.

Hi SPWorley,

How do I access that from within CUDA Python? There is accelerate.cuda.rand, but this is not callable from within a Kernel.


There’s already a Python wrapper for it, but I admit I haven’t used CUDA from Python myself. Hopefully those pointers give you a place to look, though.