The following code causes an access violation when it reaches the curandGenerateUniform call, and I have no idea why, since it is almost a textbook example:
My suggestion would be to use proper error checking throughout your code and run your code with a sanitizer such as cuda-memcheck or compute-sanitizer.
As a simple example, if your count value was too large, or you were otherwise out of GPU memory (perhaps because you are calling this in a loop and not properly freeing allocations), your cudaMalloc operation would fail.
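For illustration, here is a minimal sketch of what that error checking could look like. The macro names `CUDA_CHECK` and `CURAND_CHECK`, the `count` value, and the seed are my own assumptions and are not taken from the original code; the sketch only shows the general pattern of testing every return status.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <curand.h>

// Hypothetical helper: abort with file/line if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical helper: abort with file/line if a cuRAND call fails.
#define CURAND_CHECK(call)                                                \
    do {                                                                  \
        curandStatus_t st_ = (call);                                      \
        if (st_ != CURAND_STATUS_SUCCESS) {                               \
            std::fprintf(stderr, "cuRAND error: %d at %s:%d\n",           \
                         static_cast<int>(st_), __FILE__, __LINE__);      \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

int main() {
    const size_t count = 1 << 20;  // assumed size, not from the original post
    float *d_data = nullptr;

    // A failed allocation is reported here instead of surfacing later.
    CUDA_CHECK(cudaMalloc(&d_data, count * sizeof(float)));

    curandGenerator_t gen;
    CURAND_CHECK(curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT));
    CURAND_CHECK(curandSetPseudoRandomGeneratorSeed(gen, 1234ULL));  // arbitrary seed
    CURAND_CHECK(curandGenerateUniform(gen, d_data, count));

    // Generation is asynchronous; synchronize so that any pending error surfaces.
    CUDA_CHECK(cudaDeviceSynchronize());

    CURAND_CHECK(curandDestroyGenerator(gen));
    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

Something like this can be built with `nvcc example.cu -lcurand` and then run under `compute-sanitizer ./a.out` to look for invalid memory accesses.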
I’m not certain that is the issue, but error checking may be a useful step. Your code seems to run fine for me when I supply suitable values for the missing pieces:
If you’re still having trouble after that, please provide a complete test case: the shortest complete code needed to reproduce the issue, the operating system you are using, the CUDA version you are using, the GPU you are running on, and the compile command you use.
Thanks Robert. Yes, I am wrapping those calls in error-checking macros of our own; I just wanted to reduce our code to a bare-bones sample. I am not getting any errors from the calls leading up to curandGenerateUniform.
Any thoughts where the access violation could come from?
Edit: Windows 10, same result on Quadro FX5200 and RTX 3070, CUDA Toolkit 10.4 and 11.5.
It seems that the issue was with the supplied CUDA stream. For some reason I thought that curandSetStream would return an error if the stream was invalid.
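For anyone who lands here with the same symptom, below is a rough sketch of the stream lifecycle that worked for me. It assumes the invalid stream was a handle that had never been created (or had already been destroyed); the `check` helper and the `count` value are hypothetical and only there to keep the sample short. Since curandSetStream did not report the bad handle in my case, the point is simply to make sure the stream comes from a successful cudaStreamCreate and outlives the generator calls that use it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <curand.h>

// Hypothetical helper: exit with a message when a condition does not hold.
static void check(bool ok, const char *what) {
    if (!ok) {
        std::fprintf(stderr, "%s failed\n", what);
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    const size_t count = 1 << 20;  // assumed size

    // Create the stream explicitly before handing it to cuRAND.
    cudaStream_t stream;
    check(cudaStreamCreate(&stream) == cudaSuccess, "cudaStreamCreate");

    float *d_data = nullptr;
    check(cudaMalloc(&d_data, count * sizeof(float)) == cudaSuccess, "cudaMalloc");

    curandGenerator_t gen;
    check(curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT) == CURAND_STATUS_SUCCESS,
          "curandCreateGenerator");

    // Attach the valid stream; generation is then enqueued on it.
    check(curandSetStream(gen, stream) == CURAND_STATUS_SUCCESS, "curandSetStream");
    check(curandGenerateUniform(gen, d_data, count) == CURAND_STATUS_SUCCESS,
          "curandGenerateUniform");

    // Wait for the work on this stream to finish before tearing anything down.
    check(cudaStreamSynchronize(stream) == cudaSuccess, "cudaStreamSynchronize");

    // Destroy the generator first, then the stream it was using.
    check(curandDestroyGenerator(gen) == CURAND_STATUS_SUCCESS, "curandDestroyGenerator");
    check(cudaFree(d_data) == cudaSuccess, "cudaFree");
    check(cudaStreamDestroy(stream) == cudaSuccess, "cudaStreamDestroy");
    return 0;
}
```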