I’m currently working on a CUDA program which essentially executes the same loop over and over.
That is, every thread is doing something like this:
- generate random number*
- read from shared memory array
- write to shared memory array
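In rough CUDA-flavored pseudo-code, the structure looks something like this — kernel name, block size, and the iteration count are placeholders, not my actual code, and the “random” step is just a stand-in:

```cuda
#define BLOCK_SIZE 256  // placeholder block size

__global__ void step_kernel(float *out, int n_iters)
{
    // Per-block shared scratch array, one slot per thread.
    __shared__ float buf[BLOCK_SIZE];
    int tid = threadIdx.x;

    buf[tid] = 0.0f;          // initialize shared memory
    __syncthreads();

    for (int i = 0; i < n_iters; ++i) {
        // 1. "generate random number" -- here just derived from clock()
        unsigned int r = (unsigned int)clock();

        // 2. read from the shared memory array
        float v = buf[(tid + r) % BLOCK_SIZE];
        __syncthreads();      // all reads done before anyone writes

        // 3. write to the shared memory array
        buf[tid] = v + 1.0f;
        __syncthreads();      // all writes done before the next read
    }

    out[blockIdx.x * blockDim.x + tid] = buf[tid];
}
```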
I’ll just keep this in pseudo-code, since everything works quite well for most inputs.
However, if the program runs for too long, the memory contents seem to get completely corrupted.
Now I’m not really sure whether this has anything to do with some internal CUDA loop limit, but it’s currently my best guess. So if anyone can tell me whether such a limit really exists, I’d appreciate the help!
Basically I just need to know whether this could be the source of my problem, or if there is no such limit and I have to look elsewhere.
*the random part works with clock() calls… I’m not sure whether those can overflow, but that should be irrelevant… I think.
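On the overflow question: as far as I know, the device-side clock() returns a 32-bit counter, so it does wrap around after 2^32 cycles (a few seconds at ~1 GHz). Whether that matters depends on how the value is used downstream; clock64() is the 64-bit variant and won’t wrap in any realistic run:

```cuda
__device__ unsigned long long ticks()
{
    // clock() wraps after 2^32 cycles; clock64() returns a 64-bit
    // counter and avoids wraparound for any realistic kernel run.
    return (unsigned long long)clock64();
}
```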