I am working on some code that is designed to run on multiple GPU devices; we have a computer with 4 cards on the way.
I am testing the code on a machine with 3 Tesla cards.
My code uses POSIX threads to launch four (or currently, three) CPU threads, each of which launches its own GPU kernels.
(Each thread calls cudaSetDevice(threadid), where threadid takes the values 0-3, or 0-2 currently.)
It runs successfully for 3 or 4 iterations, and then it halts, printing out that there is a CUDA error, usually “no CUDA-enabled device available,” but occasionally “invalid device symbol.”
I alternate between running two different kernels on each card.
I just don’t understand why it would run without a hitch several times and then stop with such a cryptic error, or what I should do to remedy it. I would greatly appreciate any help on this!
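
For reference, here is a minimal sketch of the structure described above. It is not the actual code: the kernels kernelA and kernelB, the buffer size, the iteration count, and the CHECK macro are placeholders made up for illustration, and it assumes a CUDA version that provides cudaDeviceSynchronize. Each thread checks the return code of every CUDA call, so the first failing call is reported together with the thread that made it.

```cuda
#include <cstdio>
#include <pthread.h>
#include <cuda_runtime.h>

#define MAX_GPUS 4  // assumption: at most four cards, as in the post

// Hypothetical placeholder kernels; the real kernels are not shown in the post.
__global__ void kernelA(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

__global__ void kernelB(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Report the first failing CUDA call and leave the worker thread.
// Relies on a local variable named tid being in scope.
// For brevity, device memory is not freed on the error path.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "thread %d: %s failed: %s\n",               \
                    tid, #call, cudaGetErrorString(err_));              \
            return NULL;                                                \
        }                                                               \
    } while (0)

static void *worker(void *arg) {
    int tid = *(int *)arg;      // thread ID doubles as the device index
    const int n = 1 << 20;      // placeholder problem size
    const int blocks = (n + 255) / 256;
    float *d = NULL;

    CHECK(cudaSetDevice(tid));  // bind this CPU thread to one GPU
    CHECK(cudaMalloc(&d, n * sizeof(float)));
    CHECK(cudaMemset(d, 0, n * sizeof(float)));

    for (int iter = 0; iter < 10; ++iter) {
        // Alternate between the two kernels on this card.
        if (iter % 2 == 0) {
            kernelA<<<blocks, 256>>>(d, n);
        } else {
            kernelB<<<blocks, 256>>>(d, n);
        }
        CHECK(cudaGetLastError());       // catches launch errors
        CHECK(cudaDeviceSynchronize());  // catches errors during execution
    }

    CHECK(cudaFree(d));
    return NULL;
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        fprintf(stderr, "no usable device: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (count > MAX_GPUS) count = MAX_GPUS;

    pthread_t threads[MAX_GPUS];
    int ids[MAX_GPUS];
    int started = 0;

    // One CPU thread per GPU.
    for (int i = 0; i < count; ++i) {
        ids[i] = i;
        if (pthread_create(&threads[i], NULL, worker, &ids[i]) != 0) {
            fprintf(stderr, "could not create thread %d\n", i);
            break;
        }
        ++started;
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(threads[i], NULL);
    }
    return 0;
}
```

Assuming the file is saved as sketch.cu (a made-up name), something like `nvcc sketch.cu -o sketch -lpthread` should build it. Taking the thread count from cudaGetDeviceCount rather than hard-coding it means the same binary should run on both the 3-card and the 4-card machine.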