Multiple Devices in CUDA - Crash

I am working on some code that is designed to run on multiple GPU devices; we have a computer with 4 cards on the way.
For now, I am testing the code on a machine with 3 Tesla cards.

My code uses POSIX threads to launch four (or currently, three) CPU threads, each of which launches its own GPU kernels.
(So each one calls cudaSetDevice(threadid), where threadid can take the values 0-3, or 0-2 currently.)
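
Schematically, the setup looks like this (a minimal sketch, not the actual code; worker, NUM_GPUS, and my_kernel are names used here purely for illustration):

    #include <pthread.h>
    #include <cuda_runtime.h>

    #define NUM_GPUS 3            /* will be 4 on the new machine */

    __global__ void my_kernel(float *data) { /* ... */ }

    static void *worker(void *arg)
    {
        int threadid = (int)(size_t)arg;

        /* Bind this CPU thread to its own GPU before any other CUDA call. */
        cudaSetDevice(threadid);

        /* ... cudaMalloc, alternating kernel launches, cudaMemcpy of results ... */
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_GPUS];
        for (int i = 0; i < NUM_GPUS; ++i)
            pthread_create(&threads[i], NULL, worker, (void *)(size_t)i);
        for (int i = 0; i < NUM_GPUS; ++i)
            pthread_join(threads[i], NULL);
        return 0;
    }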
It runs successfully for 3 or 4 iterations, and then halts with a CUDA error, usually “no CUDA-enabled device available,” but occasionally “invalid device symbol.”
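
The error reporting comes from checking every runtime call and kernel launch, along these lines (a sketch of the kind of check; the macro name is illustrative):

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    /* Report the file/line of the first failing CUDA call, then abort. */
    #define CUDA_CHECK(call)                                              \
        do {                                                              \
            cudaError_t err_ = (call);                                    \
            if (err_ != cudaSuccess) {                                    \
                fprintf(stderr, "%s:%d: CUDA error: %s\n",                \
                        __FILE__, __LINE__, cudaGetErrorString(err_));    \
                exit(EXIT_FAILURE);                                       \
            }                                                             \
        } while (0)

    /* Usage: CUDA_CHECK(cudaSetDevice(threadid));
       after a kernel launch: CUDA_CHECK(cudaGetLastError()); */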

I alternate between running two different kernels on each card.
I just don’t understand why it would run without a hitch several times and then stop with such a cryptic error, or what I should do to remedy it. I would greatly appreciate any help with this!

I had the same problem: an “invalid device symbol” error from the memcpy to constant memory (cudaMemcpyToSymbol) when dealing with multiple GPUs from multiple threads.

In the main thread, I used GPU 0 to do some computation, released GPU 0 (via cudaThreadExit()), and then launched n pthreads, where n is the number of GPUs available in the system. Each child thread runs on one GPU.

  1. The code always works with CUDA 2.0.

  2. With CUDA 2.2, the code sometimes works and sometimes fails with an “invalid device symbol” error.

  3. With the CUDA 2.3 beta, the behavior is the same as in 2.

Later I moved the GPU computation out of the main thread into child thread 0. Since then, the code has worked fine with CUDA 2.2.
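
In other words, the main thread now only spawns and joins the workers, and worker 0 additionally does the computation that used to run in the main thread. A minimal sketch of the restructured main, reusing the worker function from the sketch above (MAX_GPUS is an illustrative bound):

    #define MAX_GPUS 8  /* illustrative upper bound */

    int main(void)
    {
        int n = 0;
        cudaGetDeviceCount(&n);  /* device query only; no context is created here */

        /* The main thread makes no other CUDA calls; previously it computed
           on GPU 0 and called cudaThreadExit() before spawning the workers. */
        pthread_t threads[MAX_GPUS];
        for (int i = 0; i < n && i < MAX_GPUS; ++i)
            pthread_create(&threads[i], NULL, worker, (void *)(size_t)i);
        for (int i = 0; i < n && i < MAX_GPUS; ++i)
            pthread_join(threads[i], NULL);
        return 0;
    }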

Hope it helps

-gshi

Search the forums for ‘GpuWorker’; it’s a bit of code that another user (MisterAndersen42) wrote to efficiently handle multiple GPUs. Perhaps you can use it in your project and save yourself some time and trouble.