all CUDA-capable devices are busy or unavailable problem in a multi-process Linux application

Can you post a full repro?

I’d appreciate it if someone at Nvidia could try to reproduce this problem using the attached test program, ForkTest.cu. It is 460 lines, but the only relevant functions are main() and getProperties(); the rest is support code. Here is the problem description:

Synopsis of the Problem: After forking to create a child process, cudaMalloc() in the child returns the error “all CUDA-capable devices are busy or unavailable”.

Detailed Description: The test program ForkTest.cu forks and attempts to access the GPUs in a child process. The parent calls cudaGetDeviceCount(), cudaGetDeviceProperties() and uses “new” to create several cudaDeviceProp structures before the fork. In the test program, adding these calls to the parent causes the “busy or unavailable” error in the child. If the calls are removed, the kernel in the child runs correctly. The offending device management functions are in the function getProperties(), which is called at the top of main(). Just comment or uncomment getProperties() to disable/enable the error.


Operating System: x86_64 Red Hat Enterprise Linux Client release 5.4 (Tikanga)

CUDA toolkit release version: cudatoolkit_3.1_linux_64_rhel5.4.run

SDK version: gpucomputingsdk_3.1_linux.run

CPU code compiler: g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

CPU Type: dual Intel Xeon Quad-Core E5620 2.4GHz 5.86GT/s 1366pin 12MB CPUs

Installed System RAM: 48 GB ((12x) 4 GB DDR3 1333MHz, Registered ECC, dual rank)

System type: Custom-built system, ThinkTank IQ-4XGPU, Desktop Super Computer; chassis is Supermicro SuperServer 7046GT-TRF, Optimized for 4x GPU cards

Video cards installed in the system: 4 GeForce GTX 480

Motherboard: Tyan S7025AGM2NR, dual Xeon processor motherboard

Chipset: IOH / ICH: Intel (2) 5520 / ICH10R; Super I/O: Winbond W83627DHG

ForkTest.cu (14.5 KB)

I did find a workaround for the problem. I moved the device property calls (the getProperties() function) from the parent into the children, and cudaMalloc() in the children then worked without producing that error message.
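For anyone who wants to try the two orderings without the full 460-line attachment, here is a minimal sketch of the pattern. It is not ForkTest.cu: the helper names queryDevices() and childWork() and the QUERY_IN_PARENT macro are made up for illustration, and I am assuming that any CUDA runtime call made before fork() is what leaves the child unable to use the GPUs. By default the parent makes no CUDA calls and the child does all of the device queries itself (the workaround); building with -DQUERY_IN_PARENT restores the ordering that produced the error here.

```cuda
// Sketch only, not the attached ForkTest.cu. queryDevices(), childWork()
// and QUERY_IN_PARENT are hypothetical names used for illustration.
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cuda_runtime.h>

// Query the device count and properties. Returns the number of devices,
// or -1 on error. Calling this touches the CUDA runtime in whichever
// process runs it.
static int queryDevices()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "[%d] cudaGetDeviceCount: %s\n",
                (int)getpid(), cudaGetErrorString(err));
        return -1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            fprintf(stderr, "[%d] cudaGetDeviceProperties(%d): %s\n",
                    (int)getpid(), i, cudaGetErrorString(err));
            return -1;
        }
        fprintf(stderr, "[%d] device %d: %s\n", (int)getpid(), i, prop.name);
    }
    return count;
}

// Work done in the child: query the devices, then try a small allocation.
// Returns 0 on success, 1 on any failure.
static int childWork()
{
    if (queryDevices() <= 0)
        return 1;

    void *buf = NULL;
    cudaError_t err = cudaMalloc(&buf, 1 << 20);   // 1 MiB test allocation
    if (err != cudaSuccess) {
        fprintf(stderr, "[%d] cudaMalloc: %s\n",
                (int)getpid(), cudaGetErrorString(err));
        return 1;
    }
    cudaFree(buf);
    fprintf(stderr, "[%d] cudaMalloc succeeded\n", (int)getpid());
    return 0;
}

int main()
{
#ifdef QUERY_IN_PARENT
    // Ordering that produced the error in this thread: the parent
    // makes CUDA calls before forking.
    queryDevices();
#endif

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        // Child: with the default build, this is the first process
        // to make any CUDA call.
        _exit(childWork());
    }

    // Parent: wait for the child and report its result as the exit code.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return 1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Building it once as `nvcc -o fork_sketch fork_sketch.cu` and once with `-DQUERY_IN_PARENT` added should show whether the ordering alone makes the difference on a given driver and toolkit. In the failing build the error may be reported by the device query in the child rather than by cudaMalloc(), depending on the CUDA version.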