Hi all,
Finally got pretty hardcore into CUDA over the last couple of weeks. I’ve got a working kernel linked into another larger program.
The problem I’m having is that the other program is already set up to be multi-threaded and has another Ada task (implemented as a lightweight thread) that uses OpenGL for visualization.
Due to the way the OpenGL code is implemented, it isn’t playing nice with CUDA, which is fine. I’ve got a 9600 GT attached to X running that part, and an unattached GTX 275 running CUDA. I know how to check which of these two is which CUDA device number and set the correct device for CUDA processing.
What I don’t know how to do is determine at runtime which card is or isn’t attached to a display, without a priori knowledge. I know the information is available somewhere; I just couldn’t find where after a few Google searches and a look through a couple of SDK examples. The NVIDIA X Server Settings program, for instance, sees the GTX 275 and doesn’t list it as attached to anything.
While I can work around this on my development machine, I can’t release code built around such assumptions. I can, however, release code that requires two video cards, as this is pretty specialized software. This seems like it might be a common problem, so any ideas would be appreciated.
Thanks,
Jon
Neither CUDA API directly exposes whether a GPU has an attached display. The driver API does let you query whether an active watchdog timer is present (CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT), which is probably a good enough proxy for what you want.
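As a sketch of the watchdog idea: the runtime API carries the same flag in cudaDeviceProp.kernelExecTimeoutEnabled, so you can loop over devices and prefer one without a run-time limit. This is a heuristic, not a guarantee (a display-free GPU under X can still report a timeout in some configurations), and the pick_compute_device name is just mine:

```c
/* Sketch: pick a CUDA device that has no kernel-execution watchdog.
   A GPU driving an X display normally enforces a run-time limit on
   kernels, so the timeout flag is a reasonable proxy for "attached
   to a display". Requires the CUDA toolkit; untested here. */
#include <stdio.h>
#include <cuda_runtime.h>

int pick_compute_device(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess)
        return -1;
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        /* Nonzero means the watchdog is active, i.e. the GPU is
           almost certainly the one driving the display. */
        if (!prop.kernelExecTimeoutEnabled)
            return dev;
    }
    return -1; /* no display-free GPU found */
}
```

You would then pass the result to cudaSetDevice() before any other CUDA call on that thread.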
A better, simpler, and more flexible way is to use nvidia-smi to designate devices as compute-permitted or compute-prohibited. On my development box I have this:
avid@cuda:~/NVIDIA_GPU_Computing_SDK/C$ nvidia-smi -g 0 -s
Compute-mode rules for GPU=0x0: 0x2
avid@cuda:~/NVIDIA_GPU_Computing_SDK/C$ nvidia-smi -g 1 -s
Compute-mode rules for GPU=0x1: 0x1
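For reference, those rule values are 0x0 = default (shared), 0x1 = compute-exclusive, and 0x2 = compute-prohibited. On the nvidia-smi shipped with drivers of this era the modes are set with -c (flag names have changed in later releases, so check nvidia-smi -h on your system). To reproduce the setup above, assuming GPU 0 drives the display and GPU 1 is the compute card:

```shell
# Mark the display GPU compute-prohibited, the compute GPU exclusive.
# Must be run as root; settings do not persist across reboots unless
# nvidia-smi is left running or re-applied at startup.
nvidia-smi -g 0 -c 2
nvidia-smi -g 1 -c 1
```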
(alternatively in deviceQuery):
avid@cuda:~/NVIDIA_GPU_Computing_SDK/C$ bin/linux/release/deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There are 2 devices supporting CUDA
Device 0: "GeForce GTX 275"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 938803200 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.46 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Prohibited (no host thread can use this device)
Device 1: "GeForce GTX 275"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 939261952 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.46 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Exclusive (only one host thread at a time can use this device)
Test PASSED
This makes the display GPU compute-prohibited, so no CUDA kernel can run on it, while marking the second as compute-exclusive, allowing only one host thread at a time to use it. If you go this route, apart from greatly simplifying the code, it gives end users a lot of flexibility over how different hardware setups can be accommodated by your app.
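The compute mode is also visible from inside a program via cudaDeviceProp.computeMode, which is what deviceQuery is printing above. A minimal sketch (requires the CUDA toolkit; untested here):

```c
/* Sketch: report each device's compute mode through the runtime API,
   mirroring the "Compute mode:" lines in the deviceQuery output. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        const char *mode =
            prop.computeMode == cudaComputeModeProhibited ? "prohibited" :
            prop.computeMode == cudaComputeModeExclusive  ? "exclusive"  :
                                                            "default";
        printf("Device %d (%s): compute mode %s\n", dev, prop.name, mode);
    }
    return 0;
}
```

In practice you may not even need this check: with the display GPU marked prohibited, any attempt to create a context on it simply fails, which is where the code simplification comes from.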
Thanks! That sounds like it will do exactly what I’m looking for.
-Jon