2 CUDA devices - multiple user setup

I’m trying to set up a computer which has an 8800 GT intended to be used for the display, and a Tesla card intended to be used remotely for running CUDA codes. Currently /dev/nvidia0, /dev/nvidia1, and /dev/nvidiactl are created when X loads, with permissions granted to the user logged into X. Other users who try to access the machine remotely are therefore unable to run CUDA code.

I’ve seen the script for setting things up when the computer is not booted into X, which manually creates /dev/nvidia0, /dev/nvidia1, and /dev/nvidiactl. But I don’t think that’s quite what is needed here: only /dev/nvidia1 and /dev/nvidiactl should be created in this fashion. What I think would make sense is for /dev/nvidia0 (the 8800) to keep the permissions tied to the user logged into X, while /dev/nvidia1 (the Tesla) is open to remote users. How can this be done?
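One way I could imagine doing it (a sketch only, not something I’ve tested): adapt the release-notes device-node script so that it creates only the control node and the Tesla node at boot, before X starts, and leaves /dev/nvidia0 for X to create with its usual console-user-only permissions. The major/minor numbers are the NVIDIA driver’s fixed values (character major 195, nvidiactl minor 255); the DRY_RUN fallback is my own addition so the script can be inspected without root.

```shell
#!/bin/sh
# Sketch adapted from the device-node script in the CUDA release notes:
# create only /dev/nvidiactl and /dev/nvidia1 (the Tesla) world-accessible,
# and leave /dev/nvidia0 (the 8800 GT) for X to set up for the console user.
NV_MAJOR=195   # fixed character-device major for the NVIDIA driver

# mknod needs root; when run unprivileged, fall back to printing the
# commands instead of executing them.
[ "$(id -u)" -eq 0 ] || DRY_RUN=1

make_node() {
    # Usage: make_node <path> <mode> <minor>
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mknod -m $2 $1 c $NV_MAJOR $3"
        return 0
    fi
    [ -e "$1" ] && return 0   # node already exists (e.g. created by X)
    mknod -m "$2" "$1" c "$NV_MAJOR" "$3" \
        || echo "warning: could not create $1" >&2
}

make_node /dev/nvidiactl 0666 255   # control node: every CUDA process needs it
make_node /dev/nvidia1   0666 1     # Tesla: open to all (remote) users
```

This would be run as root from an init script (before X starts), so the nodes already exist with open permissions by the time remote users log in.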

When a remote user tries to run a CUDA program, it fails to open the device files and falls back to emulation:

NVIDIA: could not open the device file /dev/nvidiactl (Permission denied).
cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.0/cufft/src/config.cu, line 106
cufft: ERROR: CUFFT_INTERNAL_ERROR

Using Device:            0
Device Number:   0
Name:                    Device Emulation (CPU)
Multi-processor Count:   16
Global Memory:           -1
Shared Memory per Block: 16384
Register per Block:      8192
Warp Size:               1
Memory Pitch:            262144
Max Threads Per Block:   512
Max Threads Dimension:   512 512 64
Max Grid Size:           65535 65535 1
Constant Memory:         65536
Version:                 9999.9999
Clock Rate:              1350000
Texture Alignment:       256
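For what it’s worth, the “Device Emulation (CPU)” entry with version 9999.9999 is the runtime’s emulation fallback, reported when no real GPU could be opened. A sketch of how a program could detect that situation itself (assuming the CUDA 2.x runtime API; error handling omitted for brevity):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // The emulation fallback reports compute capability 9999.9999.
        if (prop.major == 9999 && prop.minor == 9999)
            printf("device %d is the emulation fallback (no usable GPU)\n", i);
        else
            printf("device %d: %s (%d.%d)\n", i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```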

I agree. But it doesn’t seem to be possible in CUDA currently. Even when running a CUDA app that calls cudaSetDevice(1), I’ve noticed that it still tries to open /dev/nvidia0.

Yes, probably to query the device attributes. On the G200 I have seen deviceQuery give different results for the clock rate, so it probably reads those values from the card on demand.

How can you tell for sure that it’s still using /dev/nvidia0?

Well, if you change the permissions, it will yell at you very loudly about being unable to open /dev/nvidia0.

Thanks for all the help so far.

I manually set the permissions for /dev/nvidia1 and /dev/nvidiactl to be accessible to all users, while leaving /dev/nvidia0 available only to the user who started X. I then ran my code with cudaSetDevice(1) and observed the same issue that MisterAnderson42 pointed out: CUDA totally ignores the call to cudaSetDevice(1), a call to cudaGetDevice immediately afterwards reports that the selected device is still device 0, and the program then proceeds to access /dev/nvidia0. Only once I set /dev/nvidia0 to be accessible to all users does cudaSetDevice(1) actually work.
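For reference, the check described above can be reproduced with a minimal program along these lines (a sketch assuming the CUDA 2.x runtime API; the cudaMalloc is there only to force the driver to actually open the device node):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Ask for the Tesla (device 1).
    cudaError_t err = cudaSetDevice(1);
    if (err != cudaSuccess)
        printf("cudaSetDevice(1): %s\n", cudaGetErrorString(err));

    // Which device did the runtime actually select?
    int dev = -1;
    cudaGetDevice(&dev);
    printf("current device: %d\n", dev);

    // Force a context to be created so the driver opens the
    // corresponding /dev/nvidia* node.
    void *p = 0;
    err = cudaMalloc(&p, 16);
    if (err != cudaSuccess)
        printf("cudaMalloc: %s\n", cudaGetErrorString(err));
    else
        cudaFree(p);
    return 0;
}
```

With /dev/nvidia0 unreadable, this prints “current device: 0” despite the cudaSetDevice(1) call, which matches the behavior described above.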

A few more questions:

Does NVIDIA intend for this to be fixed at some point, or is this the intended behavior?

If /dev/nvidia0 is publicly accessible and remote users are running CUDA codes using it, how does this affect the user who is using this device with X?

Even though /dev/nvidia1 is not being used by X, it is apparently opened by X. Is it therefore subject to the 5-second rule?