nvidia-smi: how to make compute mode permanent? Compute mode reverts to 0 after reboot

I set the compute mode to rule 1 (exclusive) using nvidia-smi, but after a reboot it reverts to rule 0. In this May 2009 thread

“Unable to set exclusive compute mode using nvidia-smi”

the solution given was that nvidia-smi needs to be running continuously in the background:

nvidia-smi --loop-continuously --interval=60 --filename=/var/log/nvidia-smi.log &

The thread stated that this problem would be fixed in a future driver release. Is there now a better way to make compute mode rules permanent? (I guess I could put “nvidia-smi -g 0 -c 1” in one of our startup files. Just wondering if nvidia-smi has a standard way to do it :unsure: )
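
For example, something like this at the end of /etc/rc.d/rc.local (assuming RHEL 5, a single GPU, and that nvidia-smi is on root’s path):

# force GPU 0 into exclusive compute mode at boot
nvidia-smi -g 0 -c 1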

I am using a GeForce GTX 480 card on x86_64 Red Hat Enterprise Linux Client release 5.4 (Tikanga)

Nvidia driver version 256.40

The Cuda toolkit I downloaded was cudatoolkit_3.1_linux_64_rhel5.4.run

The compute mode configuration isn’t persistent across reboots, so you will need to set your desired state from a startup file. The nvidia-smi daemon-mode trick solves a different problem: the NVIDIA Linux driver automatically unloads and releases its state after a period of inactivity whenever no client (be that the X server, a user-space application, or nvidia-smi itself) is connected to it. That unloading loses driver state, including the compute mode settings, even without a reboot, which is why keeping nvidia-smi attached helps.

On our cluster, all GPU nodes run a start-up script which creates the necessary device entries in /dev, starts nvidia-smi in daemon mode, and then calls nvidia-smi to set each GPU’s compute mode. I don’t believe there is yet a way to avoid it, but it is a trivial solution which we find “just works”.
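
Roughly, a minimal sketch of such a start-up script looks like this (the /dev node creation follows the standard script from the CUDA release notes; the log path, loop interval, and setting exclusive mode on every GPU are assumptions to adapt to your own nodes):

#!/bin/bash
# Load the NVIDIA kernel module and create the /dev/nvidia* entries
/sbin/modprobe nvidia || exit 1

# Count NVIDIA controllers and create one device file per GPU, plus the control node
N3D=$(/sbin/lspci | grep -i NVIDIA | grep -ci "3D controller")
NVGA=$(/sbin/lspci | grep -i NVIDIA | grep -ci "VGA compatible controller")
N=$((N3D + NVGA - 1))
for i in $(seq 0 $N); do
    mknod -m 666 /dev/nvidia$i c 195 $i
done
mknod -m 666 /dev/nvidiactl c 195 255

# Keep nvidia-smi attached so the driver never unloads and forgets its settings
# (same loop invocation as in the thread quoted above)
nvidia-smi --loop-continuously --interval=60 --filename=/var/log/nvidia-smi.log &

# Set the compute mode rule on every GPU found (1 = exclusive)
for i in $(seq 0 $N); do
    nvidia-smi -g $i -c 1
done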
