Turn off a GPU via software on Linux

Is there a simple way to turn off a GPU when two are installed on a Linux computer, either via an environment variable set before launching the app, a command, or a control panel? Some of our plugin software is having issues with recent NVIDIA drivers when two or more GPUs are installed.

Pierre Jasmin

Hi Pierre,

It is possible to hide a GPU from X using xorg.conf options. See http://us.download.nvidia.com/XFree86/Linux-x86_64/390.87/README/xconfigoptions.html#ProbeAllGpus . If “ProbeAllGpus” is disabled, the X driver will only touch GPUs that are explicitly specified by BusID in xorg.conf. So, if the user were to run “nvidia-xconfig --no-probe-all-gpus -a” and then delete all but the desired X screen/GPU in xorg.conf, they would effectively disable the other GPU(s) from X’s perspective.

However, the GPU could still be accessed via NVML or other non-X interfaces, so this solution might not be sufficient for your purposes.
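When editing the BusID entries by hand, one detail is easy to get wrong: xorg.conf expects the bus, device, and function numbers in decimal, while lspci and nvidia-smi print them in hexadecimal. Below is a minimal sketch of that conversion. The helper name pci_to_xorg_busid is hypothetical, and it assumes PCI domain 0 (the domain field is simply dropped).

```python
def pci_to_xorg_busid(pci_addr: str) -> str:
    """Convert a hex PCI address to an xorg.conf BusID string.

    Accepts 'DDDD:BB:DD.F' or 'BB:DD.F' (hex, as printed by lspci or
    nvidia-smi) and returns 'PCI:B:D:F' with decimal fields.
    Assumption: the GPU sits in PCI domain 0, so the domain is ignored.
    """
    parts = pci_addr.strip().split(":")
    if len(parts) == 3:        # domain present, e.g. 00000000:65:00.0
        _, bus, dev_func = parts
    elif len(parts) == 2:      # short form, e.g. 65:00.0
        bus, dev_func = parts
    else:
        raise ValueError(f"unrecognised PCI address: {pci_addr!r}")
    dev, func = dev_func.split(".")
    # Each field is hexadecimal in the input and decimal in the output.
    return f"PCI:{int(bus, 16)}:{int(dev, 16)}:{int(func, 16)}"


if __name__ == "__main__":
    print(pci_to_xorg_busid("00000000:65:00.0"))
```

The resulting string is what would go after BusID in the Device section of the GPU you want X to keep.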

Could you provide some additional details about this plugin software issue? Do you have an nvidia-bug-report.log.gz that you can provide to us?

For the multiple GPUs, are you using multiple X screens, mosaic, SLI, or Xinerama?

Thanks,
Ryan Park

The problem is that we don’t have a dual Quadro P6000 setup to replicate this here.
The application is Autodesk Flame and the plugin is Twixtor.
InitCompute below is our code that determines at startup which GPU is faster. We only run on one GPU after that. So we deadlock while testing; we are not running multiple GPUs at once, just sequentially checking which one is faster. It is the same code we have run for many years, and it was tested with up to 8 GPUs under the Quadro driver a number of years ago. It recently stopped working on Quadro when the client has two GPUs in their machine. We don’t see the same issue with, say, two GTX 690s and a Titan in our system. We tried installing an old Quadro K5000 to run on the Quadro driver: if it is the first card we somehow get a black screen at startup, and if a GTX is the first card the Quadro does not show up.

We are just looking for a solution other than telling clients to remove a power cable so that our plugin does not crash their application when it is applied. We cannot tell them to modify their system in a manner that will affect how other applications work. Is the NVIDIA driver using futex_wait? We don’t. If so, Google says that many Linux versions (e.g. RHEL 6.6, 7.0, 7.1) are affected by a futex_wait bug in general.

(gdb) where
#0  0x00007f31e0086a0b in do_futex_wait.constprop.1 () at /lib64/libpthread.so.0
#1  0x00007f31e0086a9f in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
#2  0x00007f31e0086b3b in sem_wait@@GLIBC_2.2.5 () at /lib64/libpthread.so.0
#3  0x00007f31b0a22a37 in () at /lib64/libnvidia-opencl.so.1
#4  0x00007f31b0909ba9 in () at /lib64/libnvidia-opencl.so.1
#5  0x00007f31b08f4461 in () at /lib64/libnvidia-opencl.so.1
#6  0x00007f2e929fef22 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#7  0x00007f2e929a2827 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#8  0x00007f2e92a048f6 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#9  0x00007f2e92bfabd1 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#10 0x00007f2e92bfb9cf in () at /opt/Autodesk/sparks/links/flame/__sparks0
#11 0x00007f2e92d149a4 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#12 0x00007f2e92a08964 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#13 0x00007f2e92a08c57 in () at /opt/Autodesk/sparks/links/flame/__sparks0
#14 0x00007f2e9299c9e9 in RVSparks::initComputeDevice() () at /opt/Autodesk/sparks/links/flame/__sparks0
#15 0x00007f2e92999471 in SparkInitialise () at /opt/Autodesk/sparks/links/flame/__sparks0

You can try the CUDA_VISIBLE_DEVICES environment variable. Details on how it works can be found here: https://devblogs.nvidia.com/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/

Basically, you can specify a comma-separated list of which GPUs should be made visible to applications. The GPUs can be identified by index or by UUID. Running nvidia-smi with the “-L” option lists the GPUs along with their respective UUIDs.
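As a rough illustration of the steps above, here is a minimal Python sketch that reads the output of nvidia-smi -L, picks one GPU by its UUID, and launches a program with CUDA_VISIBLE_DEVICES set for that process only, so other applications are unaffected. The function names and the sample UUID are hypothetical. The sketch assumes nvidia-smi prints lines of the form "GPU 0: <name> (UUID: GPU-...)". Whether the variable also hides the device from the OpenCL path seen in the backtrace is not confirmed in this thread.

```python
import os
import re
import subprocess

# Assumed line format of `nvidia-smi -L`:
#   GPU 0: Quadro P6000 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)")


def parse_gpu_list(text):
    """Return a list of (index, name, uuid) tuples from `nvidia-smi -L` output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


def env_with_visible_gpus(uuids, base_env=None):
    """Copy the environment and restrict CUDA_VISIBLE_DEVICES to the given UUIDs."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(uuids)
    return env


def launch_on_single_gpu(command, index=0):
    """Start `command` (a list of arguments) with only GPU `index` visible."""
    listing = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    ).stdout
    for gpu_index, _name, uuid in parse_gpu_list(listing):
        if gpu_index == index:
            return subprocess.Popen(command, env=env_with_visible_gpus([uuid]))
    raise LookupError(f"no GPU with index {index} reported by nvidia-smi")
```

For example, launch_on_single_gpu(["/path/to/app"], index=0) would start the application (the path here is a placeholder) with only the first listed GPU exposed. The same effect can be had from a shell by setting the variable in front of the command.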

Thanks,
Ryan Park