Similar to this thread
nvidia-bug-report.log (4.1 MB)
Hello,
After upgrading cuda-driver to version 12.1, non-root users who use Slurm’s srun interactive job feature can no longer run nvidia-smi. They get the following:
me@server:~$ srun -A mine -n 15 "nvidia-smi"
No devices were found
No devices were found
No devices were found
No devices were found
No devices were found
No devices were found
No devices were found
No devices were found
me@server:~$ nvidia-smi
No devices were found
Is there an environment variable that is not being set somewhere? We are on Bright Cluster 9.2.