Singularity image of Relion

I’ve been trying to get a Singularity image of the Relion Docker container from NVIDIA NGC working on our HPC cluster with P100 and V100 nodes.
I was able to create the image (it’s very easy) and even run it on my workstation with 2x 1080 Ti without any problems.
But when I run it on the cluster I get

ERROR: all CUDA-capable devices are busy or unavailable in /opt/relion-sm70/src/gpu_utils/cuda_projector.cu at line 115 (error-code 46)

which is not true: even when I run nvidia-smi through the very same image, it reports all four GPUs as free. The same error occurs on both the P100 and V100 nodes. My workstation isn’t much different from the cluster environment; both run CentOS 7 with the same driver version:
NVIDIA-SMI 396.26 Driver Version: 396.26

Any hints are most welcome. I can send the full output if necessary. The error happens straight after

Running CPU instructions in double precision.

  • On host gpu03.prv.davros.compute.estate: free scratch space = 539 Gb.
    Copying particles to scratch directory: /scratch/tmp.14368/relion_volatile/
    1.30/1.30 min …(,_,">
    Estimating initial noise spectra
    58/ 58 sec …(,_,">
    CurrentResolution= 60.2998 Angstroms, which requires orientationSampling of at least 18.9474 degrees for a particle of diameter 360 Angstroms
    Oversampling= 0 NrHiddenVariableSamplingPoints= 580608
    OrientationalSampling= 15 NrOrientations= 4608
    TranslationalSampling= 2 NrTranslations= 21
    =============================
    Oversampling= 1 NrHiddenVariableSamplingPoints= 18579456
    OrientationalSampling= 7.5 NrOrientations= 36864
    TranslationalSampling= 1 NrTranslations= 84
    =============================

Thanks in advance!

The issue appears to be related to the Compute Mode setting of the cards. The workstation cards were set to Default, while the HPC cluster cards were set to Exclusive_Process. Changing the cluster cards to Default resolved the problem.
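
For anyone hitting the same error, here is a minimal sketch of how the compute mode could be checked on a node before launching a job. It assumes nvidia-smi is on the PATH and relies on its compute_mode query field; the helper names (parse_compute_modes, non_default_gpus, query_compute_modes) are made up for illustration and are not part of Relion or Singularity. In Exclusive_Process mode only one process at a time may hold a context on a given GPU, so any further process that tries to use that card is refused.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: index, name, compute mode.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,compute_mode",
    "--format=csv,noheader",
]


def parse_compute_modes(csv_text):
    """Parse 'index, name, compute_mode' rows into (index, name, mode) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = [field.strip() for field in line.split(",")]
        # First field is the index, last is the mode; anything between is the name.
        index, mode = parts[0], parts[-1]
        name = ", ".join(parts[1:-1])
        rows.append((int(index), name, mode))
    return rows


def non_default_gpus(rows):
    """Return the rows whose compute mode is anything other than Default."""
    return [row for row in rows if row[2] != "Default"]


def query_compute_modes():
    """Run nvidia-smi on the current node and parse its output."""
    result = subprocess.run(QUERY, check=True, capture_output=True, text=True)
    return parse_compute_modes(result.stdout)
```

Calling non_default_gpus(query_compute_modes()) on a cluster node would list the cards that are not in Default mode. Switching a card back is normally done with nvidia-smi -c DEFAULT, which requires root, so on a shared cluster it is probably a request for the administrators rather than something to change yourself.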

Hi, I’m new to Singularity. I built the same image with singularity build --name my_relion.simg docker://nvcr.io/hpc/relion:3.1.0

I tried ./my_relion.simg and singularity run --nv my_relion.simg, but it says the relion command is not found. I would appreciate your input!