cudaSafeCall() invalid device symbol error

Hi,

I am using Clara Parabricks v3.5.0 (with a trial license) on an HPC system running PBS Pro, and pbrun fq2bam fails with a cudaSafeCall() "invalid device symbol" error. The system admin recommended I ask the developers for advice. The PBS script, stderr, and stdout are below.

The PBS script:

#!/bin/bash

#PBS -P project
#PBS -N pbrun
#PBS -l walltime=05:00:00
#PBS -l ncpus=12
#PBS -l ngpus=1
#PBS -l mem=96GB
#PBS -q gpuvolta
#PBS -W umask=022
#PBS -l wd
#PBS -o ../Logs/pbrun_fq2bam.o
#PBS -e ../Logs/pbrun_fq2bam.e
#PBS -lstorage=scratch/project

set -e

module load singularity

# Simple test without RG first
# Can use GATK interval files

ref=../Reference/hs38DH.fasta
sampleid=NA12877
fq1=../Fastq/ERR194146_1.fastq.gz
fq2=../Fastq/ERR194146_2.fastq.gz
thousandG_indels=../Reference/Known_vars/Homo_sapiens_assembly38.known_indels.vcf
gold_standard_indels=../Reference/Known_vars/Mills_and_1000G_gold_standard.indels.hg38.vcf
dbsnp=../Reference/Known_vars/Homo_sapiens_assembly38.dbsnp138.vcf
outdir=../pbrun_fq2bam
outbam=${outdir}/${sampleid}.rmdup.recal.bam
outrecal=${outdir}/${sampleid}.recal_data.table

mkdir -p "${outdir}"

# Record how many GPUs PBS allocated to this job
echo "PBS_NGPUS ${PBS_NGPUS}"

/g/data/project/containers/parabricks/parabricks_v3.5.0/pbrun fq2bam \
        --ref ${ref} \
        --in-fq ${fq1} ${fq2} \
        --knownSites ${thousandG_indels} \
        --knownSites ${gold_standard_indels} \
        --knownSites ${dbsnp} \
        --out-bam ${outbam} \
        --out-recal-file ${outrecal} \
        --num-gpus ${PBS_NGPUS} \
        --license-file /g/data/wz54/containers/parabricks/parabricks_v3.5.0/license.bin \
        --tmp-dir ${outdir}
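
A preflight check placed before the pbrun call would record which GPU and driver the job actually sees, which may help when diagnosing this kind of error. This is a minimal sketch and not part of the script above: the function name report_gpus is hypothetical, and it assumes nvidia-smi is on the PATH of the compute node (it prints a message instead if it is not).

```shell
#!/bin/bash
# Hypothetical preflight helper: print the GPUs visible to this job.
report_gpus() {
    # Show which devices the scheduler exposed to the job, if any
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One CSV line per visible GPU: index, model, driver version, total memory
        nvidia-smi --query-gpu=index,name,driver_version,memory.total \
                   --format=csv,noheader \
            || echo "nvidia-smi could not query the GPU"
    else
        echo "nvidia-smi not found on PATH"
    fi
}

report_gpus
```

The output lands in the PBS stdout file next to the PBS_NGPUS line, so the GPU model and driver version can be included when reporting the failure.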

The stderr:

------------------------------------------------------------------------------
||                 Parabricks accelerated Genomics Pipeline                 ||
||                              Version v3.5.0                              ||
||                       GPU-BWA mem, Sorting Phase-I                       ||
||                  Contact: Parabricks-Support@nvidia.com                  ||
------------------------------------------------------------------------------
[M::bwa_idx_load_from_disk] read 3171 ALT contigs

GPU-BWA mem
ProgressMeter   Reads           Base Pairs Aligned
cudaSafeCall() failed at ParaBricks/src/mem_chain_kernel.cu:137 : invalid device symbol
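
When comparing several runs, the failing CUDA call can be pulled out of a stderr file with a short filter. This is a minimal sketch that assumes the message format shown above; the function name find_cuda_failures is hypothetical, and the log path is the one set by #PBS -e in the script.

```shell
#!/bin/bash
# Hypothetical helper: print "source:line | message" for every
# cudaSafeCall failure found in the given log file.
find_cuda_failures() {
    sed -n 's/^cudaSafeCall() failed at \(.*\) : \(.*\)$/\1 | \2/p' "$1"
}

# Usage: only scan the log if the job has produced one
log=../Logs/pbrun_fq2bam.e
if [ -f "$log" ]; then
    find_cuda_failures "$log"
fi
```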

stdout:

[tc6463@gadi-login-09 Logs]$ cat pbrun_fq2bam.o
PBS_NGPUS 1
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation


[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /scratch/er01/PlatinumGenomes/Fastq/ERR194146_1.fastq.gz and /scratch/er01/PlatinumGenomes/Fastq/ERR194146_2.fastq.gz
[Parabricks Options Mesg]: @RG\tID:ERR194146.1.1\tLB:lib1\tPL:bar\tSM:sample\tPU:ERR194146.1.1
Please contact Parabricks-Support@nvidia.com for any questions
There is a forum for Q&A as well at https://forums.developer.nvidia.com/c/healthcare/Parabricks/290
Exiting...
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation



Could not run fq2bam
Exiting pbrun ...

======================================================================================
                  Resource Usage on 2021-04-01 14:57:46:
   Job Id:             20063468.gadi-pbs
   Project:            er01
   Exit Status:        255
   Service Units:      0.21
   NCPUs Requested:    12                     NCPUs Used: 12
                                           CPU Time Used: 00:00:32
   Memory Requested:   96.0GB                Memory Used: 15.02GB
   Walltime requested: 05:00:00            Walltime Used: 00:00:21
   JobFS requested:    100.0MB                JobFS used: 0B
======================================================================================