[Parabricks3.7][A100] cudaSafeCall() failed at ParaBricks/src/mem_chain_kernel.cu/136: invalid device symbol

Hi there,

I am running pbrun on a supernode with an A100 and get the following error. How can we resolve it?

# pbrun germline \
     --ref /workspace/datasets/ref/Homo_sapiens_assembly38.fasta \
     --in-fq \
         /workspace/datasets/germline/sample_1.fq.gz  \
         /workspace/datasets/germline/sample_2.fq.gz  \
     --knownSites /workspace/datasets/ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
     --out-bam output/output.bam \
     --out-variants output/output.vcf \
     --out-recal-file output/report.txt

Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation


[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /workspace/datasets/germline/sample_1.fq.gz and
/workspace/datasets/germline/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1


[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Read group created for /workspace/datasets/germline/sample_1.fq.gz and
/workspace/datasets/germline/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
[PB Info 2022-May-04 16:38:19] Logger not initialized! 
[PB Info 2022-May-04 16:38:19] ------------------------------------------------------------------------------
[PB Info 2022-May-04 16:38:19] ||                 Parabricks accelerated Genomics Pipeline                 ||
[PB Info 2022-May-04 16:38:19] ||                              Version 3.7.0-1                             ||
[PB Info 2022-May-04 16:38:19] ||                       GPU-BWA mem, Sorting Phase-I                       ||
[PB Info 2022-May-04 16:38:19] ||                  Contact: Parabricks-Support@nvidia.com                  ||
[PB Info 2022-May-04 16:38:19] ------------------------------------------------------------------------------
[PB Info 2022-May-04 16:38:20] Logger already initialized, continuing with current settings.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[PB Info 2022-May-04 16:38:25] GPU-BWA mem
[PB Info 2022-May-04 16:38:25] ProgressMeter    Reads           Base Pairs Aligned
[PB Warning 2022-May-04 16:38:28][ParaBricks/src/check_error.cu:41] cudaSafeCall() failed at ParaBricks/src/mem_chain_kernel.cu/136: invalid device symbol
[PB Error 2022-May-04 16:38:28][ParaBricks/src/check_error.cu:44] No GPUs active, shutting down due to previous error., exiting.
For technical support visit https://docs.nvidia.com/clara/parabricks/3.7.0/index.html#how-to-get-help
Exiting...

Could not run fq2bam as part of germline pipeline
Exiting pbrun ...

The error is:

[ParaBricks/src/check_error.cu:41] cudaSafeCall() failed at ParaBricks/src/mem_chain_kernel.cu/136: invalid device symbol

OS

Based on the Docker image nvidia/cuda:11.2.0-cudnn8-devel-ubuntu18.04


Hardware

  • GPU: A100 x1
  • CPU: 24 vCPU
  • CPU RAM: 100 GiB

GPU info

# nvidia-smi
Wed May  4 16:38:39 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:82:00.0 Off |                    0 |
| N/A   36C    P0    38W / 250W |      0MiB / 40536MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
# nvidia-smi -L
GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-e72a7afe-ae40-ff53-9e07-e9e2674d37e7)
  • MIG: disabled

Software

# pbrun version
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation

pbrun: 3.7.0-1

Hey @tj_tsai

Thanks for all the information. Parabricks has two installation paths: one for non-Ampere devices and one for Ampere devices. It looks like your installation is the non-Ampere build, but your hardware (A100) is Ampere. This is an easy fix: re-run your installer, this time with the --ampere flag.

Let me know if you still run into issues. Thanks!
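
For anyone unsure which of the two builds applies to their machine, here is a minimal sketch that guesses it from the GPU name. The function names `needs_ampere_build` and `check_local_gpus` are hypothetical helpers, not part of Parabricks, and the list of name patterns is an assumption that is not exhaustive. When in doubt, check your GPU's architecture in NVIDIA's documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess from a GPU product name whether the Ampere
# build is needed. The patterns below are an assumption and cover only
# some common Ampere product names.
needs_ampere_build() {
    case "$1" in
        *A100*|*A40*|*A30*|*A10*|*"RTX 30"*|*"RTX A"*) echo "ampere" ;;
        *) echo "non-ampere" ;;
    esac
}

# Hypothetical helper: list the local GPUs with the guessed build next to
# each one. Does nothing if nvidia-smi is not on the PATH.
check_local_gpus() {
    command -v nvidia-smi >/dev/null 2>&1 || return 0
    { nvidia-smi --query-gpu=name --format=csv,noheader || true; } |
        while IFS= read -r gpu_name; do
            printf '%s -> %s\n' "$gpu_name" "$(needs_ampere_build "$gpu_name")"
        done
}
```

Calling `check_local_gpus` on the machine in this thread should report the A100 as needing the Ampere build, which matches the advice above.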


Hey @gburnett,
We have solved it. Thank you for your help.

Solution:

Before: parabricks.deb
After: parabricks-ampere.deb
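
To confirm which package ended up installed, something like the following sketch may help. It assumes the installed Debian package names start with `parabricks` and `parabricks-ampere`, mirroring the .deb file names above. That is an assumption, so adjust the patterns if your package is named differently. `classify_parabricks_pkg` and `check_installed_parabricks` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read package names (one per line) from stdin and
# report which Parabricks build they indicate.
# Assumption: package names mirror the .deb file names in this thread.
classify_parabricks_pkg() {
    pb_pkgs=$(cat)
    if printf '%s\n' "$pb_pkgs" | grep -qi '^parabricks-ampere'; then
        echo "ampere"
    elif printf '%s\n' "$pb_pkgs" | grep -qi '^parabricks'; then
        echo "non-ampere"
    else
        echo "not-installed"
    fi
}

# Hypothetical helper: query dpkg for matching packages and classify them.
# Does nothing if dpkg-query is not available.
check_installed_parabricks() {
    command -v dpkg-query >/dev/null 2>&1 || return 0
    { dpkg-query -W -f='${Package}\n' 'parabricks*' 2>/dev/null || true; } |
        classify_parabricks_pkg
}
```

Running `check_installed_parabricks` before and after the reinstall should show the switch from the non-Ampere to the Ampere package, provided the naming assumption holds.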

About the installation

The following documentation neither mentions the Ampere version nor tells the user to install it on Ampere hardware:

Node Locked License, Debian Package Installation

We also did not notice the Ampere version on the following page, because the word "Ampere" is hidden from view there:
Nvidia Clara Parabricks Bare Metal Debian Package
