Error with clara-parabricks:4.0.1-1 fq2bam - Bad argument value: Number of GPUs requested is more than number of gpus in system

avenkatraman · June 4, 2023, 10:56pm

Hi

I am trying out clara-parabricks:4.0.1-1.sif on a g4dn.metal EC2 instance, which has 8 GPUs.

This is how I created my sif file:

singularity build clara-parabricks_4.0.1-1.sif   docker://nvcr.io/nvidia/clara/clara-parabricks:4.0.1-1

And I get this error when trying to run fq2bam:

Bad argument value: Number of GPUs requested (8) is more than number of GPUs (0in the system., exiting.

This is the nvidia-smi output on the host:

nvidia-smi
Sun Jun  4 18:34:55 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:18:00.0 Off |                    0 |
| N/A   34C    P0    25W /  70W |  13458MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:19:00.0 Off |                    0 |
| N/A   29C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:35:00.0 Off |                    0 |
| N/A   30C    P8    11W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:36:00.0 Off |                    0 |
| N/A   28C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla T4            Off  | 00000000:E7:00.0 Off |                    0 |
| N/A   29C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla T4            Off  | 00000000:E8:00.0 Off |                    0 |
| N/A   29C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla T4            Off  | 00000000:F4:00.0 Off |                    0 |
| N/A   29C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla T4            Off  | 00000000:F5:00.0 Off |                    0 |
| N/A   29C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7387      C   python                          13455MiB |
+-----------------------------------------------------------------------------+

The output below is from inside the container, after starting it on the command line with singularity shell --nv clara-parabricks_4.0.1-1.sif

Singularity> date
Sun Jun  4 18:38:42 EDT 2023

Singularity> nvidia-smi
Sun Jun  4 18:39:50 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:18:00.0 Off |                    0 |
| N/A   42C    P0    26W /  70W |  13458MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:19:00.0 Off |                    0 |
| N/A   31C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:35:00.0 Off |                    0 |
| N/A   32C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:36:00.0 Off |                    0 |
| N/A   30C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla T4            Off  | 00000000:E7:00.0 Off |                    0 |
| N/A   31C    P8     8W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla T4            Off  | 00000000:E8:00.0 Off |                    0 |
| N/A   31C    P8    11W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla T4            Off  | 00000000:F4:00.0 Off |                    0 |
| N/A   32C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla T4            Off  | 00000000:F5:00.0 Off |                    0 |
| N/A   31C    P8     9W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7387      C   python                          13455MiB |
+-----------------------------------------------------------------------------+

pbrun fq2bam \
> --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
> --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
> --out-bam fq2bam_output.bam

[PB Error 2023-Jun-04 18:42:55][ParaBricks/src/pbOpts.cu:132] Bad argument value: Number of GPUs requested (8) is more than number of GPUs (0in the system., exiting.

Would appreciate any help.

Thanks in advance.

daniel.amsel · June 6, 2023, 10:07am

Hi there,
I observed the same issue with Docker and version nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1.

Kind regards,
Daniel

mdemouth · June 6, 2023, 11:25am

Hello,

Sorry to hear you are having issues.
Can you please let me know whether you are able to run any other CUDA application and/or the CUDA samples, to make sure that this is not a driver issue?
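
One quick check in that spirit, as a minimal sketch rather than an official Parabricks diagnostic: nvidia-smi talks to the driver directly and ignores the CUDA_VISIBLE_DEVICES environment variable, so a CUDA application can see zero GPUs while nvidia-smi still lists all of them. The helper below (the function name is made up for illustration) reports how that variable is set in the current shell. It does not rule out a driver or CUDA installation problem.

```shell
# Hypothetical helper: describe the state of CUDA_VISIBLE_DEVICES.
# A set-but-empty value hides every GPU from CUDA applications,
# even though nvidia-smi still lists them.
describe_cuda_visible_devices() {
    if [ -z "${CUDA_VISIBLE_DEVICES+x}" ]; then
        echo "unset: no restriction from this variable"
    elif [ -z "${CUDA_VISIBLE_DEVICES}" ]; then
        echo "empty: CUDA applications see no GPUs"
    else
        echo "restricted to: ${CUDA_VISIBLE_DEVICES}"
    fi
}

describe_cuda_visible_devices
```

Running it both on the host and inside the Singularity shell shows whether the container environment changes what CUDA is allowed to see.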

daniel.amsel · June 6, 2023, 1:06pm

Hey,
I reinstalled CUDA 12. It seems that it had not been installed properly before.
Furthermore, I disabled the MIG service.
Now it works for me.

mdemouth · June 6, 2023, 1:34pm

Thank you for your reply.
Parabricks does not support MIG mode.

Best
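
For anyone who wants to check whether MIG mode is on before running pbrun, here is a minimal sketch. It assumes nvidia-smi is on the PATH; the function name is made up for illustration. GPUs without MIG support, such as the T4, report [N/A] and are not listed.

```shell
# Hypothetical helper: read "index, mig.mode.current" CSV rows on stdin
# and print the index of every GPU whose MIG mode is Enabled.
gpus_with_mig_enabled() {
    awk -F', *' '$2 == "Enabled" { print $1 }'
}

# Only query when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,mig.mode.current --format=csv,noheader \
        | gpus_with_mig_enabled
fi

# To turn MIG off on GPU 0 (needs root; a GPU reset or reboot may be required):
#   sudo nvidia-smi -i 0 -mig 0
```

No output means no GPU reported MIG as enabled.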
