EDIT: Please use the nvcr.io/nvidia/clara/clara-parabricks:4.6.0-2 container image; with it, the workarounds below are no longer needed.
Thanks for the details and the log. This is not caused by unified memory, MIG, or a CUDA/driver mismatch.
Unfortunately, it is a known issue that DGX Spark does not report the amount of GPU memory through nvidia-smi (see Known Issues in the DGX Spark User Guide). This interferes with a new automatic configuration feature that we enabled by default in NVIDIA Parabricks v4.6.0. We will handle this case more gracefully in a future release.
In the meantime, you can bypass the automatic configuration by explicitly setting the parameters whose default value is auto. I have verified that the following configurations for fq2bam, deepvariant, and haplotypecaller are valid and perform well. Depending on your specific use case, different parameter values may perform better, as we did not perform an exhaustive search for the best configuration.
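The commands below reference shell variables for your input and output files. A minimal setup sketch follows; every path and filename here is a placeholder for your own data, and note that docker only sees host files that are mounted into the container, so each docker run command needs a -v flag for your data directory:

```shell
# Hypothetical setup -- all paths and filenames are placeholders; adjust for your system.
DATA_DIR=/data/parabricks
REFERENCE=${DATA_DIR}/Homo_sapiens_assembly38.fasta   # reference FASTA
FQ1=${DATA_DIR}/sample_R1.fastq.gz                    # paired-end reads, mate 1
FQ2=${DATA_DIR}/sample_R2.fastq.gz                    # paired-end reads, mate 2
outputfile=${DATA_DIR}/sample.bam                     # fq2bam output BAM
IN_BAM=${DATA_DIR}/sample.bam                         # input BAM for the variant callers
OUT=${DATA_DIR}/sample                                # output prefix for VCFs

# Mount the data directory so the container can read and write these files, e.g.:
#   docker run --rm --runtime=nvidia --gpus all -v ${DATA_DIR}:${DATA_DIR} \
#       nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 pbrun ...
echo "${REFERENCE}"
```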
# fq2bam
docker run --rm --runtime=nvidia --gpus all \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun fq2bam \
        --ref ${REFERENCE} \
        --in-fq ${FQ1} ${FQ2} \
        --out-bam ${outputfile} \
        --bwa-nstreams 3 \
        --bwa-primary-cpus 16 \
        --bwa-cpu-thread-pool 1 \
        --gpusort
# deepvariant (`--use-tf32` trades a slight loss of accuracy for faster runtime; remove it for best accuracy)
docker run --rm --runtime=nvidia --gpus all \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun deepvariant \
        --ref ${REFERENCE} \
        --in-bam ${IN_BAM} \
        --out-variants ${OUT}.vcf \
        --num-streams-per-gpu 4 \
        --use-tf32
# haplotypecaller
docker run --rm --runtime=nvidia --gpus all \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun haplotypecaller \
        --ref ${REFERENCE} \
        --in-bam ${IN_BAM} \
        --out-variants ${OUT}.vcf \
        --num-htvc-threads 8