I would like to run NVIDIA Parabricks. I am mostly interested in the alignment and variant calling speed improvements. I can’t figure out whether my GPU meets the minimum requirements. It’s a Tesla K40c. Which CUDA architecture does it support?
The following are required to install Parabricks:
- Access to the internet (yes)
- nvidia-driver that supports cuda-9.0 or higher (yes)
- nvidia-driver that supports cuda-10.0 or higher if you want to run deepvariant or cnnscorevariants (yes)
- nvidia-docker or singularity version 2.6.1 or higher (yes)
- Python 2.7 (Most Linux systems will already have this installed) (yes)
- curl (Most Linux systems will already have this installed) (yes)
The following are the hardware requirements (this is the part I’m unsure about; my quick check is pasted after the list):
- Run on any GPU that supports CUDA architecture 60, 61, 70, 75 and has 12GB GPU RAM or more. It has been tested on NVIDIA P100, NVIDIA V100, and NVIDIA T4 GPUs.
- 1 GPU server should have 64GB CPU RAM, at least 16 CPU threads
- 2 GPU server should have 100GB CPU RAM, at least 24 CPU threads
- 4 GPU server should have 196GB CPU RAM, at least 32 CPU threads
- 8 GPU server should have 392GB CPU RAM, at least 48 CPU threads
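To sanity-check my own machine against that list, I put together a small Python sketch. It assumes the nvidia-smi binary that ships with the NVIDIA driver is on the PATH; the query fields used are standard nvidia-smi fields.

```python
import shutil
import subprocess

# Minimal sketch: ask the driver (via nvidia-smi) for each GPU's name and memory,
# then compare the memory against the 12 GB minimum from the requirements above.
if shutil.which("nvidia-smi") is None:
    raise SystemExit("nvidia-smi not found; is the NVIDIA driver installed?")

out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.strip().splitlines():
    name, mem = [field.strip() for field in line.split(",")]
    mem_gb = float(mem.split()[0]) / 1024  # memory.total is reported in MiB
    status = "OK" if mem_gb >= 12 else "below the 12 GB minimum"
    print(f"{name}: {mem_gb:.1f} GB GPU RAM ({status})")
```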
Another important point: I am running Windows 10 64-bit Enterprise.
I want to do CNV calling and run the germline pipeline from fastq to bam + vcf on whole-genome sequencing data. My research is in rare developmental disorders and I have trio data. My current CNV calling pipeline takes many days to complete; I want to be able to do it in less than an hour.
Thank you
Thank you for your interest in Parabricks. The K40 is a Kepler-architecture GPU (Compute Capability 3.5), and unfortunately that doesn’t meet the minimum Compute Capability (CUDA architecture) requirement. Only Pascal (e.g. P100), Volta (e.g. V100), and Turing (e.g. T4) are supported architectures.
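If it helps, here is a quick way to see where a given card falls. The mapping below is just the published Compute Capability of each GPU family; the 6.0 cutoff corresponds to the “CUDA architecture 60” line in the requirements.

```python
# Sketch: compare a few GPU families' Compute Capability against the
# Parabricks minimum of 6.0 ("CUDA architecture 60" in the requirements).
MIN_COMPUTE_CAPABILITY = 6.0

COMPUTE_CAPABILITY = {
    "Kepler (K40)": 3.5,
    "Maxwell (M60)": 5.2,
    "Pascal (P100)": 6.0,
    "Volta (V100)": 7.0,
    "Turing (T4)": 7.5,
}

for family, cc in COMPUTE_CAPABILITY.items():
    verdict = "supported" if cc >= MIN_COMPUTE_CAPABILITY else "not supported"
    print(f"{family}: Compute Capability {cc} -> {verdict}")
```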
Do I have any other options for running the same tools on a GPU?
Unfortunately you only really have two options as far as I know:
- buy compatible GPUs, so P100, V100, T4, or I’m assuming A100 now as well, or
- use GPU-enabled cloud instances (e.g. via AWS).
I want to do CNV calling and run the germline pipeline from fastq to bam + vcf on whole-genome sequencing data. My research is in rare developmental disorders and I have trio data. My current CNV calling pipeline takes many days to complete; I want to be able to do it in less than an hour.
We have the same use case and are using Parabricks on 2x Tesla V100 cards. Our time to take a 30X whole-genome sample from fastq to vcf via the germline pipeline is around 2 hours, which is a great speed-up from the 20+ hours it takes on ~36 CPU cores. We’re finding it really nice to use for our clinical exome testing, being able to take 100X coverage exomes from fastq to vcf in under 10 minutes.
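In case it’s useful, this is roughly how we drive it, wrapped in a small Python script. The paths are placeholders and the flag names are from memory, so check pbrun germline --help on your Parabricks version before relying on them.

```python
import subprocess

# Rough sketch of our germline invocation: fastq pair in, BAM + VCF out.
# Paths are placeholders; verify flag names with `pbrun germline --help`.
cmd = [
    "pbrun", "germline",
    "--ref", "Homo_sapiens_assembly38.fasta",              # reference genome
    "--in-fq", "sample_R1.fastq.gz", "sample_R2.fastq.gz", # paired-end reads
    "--out-bam", "sample.bam",                             # aligned BAM
    "--out-variants", "sample.vcf",                        # germline variant calls
]
subprocess.run(cmd, check=True)
```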
If you can make a business case to purchase GPUs and are constantly running pipelines that would benefit, I think it makes sense to invest in the hardware (I’m trying to get work to invest in more GPUs). However, if you aren’t running lots of samples through, then the cloud-based approach might be worth a look?
Just my two cents.
Can we get any word from NVIDIA on whether GeForce cards could be used in Parabricks pipelines? The newly released 3070/3080/3090 cards seem more than capable and are far cheaper than their Tesla/A100 counterparts. The 3090 in fact has double the CUDA cores of one of our V100s, yet it costs a small fraction of the price.
Is there any way to run on the GPU with Docker Desktop (Windows 10)?
Hello all,
As we know, AWS is too expensive for many of us to afford. However, we have an affordable option in Google Colab Pro, which also provides high-end GPUs.
I need to know from the NVIDIA people whether it is possible to execute the Parabricks pipeline on Colab. Are the GPUs in Colab Pro compatible with the pipeline?
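For what it’s worth, this is how I check which GPU Colab has assigned to a session (a quick sketch; it assumes nvidia-smi is available on the runtime’s PATH, which it is on Colab GPU runtimes as far as I can tell):

```python
import subprocess

# Print the name, total memory, and driver version of the GPU Colab assigned.
# nvidia-smi ships with the NVIDIA driver and is present on Colab GPU runtimes.
print(subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total,driver_version", "--format=csv"],
    capture_output=True, text=True,
).stdout)
```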
Thanks