Hi,
I’ve been testing Parabricks on our local HPC (with A30s) and comparing it against bwa and gatk, following the instructions in the link below:
https://docs.nvidia.com/clara/parabricks/4.0.0/Documentation/ToolDocs/man_fq2bam.html#man-fq2bam
I’ve been using the example dataset recommended by the tutorial.
Since I’m testing in an HPC environment, I’ve been pulling the Parabricks container with apptainer 1.1.0:
apptainer pull docker://nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1
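As a quick sanity check that the container can actually see the A30 (this relies on --nv binding in the host’s nvidia-smi, which I believe it does by default):
apptainer exec --nv clara-parabricks_4.0.0-1.sif nvidia-smi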
The Slurm run script I’m using is:
#!/bin/bash
#SBATCH --mem=125G
#SBATCH --time=10
#SBATCH --qos=bonus
#SBATCH --partition=gpuq
#SBATCH --cpus-per-task=24
#SBATCH --job-name=fq2bam-gpu
#SBATCH --output=%x-%j.o
#SBATCH --error=%x-%j.e
#SBATCH --gres=gpu:A30:1
# setting up input and output files
in_dir=/vast/scratch/users/yang.e/gpu-aligners/parabricks/parabricks_sample_rearranged
fq1_file_name=sample_1.fq.gz
fq2_file_name=sample_2.fq.gz
known_sites_file_name=Homo_sapiens_assembly38.known_indels.vcf.gz
ref_file_name=Homo_sapiens_assembly38.fasta
out_dir=$(pwd -P)
out_recal_file_name=recal-gpu.txt
out_bam_file_name=mark_dups_gpu.bam
# run fq2bam inside the Parabricks container, binding the input and output directories
apptainer run -W ${in_dir} --nv -B ${in_dir} -B ${out_dir} \
clara-parabricks_4.0.0-1.sif \
pbrun fq2bam \
--ref ${in_dir}/${ref_file_name} \
--in-fq ${in_dir}/${fq1_file_name} ${in_dir}/${fq2_file_name} \
--knownSites ${in_dir}/${known_sites_file_name} \
--out-bam ${out_dir}/${out_bam_file_name} \
--out-recal-file ${out_dir}/${out_recal_file_name}
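One thing I haven’t pinned down in this comparison is the read group: the documented bwa mem command sets an explicit @RG (which is where the sample_rg1 in the recal diff further down comes from), whereas here I let fq2bam derive one from the FASTQ headers (hence HK3TJBCX2.1). If I’m reading the fq2bam docs right, an optional read-group string can follow the two FASTQ paths, roughly as sketched below — the @RG values are just the ones from the tutorial’s CPU commands, so treat the exact string as my assumption:
pbrun fq2bam \
    --ref ${in_dir}/${ref_file_name} \
    --in-fq ${in_dir}/${fq1_file_name} ${in_dir}/${fq2_file_name} \
        "@RG\tID:sample_rg1\tLB:lib1\tPL:bar\tSM:sample\tPU:sample_rg1" \
    --knownSites ${in_dir}/${known_sites_file_name} \
    --out-bam ${out_dir}/${out_bam_file_name} \
    --out-recal-file ${out_dir}/${out_recal_file_name}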
If I compare the final BAM files with du, I get:
$ du -h fq2bam-gpu/mark_dups_gpu.bam
4.5G fq2bam-gpu/mark_dups_gpu.bam
$ du -h fq2bam-cpu/mark_dups_cpu.bam
5.1G fq2bam-cpu/mark_dups_cpu.bam
The bam diff command recommended in the documentation, on the other hand, doesn’t report any differences between the two.
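My assumption is that the du gap is just down to the BGZF compression level each writer uses, since the compressed size says little about content. A quick way to check that the records themselves line up would be something like the following (a sketch, assuming samtools is available on the node):
# total record count in each output
samtools view -c fq2bam-cpu/mark_dups_cpu.bam
samtools view -c fq2bam-gpu/mark_dups_gpu.bam
# mapped / duplicate / properly-paired summaries for a higher-level comparison
samtools flagstat fq2bam-cpu/mark_dups_cpu.bam
samtools flagstat fq2bam-gpu/mark_dups_gpu.bam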
Comparing the BQSR reports:
$ diff fq2bam-cpu/recal-cpu fq2bam-gpu/recal-gpu.txt
38,39c38,39
< 12 9300111 12
< 13 100360144 13
---
> 12 9301228 12
> 13 100362463 13
49c49
< 23 81759155 23
---
> 23 81761883 23
51,55c51,55
< 25 3934555 25
< 26 45391671 26
< 27 60802037 27
< 28 547404255 28
< 29 3750950052 29
---
> 25 3934663 25
> 26 45392934 26
> 27 60803126 27
> 28 547415841 28
> 29 3751034096 29
123,124c123,124
< ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors
< sample_rg1 M 26.0000 30.7691 4599901980 11657780.00
---
> ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors
> HK3TJBCX2.1 M 26.0000 30.7690 4600006234 11658934.00
128,144c128,144
< ReadGroup QualityScore EventType EmpiricalQuality Observations Errors
< sample_rg1 14 M 12.0000 9300111 629474.00
< sample_rg1 15 M 13.0000 12782602 613091.00
< sample_rg1 16 M 13.0000 87577542 4103975.00
< sample_rg1 27 M 23.0000 81759155 391842.00
< sample_rg1 28 M 25.0000 3934555 12017.00
...truncated...
---
> ReadGroup QualityScore EventType EmpiricalQuality Observations Errors
> HK3TJBCX2.1 14 M 12.0000 9301228 629500.00
> HK3TJBCX2.1 15 M 13.0000 12782950 613134.00
> HK3TJBCX2.1 16 M 13.0000 87579513 4104091.00
> HK3TJBCX2.1 27 M 23.0000 81761883 391871.00
> HK3TJBCX2.1 28 M 25.0000 3934663 12021.00
...truncated...
148,3642c148,3642
< ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors
< sample_rg1 14 -1 Cycle M 17.0000 2729 2.00
< sample_rg1 14 -10 Cycle M 12.0000 45153 2661.00
< sample_rg1 14 -100 Cycle M 12.0000 75975 5128.00
< sample_rg1 14 -101 Cycle M 12.0000 77372 5137.00
< sample_rg1 14 -102 Cycle M 13.0000 81402 4117.00
... truncated ...
---
> ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors
> HK3TJBCX2.1 14 -1 Cycle M 17.0000 2729 2.00
> HK3TJBCX2.1 14 -10 Cycle M 12.0000 45155 2661.00
> HK3TJBCX2.1 14 -100 Cycle M 12.0000 75981 5129.00
> HK3TJBCX2.1 14 -101 Cycle M 12.0000 77375 5137.00
> HK3TJBCX2.1 14 -102 Cycle M 13.0000 81409 4117.00
... truncated ...
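To make sure the read-group label itself isn’t part of what diff is flagging, the two reports could also be compared with the labels normalised first (a rough sketch; tr -s collapses the column padding, which would otherwise differ once strings of different lengths are substituted):
diff <(sed 's/sample_rg1/RG/g' fq2bam-cpu/recal-cpu | tr -s ' ') \
     <(sed 's/HK3TJBCX2\.1/RG/g' fq2bam-gpu/recal-gpu.txt | tr -s ' ')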
The final results are similar, but not identical, even though the documentation claims they should match. Is there something I’m doing wrong?