Struggling to produce identical results between Parabricks fq2bam (with apptainer) and bwa and gatk

Hi,

I’ve been testing Parabricks on our local HPC (with A30s) and comparing it to bwa and gatk following the instructions in the links below:
https://docs.nvidia.com/clara/parabricks/4.0.0/Documentation/ToolDocs/man_fq2bam.html#man-fq2bam

I’ve been using the example dataset recommended by the tutorial.

Since I’m testing on an HPC environment, I’ve been pulling the Parabricks container with apptainer 1.1.0:

apptainer pull docker://nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1

the Slurm run script I’m using is:

#!/bin/bash
#SBATCH --mem=125G
#SBATCH --time=10
#SBATCH --qos=bonus
#SBATCH --partition=gpuq
#SBATCH --cpus-per-task=24
#SBATCH --job-name=fq2bam-gpu
#SBATCH --output=%x-%j.o
#SBATCH --error=%x-%j.e
#SBATCH --gres=gpu:A30:1

# setting up input and output files
in_dir=/vast/scratch/users/yang.e/gpu-aligners/parabricks/parabricks_sample_rearranged
fq1_file_name=sample_1.fq.gz
fq2_file_name=sample_2.fq.gz
known_sites_file_name=Homo_sapiens_assembly38.known_indels.vcf.gz
ref_file_name=Homo_sapiens_assembly38.fasta

out_dir=`pwd -P`
out_recal_file_name=recal-gpu.txt
out_bam_file_name=mark_dups_gpu.bam

apptainer run -W ${in_dir} --nv -B ${in_dir} -B ${out_dir} \
        clara-parabricks_4.0.0-1.sif \
        pbrun fq2bam \
        --ref ${in_dir}/${ref_file_name} \
        --in-fq ${in_dir}/${fq1_file_name} ${in_dir}/${fq2_file_name} \
        --knownSites ${in_dir}/${known_sites_file_name} \
        --out-bam ${out_dir}/${out_bam_file_name} \
        --out-recal-file ${out_dir}/${out_recal_file_name}

If I compare the final bam files with du, I get:

$ du -h fq2bam-gpu/mark_dups_gpu.bam
4.5G	fq2bam-gpu/mark_dups_gpu.bam

$ du -h fq2bam-cpu/mark_dups_cpu.bam 
5.1G	fq2bam-cpu/mark_dups_cpu.bam

and the recommended bam diff command doesn’t return anything.

Comparing the bqsr reports:

$ diff fq2bam-cpu/recal-cpu fq2bam-gpu/recal-gpu.txt 
38,39c38,39
<           12     9300111              12
<           13   100360144              13
---
>           12     9301228              12
>           13   100362463              13
49c49
<           23    81759155              23
---
>           23    81761883              23
51,55c51,55
<           25     3934555              25
<           26    45391671              26
<           27    60802037              27
<           28   547404255              28
<           29  3750950052              29
---
>           25     3934663              25
>           26    45392934              26
>           27    60803126              27
>           28   547415841              28
>           29  3751034096              29
123,124c123,124
< ReadGroup   EventType  EmpiricalQuality  EstimatedQReported  Observations  Errors     
< sample_rg1  M                   26.0000             30.7691    4599901980  11657780.00
---
> ReadGroup    EventType  EmpiricalQuality  EstimatedQReported  Observations  Errors     
> HK3TJBCX2.1  M                   26.0000             30.7690    4600006234  11658934.00
128,144c128,144
< ReadGroup   QualityScore  EventType  EmpiricalQuality  Observations  Errors    
< sample_rg1            14  M                   12.0000       9300111   629474.00
< sample_rg1            15  M                   13.0000      12782602   613091.00
< sample_rg1            16  M                   13.0000      87577542  4103975.00
< sample_rg1            27  M                   23.0000      81759155   391842.00
< sample_rg1            28  M                   25.0000       3934555    12017.00
...truncated...
---
> ReadGroup    QualityScore  EventType  EmpiricalQuality  Observations  Errors    
> HK3TJBCX2.1            14  M                   12.0000       9301228   629500.00
> HK3TJBCX2.1            15  M                   13.0000      12782950   613134.00
> HK3TJBCX2.1            16  M                   13.0000      87579513  4104091.00
> HK3TJBCX2.1            27  M                   23.0000      81761883   391871.00
> HK3TJBCX2.1            28  M                   25.0000       3934663    12021.00
...truncated...
148,3642c148,3642
< ReadGroup   QualityScore  CovariateValue  CovariateName  EventType  EmpiricalQuality  Observations  Errors   
< sample_rg1            14  -1              Cycle          M                   17.0000          2729       2.00
< sample_rg1            14  -10             Cycle          M                   12.0000         45153    2661.00
< sample_rg1            14  -100            Cycle          M                   12.0000         75975    5128.00
< sample_rg1            14  -101            Cycle          M                   12.0000         77372    5137.00
< sample_rg1            14  -102            Cycle          M                   13.0000         81402    4117.00
... truncated ...
---
> ReadGroup    QualityScore  CovariateValue  CovariateName  EventType  EmpiricalQuality  Observations  Errors   
> HK3TJBCX2.1            14  -1              Cycle          M                   17.0000          2729       2.00
> HK3TJBCX2.1            14  -10             Cycle          M                   12.0000         45155    2661.00
> HK3TJBCX2.1            14  -100            Cycle          M                   12.0000         75981    5129.00
> HK3TJBCX2.1            14  -101            Cycle          M                   12.0000         77375    5137.00
> HK3TJBCX2.1            14  -102            Cycle          M                   13.0000         81409    4117.00
... truncated ...

The final results are similar, but not identical as claimed on the documentation. Is there something that I’m doing wrong?

Hey @edward_yang_125,

You have done everything right. There might be slight differences based on software versions. Parabricks 4.0.0-1 is compatible with:

BWA mem: 0.7.15
Picard: 2.18.25

What did you use to run your CPU comparison?

Thanks