I have been comparing Parabricks DeepVariant (v4.4.0-1) with the standard DeepVariant (v1.6.1-gpu) and noticed some discrepancies in variant classification. Specifically, I observed that Parabricks DeepVariant tags some variants as RefCall when they seemingly should not be.
For example, consider the following variant at position chr7:72628972:
Output from Parabricks DeepVariant:
chr7 72628972 . A G 0.6 RefCall . GT:GQ:DP:AD:VAF:PL ./.:9:387:201,186:0.48062:0,8,17
Output from DeepVariant 1.6.1-gpu:
chr7 72628972 . A G 36.2 PASS . GT:GQ:DP:AD:VAF:PL 0/1:32:387:201,186:0.48062:36,0,33
Additionally, I checked this variant using Parabricks GATK, which produced the following record:
chr7 72628972 . A G 3775.64 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.856;DP=381;ExcessHet=0.0000;FS=20.149;MLEAC=1;MLEAF=0.500;MQ=52.77;MQRankSum=-14.382;QD=9.99;ReadPosRankSum=1.416;SOR=1.209 GT:AD:DP:GQ:PL:SB 0/1:199,179:378:99:3783,0,4733:95,104,110,69
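To make the difference concrete, the three records can be compared field by field. The following is a minimal Python sketch, not part of any of the tools: the helper names `parse_record` and `summarize` are my own, and it assumes biallelic records pasted as whitespace-separated text, with PL read in the standard VCF genotype order 0/0, 0/1, 1/1.

```python
# Sketch: compare FILTER, GT, and PL-implied genotype across the three callers.
# Assumes biallelic sites and whitespace-separated columns (as pasted above).

RECORDS = {
    "parabricks_deepvariant": "chr7 72628972 . A G 0.6 RefCall . GT:GQ:DP:AD:VAF:PL ./.:9:387:201,186:0.48062:0,8,17",
    "deepvariant_1.6.1": "chr7 72628972 . A G 36.2 PASS . GT:GQ:DP:AD:VAF:PL 0/1:32:387:201,186:0.48062:36,0,33",
    "parabricks_gatk": "chr7 72628972 . A G 3775.64 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.856;DP=381;ExcessHet=0.0000;FS=20.149;MLEAC=1;MLEAF=0.500;MQ=52.77;MQRankSum=-14.382;QD=9.99;ReadPosRankSum=1.416;SOR=1.209 GT:AD:DP:GQ:PL:SB 0/1:199,179:378:99:3783,0,4733:95,104,110,69",
}

# Genotype order of the PL field for a biallelic site.
BIALLELIC_GENOTYPES = ["0/0", "0/1", "1/1"]


def parse_record(line):
    """Split one VCF data line into its fixed columns plus a FORMAT -> value dict."""
    cols = line.split()
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    return {"qual": float(cols[5]), "filter": cols[6], "sample": sample}


def summarize(line):
    """Return the fields relevant to the RefCall question."""
    rec = parse_record(line)
    pl = [int(x) for x in rec["sample"]["PL"].split(",")]
    ref_depth, alt_depth = (int(x) for x in rec["sample"]["AD"].split(","))
    return {
        "filter": rec["filter"],
        "gt": rec["sample"]["GT"],
        "gq": int(rec["sample"]["GQ"]),
        # The genotype with the smallest PL is the most likely one.
        "pl_best_gt": BIALLELIC_GENOTYPES[pl.index(min(pl))],
        "alt_fraction": round(alt_depth / (ref_depth + alt_depth), 4),
    }


for caller, line in RECORDS.items():
    print(caller, summarize(line))
```

On these records, the Parabricks DeepVariant PL values (0,8,17) make 0/0 the most likely genotype, whereas both DeepVariant 1.6.1 and GATK favour 0/1, even though the allele depths (201,186) are identical between the two DeepVariant runs. So the disagreement appears to be in the genotype likelihoods themselves rather than in the pileup counts.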
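As shown, this variant has excellent coverage and quality metrics, and both DeepVariant 1.6.1 and GATK call it heterozygous. I am puzzled as to why Parabricks DeepVariant categorizes it as RefCall. This matters for us because our pipeline typically filters out RefCall records, so the variant would be lost.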
If it would be helpful, I am happy to share the FASTQ files for further analysis. Please let me know if any additional details are needed.
Looking forward to your insights.