How could I compare variant calling results?

Hello, I saw a post on your homepage saying that the results of the Parabricks HaplotypeCaller are the same as those of the GATK4 HaplotypeCaller.
Do the results really match 100%?

How did you compare the results? I compared the VCF files from Parabricks and GATK4 using Illumina/hap.py, but the results were not the same. I’d like to know how you compared the variant calling results.

I look forward to your answer.

Thank you.

Hey @qhtjrmin,

Here is the page where we discuss how to compare the outputs of Parabricks vs GATK:

The results will depend on which versions of Parabricks and GATK you are comparing. Parabricks version 2 aligns with GATK 4.0.4, and Parabricks version 3 aligns with GATK 4.1.

Hi,

I’m currently trying to compare the Parabricks output with the CPU version I’m using, but it is GATK 4.2.0.0.
The results are similar (39M calls, i.e. 97% overlap) but not identical (1.2M extra calls, i.e. 3%, and 3.9M missing calls, i.e. 5.7%).
I did not compare against the CPU version of GATK 4.1, but I suppose these differences come from the difference in GATK version.
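
For anyone who wants to reproduce this kind of overlap count, here is a minimal sketch in Python. It is not the method used by Parabricks or by hap.py: it assumes that two calls are "the same" when CHROM, POS, REF and ALT match exactly, and it splits multi-allelic records into one key per ALT allele. The function names (`variant_keys`, `read_vcf`, `compare`) are made up for this illustration.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple

# Assumption: a call is identified by (CHROM, POS, REF, ALT) only.
Variant = Tuple[str, int, str, str]


def variant_keys(lines: Iterable[str]) -> Set[Variant]:
    """Collect one key per ALT allele from VCF text lines."""
    keys: Set[Variant] = set()
    for line in lines:
        # Skip blank lines and header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            # Ignore missing ALT and the gVCF reference-block placeholder.
            if alt not in (".", "<NON_REF>"):
                keys.add((chrom, pos, ref, alt))
    return keys


def read_vcf(path: str) -> Set[Variant]:
    """Read a plain-text or gzip/bgzip-compressed VCF into a set of keys."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return variant_keys(handle)


def compare(baseline: Set[Variant], query: Set[Variant]) -> Dict[str, int]:
    """Count calls shared by both sets, only in query, and only in baseline."""
    return {
        "shared": len(baseline & query),
        "extra": len(query - baseline),
        "missing": len(baseline - query),
    }
```

Calling `compare(read_vcf("gatk.vcf.gz"), read_vcf("parabricks.vcf.gz"))` (hypothetical file names) returns the three counts. Because this matches records literally, the same variant written with a different representation is counted as a difference; hap.py compares at the haplotype level, so its numbers can differ from this simple count.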

Is there a plan to align with v.4.2 in the near future?

Thanks

Hi Vincent,

Your supposition is right. The differences between the results are indeed due to the difference in version.

We are planning to incorporate GATK 4.2 in our next major release.

Thanks
Myrieme

Hi @mdemouth,

That’s perfect! Thanks for the clarification.

Is there a page where I can follow upcoming updates and releases?
I don’t know how often you release a new major version. Is it on the order of months? Or years?

Thanks