[v4.0 vs v3.6] Germline pipeline produces a different GVCF header

Hi Parabricks team,

I have been running the Parabricks Germline pipeline. Most of my samples were run on version 3.6, and some of my new samples were run on version 4.0 with the same options; the only difference is the Docker image.
When I did joint calling, I found that the GVCF header differs slightly in the DS line, as shown below:

## Output of grep "##INFO=<ID=DS," on each GVCF:
WGS_040_Father_v4.g.vcf	|	
WGS_040_Father_v3.6.g.vcf	|	##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
WGS_040_Mother_v4.g.vcf	|	
WGS_040_Mother_v3.6.g.vcf	|	##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
WGS_040_Proband_v4.g.vcf	|	
WGS_040_Proband_v3.6.g.vcf	|	##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
WGS_040_Sibling_v4.g.vcf	|	
WGS_040_Sibling_v3.6.g.vcf	|	##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
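
For anyone who wants to reproduce this check without grep, here is a minimal sketch in Python. The helper names `open_text` and `find_header_line` are hypothetical, chosen only for illustration, and the script assumes each GVCF is either plain text or gzip/bgzip-compressed with a `.gz` suffix.

```python
import gzip
import sys

DS_PREFIX = "##INFO=<ID=DS,"

def open_text(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")

def find_header_line(path, prefix=DS_PREFIX):
    """Return the first '##' header line starting with prefix, or None if absent."""
    with open_text(path) as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # meta-information lines end at the #CHROM line
            if line.startswith(prefix):
                return line.rstrip("\n")
    return None

if __name__ == "__main__":
    # Print one "file | header line" row per GVCF given on the command line.
    for gvcf in sys.argv[1:]:
        print(f"{gvcf}\t|\t{find_header_line(gvcf) or ''}")
```

Run over the eight files above, this should give the same picture: an empty second column for the v4 files and the DS line for the v3.6 files.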

When I checked the documentation, I didn't see any note about differences in the Germline options, so I assumed the defaults stayed the same between v3.6 and v4.0.
So I'm wondering whether there is any difference in v4.0, compared to v3.6, that might cause this problem.
And how can I output the line <ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?"> in the header on my next run with version 4.0?
I am trying to keep my GVCF headers consistent so that I don't have to re-run my old samples or manually correct the headers one by one. Thanks for your help!

Po-Ying

Hey @fup,

Different versions of Parabricks germline are based on different versions of GATK, so there will be some subtle differences between versions. The latest version, 4.0.0-1, uses BWA version 0.7.15 and should match results from GATK version 4.2.0.0. I believe that 3.6 results should match GATK 4.0 or even earlier, so that could explain the differences in the GVCF header.
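
If re-running the older samples is not an option, one possible workaround is to add the missing declaration to the v4.0 headers with a script instead of editing them one by one. The following is only a rough sketch, not a Parabricks feature: `add_info_line` is a hypothetical helper, it works on header lines already read into a list, and whether your joint-calling tool accepts the patched header is an assumption you would need to verify on a test sample.

```python
DS_LINE = ('##INFO=<ID=DS,Number=0,Type=Flag,'
           'Description="Were any of the samples downsampled?">')

def add_info_line(lines, info_line=DS_LINE):
    """Return a copy of lines with info_line placed just before #CHROM.

    Nothing is added if an INFO line with the same ID is already declared.
    Lines are expected without trailing newlines (hypothetical helper).
    """
    # '##INFO=<ID=DS,' identifies an existing declaration of the same ID.
    id_prefix = info_line.split(",", 1)[0] + ","
    declared = any(line.startswith(id_prefix) for line in lines)
    patched = []
    for line in lines:
        if line.startswith("#CHROM") and not declared:
            patched.append(info_line)
            declared = True  # make sure the line is inserted only once
        patched.append(line)
    return patched
```

This only declares the DS field in the header; it does not change any variant records. For compressed files, the patched header could then be applied with a tool such as `bcftools reheader`, which replaces the header without rewriting the records.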