Combining the gVCF files of multiple samples is an essential step in a joint-calling pipeline.
In GATK, this can be done with CombineGVCFs.
But in Parabricks 4.0, I can’t find the corresponding tool.
Previous versions implemented some joint-calling functions, such as CombineGVCFs (which could only take 2 or 3 gVCFs as input) and GLnexus.
So is there any plan to add this function back and remove the limit on the number of gVCF files?
I am also having issues with the genotypegvcf command. It seems to accept only one g.vcf file as input. If I specify a directory containing multiple g.vcf files, it throws this error:
[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Info 2023-Jan-04 08:22:55] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2023-Jan-04 08:22:55] || Version 4.0.0-1 ||
[PB Info 2023-Jan-04 08:22:55] || genotypegvcf ||
[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Warning 2023-Jan-04 08:22:55][src/PBLocalFile.cpp:48] Failed to open file /home/user/project/gvcf_test/0.g.vcf
[PB Error 2023-Jan-04 08:22:55][-unknown-:0] Received signal: 11
Exiting…
Could not run genotypegvcf
Exiting pbrun …
Interestingly, 0.g.vcf is not one of my gVCF files … I’m not sure where that is coming from. This is the code I’m running
Is GLnexus the best way to do joint calling? I compared the joint-called VCFs produced by GATK and by GLnexus, and many sites differ.
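One way to put a number on such differences is to compare the sets of called sites in the two files. The following is only a minimal sketch in Python: the function names `read_sites` and `compare_sites` are hypothetical, it works on uncompressed VCF text, and it treats a site as the tuple (CHROM, POS, REF, ALT) exactly as written. Without first normalizing both files the same way (for example, splitting multi-allelic records and left-aligning indels), identical variants can still show up as different.

```python
def read_sites(vcf_text):
    """Collect (CHROM, POS, REF, ALT) tuples from uncompressed VCF text."""
    sites = set()
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (both '##' meta and '#CHROM').
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        sites.add((chrom, int(pos), ref, alt))
    return sites


def compare_sites(first_vcf_text, second_vcf_text):
    """Report how many sites are shared and which are unique to each file."""
    first = read_sites(first_vcf_text)
    second = read_sites(second_vcf_text)
    return {
        "shared": len(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }
```

For real files, the text could be read with `open(path).read()` for plain `.vcf` files or via `gzip.open(path, "rt")` for `.vcf.gz` files.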
I also have this issue. After haplotypecaller, I cannot find a proper way to merge the individual gVCFs and proceed to genotypegvcf. Even when I used the original CombineGVCFs tool in GATK4, Parabricks 4.4.0 reported an error when I ran genotypegvcf on the combined gVCF. It seems that Parabricks genotypegvcf cannot take gVCF files produced by GATK4 CombineGVCFs.
In the latest Parabricks release, there are no tools that combine multiple gVCF files, and genotypegvcf in Parabricks only handles a single gVCF file that was output by Parabricks haplotypecaller.
If you have multiple gVCF files that need to be combined, you can try GATK CombineGVCFs or GenomicsDBImport and then feed the result to GATK GenotypeGVCFs to do joint calling. Alternatively, you can use GLnexus, which merges the gVCFs and performs the joint genotyping itself.
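The GATK route suggested above can be sketched as a small Python helper that assembles the two command lines. This is a hedged illustration, not a tested pipeline: the helper names and all file paths are hypothetical, and it assumes the `gatk` wrapper script from GATK4 is on the PATH. The flags used (`-R`, `--variant`/`-V`, `-O`) are the standard GATK4 arguments for CombineGVCFs and GenotypeGVCFs.

```python
import shlex


def combine_gvcfs_cmd(reference, gvcfs, output):
    """Build a GATK4 CombineGVCFs command with one --variant per input gVCF."""
    cmd = ["gatk", "CombineGVCFs", "-R", str(reference)]
    for gvcf in gvcfs:
        cmd += ["--variant", str(gvcf)]
    cmd += ["-O", str(output)]
    return cmd


def genotype_gvcfs_cmd(reference, combined_gvcf, output):
    """Build a GATK4 GenotypeGVCFs command for the combined gVCF."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", str(reference),
        "-V", str(combined_gvcf),
        "-O", str(output),
    ]


# Hypothetical paths, for illustration only.
reference = "ref.fasta"
sample_gvcfs = ["sample1.g.vcf.gz", "sample2.g.vcf.gz", "sample3.g.vcf.gz"]

# Print the commands instead of running them, so they can be reviewed first.
print(shlex.join(combine_gvcfs_cmd(reference, sample_gvcfs, "cohort.g.vcf.gz")))
print(shlex.join(genotype_gvcfs_cmd(reference, "cohort.g.vcf.gz", "cohort.vcf.gz")))
```

To actually execute a command, each list could be passed to `subprocess.run(cmd, check=True)`. Passing the arguments as a list avoids shell quoting problems with unusual file names.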