No software to combine .gvcf files in 4.0

Combining the gVCF files of multiple samples is an essential step in a joint-calling pipeline.
In GATK, this is done with CombineGVCFs.
But in Parabricks 4.0, I can’t find a corresponding tool.
Previous versions did implement some joint-calling functions, such as CombineGVCFs (which accepted only 2 or 3 gVCFs as input) and GLNexus.

So is there any plan to add this functionality back and lift the limit on the number of gVCF files?

Do you mean the genotypegvcf tool? @minerw1024

Thanks for your reply.

Unfortunately, I cannot use it for joint calling on multiple gVCF files.

I tried genotypegvcf with the following command:

docker run \
        --gpus all \
        --rm \
        --volume $(pwd):/workdir \
        --volume $(pwd):/outputdir \
        nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1 \
        pbrun genotypegvcf \
        --ref /workdir/refs/refdata.fna \
        --in-gvcf /workdir/gvcf/0010.g.vcf \
        --in-gvcf /workdir/gvcf/0011.g.vcf \
        --in-gvcf /workdir/gvcf/0012.g.vcf \
        --in-gvcf /workdir/gvcf/0013.g.vcf \
        --in-gvcf /workdir/gvcf/0014.g.vcf \
        --in-gvcf /workdir/gvcf/0015.g.vcf \
        --in-gvcf /workdir/gvcf/0016.g.vcf \
        --in-gvcf /workdir/gvcf/0017.g.vcf \
        --in-gvcf /workdir/gvcf/0018.g.vcf \
        --in-gvcf /workdir/gvcf/0019.g.vcf \
        --out-vcf /outputdir/001x.vcf

It throws no error. But after it finished, I checked the result file with bcftools, and the output shows that only the last gVCF file was converted:

$ bcftools query -l 001x.vcf
0019
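
The same check can be done without bcftools by reading the #CHROM header line directly. The following is a minimal sketch, assuming an uncompressed VCF; vcf_samples and check_sample_count are hypothetical helper names, not part of Parabricks or bcftools.

```shell
# List the sample names in an uncompressed VCF by reading its #CHROM header line.
# Columns 1-9 are the fixed VCF fields; sample names start at column 10.
vcf_samples() {
    awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }' "$1"
}

# Compare the number of samples in the output with the number of input gVCFs.
# Usage: check_sample_count OUTPUT_VCF EXPECTED_COUNT
check_sample_count() {
    found=$(vcf_samples "$1" | wc -l | tr -d ' ')
    if [ "$found" -eq "$2" ]; then
        echo "OK: $found samples"
    else
        echo "MISMATCH: expected $2 samples, found $found"
        return 1
    fi
}
```

For the run above, check_sample_count 001x.vcf 10 would be expected to report a mismatch, since only sample 0019 is listed in the output.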

So what is the right command to use?

I’m not sure, but you could try the --in-selectvariants-dir option to specify the gVCF directory (e.g. /workdir/gvcf/).

I am also having issues with the genotypegvcf command. It seems to take only one g.vcf file as input. If I specify a directory with multiple g.vcf files, it throws this error:

[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Info 2023-Jan-04 08:22:55] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2023-Jan-04 08:22:55] || Version 4.0.0-1 ||
[PB Info 2023-Jan-04 08:22:55] || genotypegvcf ||
[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Warning 2023-Jan-04 08:22:55][src/PBLocalFile.cpp:48] Failed to open file /home/user/project/gvcf_test/0.g.vcf
[PB Error 2023-Jan-04 08:22:55][-unknown-:0] Received signal: 11

Exiting…

Could not run genotypegvcf
Exiting pbrun …

Interestingly, 0.g.vcf is not one of my gVCF files … I’m not sure where that is coming from. This is the command I’m running:

singularity exec --nv -B $PWD,$genome_path,$gvcf_path /packages/7x/parabricks/4.0/parabricks.simg \
        pbrun genotypegvcf --ref ${genome_path}/${genome_prefix} \
        --in-selectvariants-dir ${gvcf_path} \
        --out-vcf ${gvcf_path}/${output_file}

Hello. Did you find a better way to do joint calling?

Yours,
changsheng

Can no one from NVIDIA answer this? That makes PB 4.0 less useful than before.

For this step we recommend using GLNexus.
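
For readers who want to try this outside Parabricks, here is a minimal sketch using the standalone open-source glnexus_cli. It assumes glnexus_cli is installed separately and on PATH, that the "gatk" preset suits HaplotypeCaller-style gVCFs (the "DeepVariant" preset would be the choice for DeepVariant gVCFs), and that the inputs are named *.g.vcf or *.g.vcf.gz. The function name glnexus_joint_call and the DRY_RUN switch are hypothetical conveniences, not part of GLnexus or Parabricks.

```shell
# Joint-call every *.g.vcf / *.g.vcf.gz file in a directory with glnexus_cli.
# Usage: glnexus_joint_call GVCF_DIR OUT_BCF [PRESET]
# Set DRY_RUN=1 to print the command instead of running it.
glnexus_joint_call() {
    gvcf_dir=$1
    out_bcf=$2
    preset=${3:-gatk}          # assumption: GATK-style gVCFs by default

    # Collect the input files that actually exist (unmatched globs stay literal).
    set --
    for f in "$gvcf_dir"/*.g.vcf "$gvcf_dir"/*.g.vcf.gz; do
        [ -e "$f" ] && set -- "$@" "$f"
    done
    if [ "$#" -eq 0 ]; then
        echo "no gVCF files found in $gvcf_dir" >&2
        return 1
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        echo "glnexus_cli --config $preset $* > $out_bcf"
    else
        # glnexus_cli writes the merged multi-sample BCF to standard output.
        glnexus_cli --config "$preset" "$@" > "$out_bcf"
    fi
}
```

A call such as glnexus_joint_call /workdir/gvcf 001x.bcf would then produce a BCF, which can be converted with bcftools view 001x.bcf | bgzip -c > 001x.vcf.gz. Note that glnexus_cli keeps a scratch database in the working directory (GLnexus.DB by default), so it is worth running from a directory with enough free space.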

Can it also be accelerated by GPU?
Does genotypegvcf in PB 4.0 convert only one g.vcf file?

It seems GLNexus is no longer present in parabricks v4.0.

Since this is the standard method to merge gVCF files from DeepVariant, can this be implemented in v4.0 (as it was in v3.8)?

Is GLnexus the best way to do joint calling? I compared the joint-calling VCFs produced by GATK and GLnexus, and many sites differ.
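
To put a number on how the two call sets differ, one rough approach is to compare the sets of CHROM:POS:REF:ALT keys in each file. The sketch below assumes both VCFs are uncompressed; vcf_site_keys and compare_vcf_sites are hypothetical helper names. Because it matches records by exact key, differences in how multi-allelic or indel sites are represented will show up as mismatches, so normalizing both files first (for example with bcftools norm) is probably advisable.

```shell
# Print one "CHROM:POS:REF:ALT" key per record of an uncompressed VCF,
# sorted and de-duplicated so the output can be fed to comm.
vcf_site_keys() {
    awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' "$1" | LC_ALL=C sort -u
}

# Report how many site keys are unique to each file and how many are shared.
# Usage: compare_vcf_sites FIRST_VCF SECOND_VCF
compare_vcf_sites() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    vcf_site_keys "$1" > "$a"
    vcf_site_keys "$2" > "$b"
    echo "only_in_first=$(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    echo "only_in_second=$(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    echo "shared=$(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}
```

Running compare_vcf_sites gatk.vcf glnexus.vcf (file names here are placeholders) prints three counts, which gives a first impression of whether the disagreement is a handful of sites or a systematic difference.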