No software to combine .gvcf files in 4.0

Combining the gvcf files from multiple samples is a very important step in a joint calling pipeline.
In GATK, this is done with CombineGVCFs.
But in Parabricks 4.0, I can’t find a corresponding tool.
In previous versions, some joint calling functions were implemented, such as CombineGVCFs (which could only take 2 or 3 gvcfs as input) and GLNexus.

So is there any plan to add this function back and lift the limit on the number of gvcf files?

Do you mean the genotypegvcf tool? @minerw1024

Thanks for your reply.

Unfortunately, I cannot use it for joint calling on multiple gvcf files.

I tried genotypegvcf with the following command:

docker run \
        --gpus all \
        --rm \
        --volume $(pwd):/workdir \
        --volume $(pwd):/outputdir \
        nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1 \
        pbrun genotypegvcf \
        --ref /workdir/refs/refdata.fna \
        --in-gvcf /workdir/gvcf/0010.g.vcf \
        --in-gvcf /workdir/gvcf/0011.g.vcf \
        --in-gvcf /workdir/gvcf/0012.g.vcf \
        --in-gvcf /workdir/gvcf/0013.g.vcf \
        --in-gvcf /workdir/gvcf/0014.g.vcf \
        --in-gvcf /workdir/gvcf/0015.g.vcf \
        --in-gvcf /workdir/gvcf/0016.g.vcf \
        --in-gvcf /workdir/gvcf/0017.g.vcf \
        --in-gvcf /workdir/gvcf/0018.g.vcf \
        --in-gvcf /workdir/gvcf/0019.g.vcf \
        --out-vcf /outputdir/001x.vcf

It throws no error. But after it finished, I checked the result file with bcftools, and the output shows that only the last gvcf file was converted:

$ bcftools query -l 001x.vcf
0019

So what is the right command to use?
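
For anyone who wants to script the same check, here is a minimal sketch that does not need bcftools. It relies only on the VCF convention that the sample names are columns 10 onward of the #CHROM header line. The function names vcf_samples and check_sample_count are made up for this example, and it assumes an uncompressed VCF.

```shell
# List the sample names in an uncompressed VCF:
# columns 10 onward of the "#CHROM" header line.
vcf_samples() {
    awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }' "$1"
}

# Fail (non-zero status) if the VCF does not contain the expected
# number of samples, e.g. the number of gvcf files passed as input.
check_sample_count() {
    expected=$1
    found=$(vcf_samples "$2" | wc -l | tr -d ' ')
    if [ "$found" -ne "$expected" ]; then
        echo "expected $expected samples, found $found" >&2
        return 1
    fi
}
```

With the output file from the command above, `check_sample_count 10 001x.vcf` would report a mismatch, since only the sample 0019 is present.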

I’m not sure, but you could try the --in-selectvariants-dir option to specify the directory containing the gvcf files (e.g. /workdir/gvcf/).

I am also having issues with the genotypegvcf command. It seems to take only one g.vcf file as input. If I specify a directory with multiple g.vcf files, it throws this error:

[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Info 2023-Jan-04 08:22:55] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2023-Jan-04 08:22:55] || Version 4.0.0-1 ||
[PB Info 2023-Jan-04 08:22:55] || genotypegvcf ||
[PB Info 2023-Jan-04 08:22:55] ------------------------------------------------------------------------------
[PB Warning 2023-Jan-04 08:22:55][src/PBLocalFile.cpp:48] Failed to open file /home/user/project/gvcf_test/0.g.vcf
[PB Error 2023-Jan-04 08:22:55][-unknown-:0] Received signal: 11

Exiting…

Could not run genotypegvcf
Exiting pbrun …

Interestingly, 0.g.vcf is not one of my gvcf files … I’m not sure where that is coming from. This is the command I’m running:

singularity exec --nv -B $PWD,$genome_path,$gvcf_path /packages/7x/parabricks/4.0/parabricks.simg \
        pbrun genotypegvcf --ref ${genome_path}/${genome_prefix} \
        --in-selectvariants-dir ${gvcf_path} \
        --out-vcf ${gvcf_path}/${output_file}

Hello. Did you find a better way to do joint calling?

Yours,
changsheng

Can no one from NVIDIA answer this? This makes PB 4.0 less useful than before.

For this step we recommend using GLNexus.
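
As a rough sketch of what that could look like with the open-source GLnexus command line tool: this assumes glnexus_cli and bcftools are installed separately and on PATH, and that the "gatk" preset suits gvcfs produced by Parabricks haplotypecaller, which I have not verified. The joint_call wrapper and the DRY_RUN variable are made up for this example.

```shell
# Sketch only: assumes the open-source glnexus_cli and bcftools are on PATH.
# Usage: joint_call OUT.vcf.gz IN1.g.vcf IN2.g.vcf ...
# Set DRY_RUN=1 to print the pipeline instead of running it.
joint_call() {
    out_vcf=$1
    shift                       # remaining arguments are the input gvcfs
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "glnexus_cli --config gatk $* | bcftools view -Oz -o $out_vcf -"
        return 0
    fi
    # glnexus_cli writes multi-sample BCF to stdout;
    # bcftools converts it to a bgzip-compressed VCF.
    glnexus_cli --config gatk "$@" | bcftools view -Oz -o "$out_vcf" -
}
```

For example, `joint_call 001x.vcf.gz gvcf/*.g.vcf` would pass every gvcf in the directory in one call. Note that glnexus_cli creates a scratch database directory (GLnexus.DB by default) in the working directory and refuses to run if it already exists.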