HaplotypeCaller producing an empty VCF file

Hello everyone,

I’m encountering an issue with GATK’s HaplotypeCaller while working with RNA-seq data using NVIDIA Parabricks. Despite following the recommended steps, the output VCF file is empty, indicating no variants were called.

Context:

  • Tool: GATK HaplotypeCaller via NVIDIA Parabricks
  • Data Type: RNA-seq
  • Reference Genome: Mus musculus (GRCm39)
  • BAM file: Produced with NVIDIA Parabricks’ rna_fq2bam

Command:

docker run --rm --gpus all \
    -v /home/$USER:/workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.3.1-1 \
    pbrun haplotypecaller \
    --ref /workdir/ref/Mus_musculus.GRCm39.dna.primary_assembly.fa \
    --in-bam /workdir/output/aligned_reads.bam \
    --out-variants /workdir/output/germline_variants.vcf \
    --rna
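
To confirm that the output really is header-only, and not just displayed oddly, the non-header lines can be counted. This is a minimal sketch: `count_vcf_records` is a hypothetical helper name, and it assumes an uncompressed VCF.

```shell
# Hypothetical helper: count variant records in an uncompressed VCF.
# Header lines start with '#'; every other non-empty line is a record.
count_vcf_records() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Example, using the host-side path implied by the -v mount in the command above:
#   count_vcf_records "/home/$USER/output/germline_variants.vcf"
```

A result of 0 means the file contains only the header.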

I have checked the integrity of the BAM file with samtools, as well as its contents, and everything appears to be in order.
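
For anyone repeating this kind of check, one possible header inspection is sketched below. `summarize_sam_header` is a hypothetical helper that reads the text produced by `samtools view -H` and reports how many reference sequences (@SQ) and read groups (@RG) the header declares. Whether either is related to the empty VCF here is not established; this only illustrates one thing a header check might cover.

```shell
# Hypothetical helper: summarise a SAM/BAM header read from stdin.
# Prints the number of @SQ (reference sequence) and @RG (read group) lines.
summarize_sam_header() {
    awk -F '\t' '
        $1 == "@SQ" { sq++ }
        $1 == "@RG" { rg++ }
        END { printf "SQ=%d RG=%d\n", sq, rg }
    '
}

# Example, using the host-side path implied by the -v mount:
#   samtools view -H "/home/$USER/output/aligned_reads.bam" | summarize_sam_header
```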

Are there any pipelines available in Parabricks similar to 3.8’s rna_gatk?

I have the same issue. Did you receive any replies?

Hi @owais.siddiqi and @naomi.dyer,

We have removed some tools from Parabricks 3.8. However, for RNA data we still have rna_fq2bam and starfusion.

Does that suffice or is there something more that you’re looking for?

Thank you

Hi, I’m not sure which tools you have removed. The tools I was attempting to use were rna_fq2bam and haplotypecaller. I am working with Anopheles gambiae. rna_fq2bam appears to have run correctly, producing BAM files that look OK when examined with samtools. However, when haplotypecaller is run, the VCF files contain only the header and no variants. No error messages are produced. Could this be caused by the changes to Parabricks? The STAR genome index, genome.fasta and VCF files all have the same chromosome names (2L, 2R, 3L, 3R, X). I can provide all commands, sample outputs and logs if that would help in diagnosing the issue.
Thanks
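
The chromosome-name comparison described above can be scripted along these lines. This is a sketch only: `fasta_names` and `sam_header_names` are hypothetical helpers, and the file names in the usage comments are placeholders.

```shell
# Hypothetical helper: list sequence names from a FASTA file, sorted.
# Takes the first whitespace-delimited token after '>' on each header line.
fasta_names() {
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$1" | sort
}

# Hypothetical helper: list the SN: names of @SQ lines from a SAM header on stdin, sorted.
sam_header_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort
}

# Example (placeholder file names):
#   fasta_names genome.fasta > fasta_names.txt
#   samtools view -H sample.bam | sam_header_names > bam_names.txt
#   comm -3 fasta_names.txt bam_names.txt   # names present in only one list
```

Empty output from `comm -3` means the two name lists are identical.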