Dear Parabricks Team,
Hello everyone. I am writing here to share my experience.
I found that fq2bam has been replaced by fq2bamfast, and that fq2bamfast needs BWA index files, which fq2bam did not need.
I created the BWA index files with the following command:
bwa index Homo_sapiens_assembly38.fasta
Index files created:
Homo_sapiens_assembly38.fasta.pac
Homo_sapiens_assembly38.fasta.bwt
Homo_sapiens_assembly38.fasta.ann
Homo_sapiens_assembly38.fasta.amb
Homo_sapiens_assembly38.fasta.sa
I was relieved to be able to run fq2bamfast, but then noticed that fewer SNP variants were called than in the VCF generated with pbrun4.1 (fq2bam).
It was very hard to see what was causing this. I checked the output CRAM file header and found no difference between pbrun4.1 fq2bam and pbrun4.3 fq2bamfast. There was also no difference in the run script or in the log messages.
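For anyone who wants to repeat this kind of comparison, here is a rough sketch of how SNP records in two VCFs could be counted. It assumes uncompressed VCF files and treats a record as a SNP only when REF and every ALT allele are a single base. The function name count_snps and the file names in the comment are placeholders I made up, not part of Parabricks.

```shell
# Count SNP records in an uncompressed VCF (rough definition: REF and every
# ALT allele are exactly one base). count_snps is a hypothetical helper name.
count_snps() {
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        length($4) == 1 {                  # REF is a single base
            n = split($5, alts, ",")       # ALT may list several alleles
            ok = 1
            for (i = 1; i <= n; i++)
                if (alts[i] !~ /^[ACGTN]$/) ok = 0
            if (ok) count++
        }
        END { print count + 0 }            # print 0 when nothing matched
    ' "$1"
}

# Example usage (file names are placeholders):
#   count_snps run_pbrun41.vcf
#   count_snps run_pbrun43.vcf
```

A dedicated tool such as bcftools would be more robust for real data; this only gives a quick number to compare between two runs.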
Eventually I found out that the .alt index file is needed for bwa to use alternate contigs during mapping, and that the missing .alt file was the reason my variant count had dropped.
(Here is the document: https://gatk.broadinstitute.org/hc/en-us/articles/360037498992--How-to-Map-reads-to-a-reference-with-alternate-contigs-like-GRCH38)
I downloaded the bwakit archive, renamed hs38DH.fa.alt to ‘Homo_sapiens_assembly38.fasta.alt’, and put it in the same directory as the other index files. After that, the SNP calls were the same as with pbrun4.1.
My understanding is that pbrun4.1 fq2bam can use alternate contigs without an .alt file, whereas pbrun4.3 fq2bam needs an .alt file to use them.
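Since nothing in the logs told me the .alt file was missing, a small pre-flight check before running the pipeline might help others. This is only a sketch: check_bwa_index is a hypothetical helper name, and the list of extensions is simply the five files that bwa index produced for me plus the .alt file from bwakit.

```shell
# Report which BWA index files are missing or empty next to a reference FASTA.
# Prints one line per missing file and returns non-zero if any are missing.
# check_bwa_index is a hypothetical helper, not a Parabricks or bwa command.
check_bwa_index() {
    ref="$1"
    missing=0
    # amb, ann, bwt, pac, sa come from `bwa index`; alt comes from bwakit.
    for ext in amb ann bwt pac sa alt; do
        if [ ! -s "${ref}.${ext}" ]; then
            echo "missing: ${ref}.${ext}"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (the directory is a placeholder):
#   check_bwa_index /path/to/Homo_sapiens_assembly38.fasta || echo "index incomplete"
```

The check only confirms that the files exist and are non-empty. It cannot tell whether the .alt file actually matches the reference.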
This is my experience.
I wish Parabricks would log which index files were loaded, because the index files affect the results.
Thanks for reading.
Best regards.