Fq2bam - Marking Duplicates, BQSR executing despite --no-markdups, etc

I’m running Parabricks 4.1.1-1 fq2bam as follows:

    pbrun fq2bam \
    --logfile output/$1.$LOG_FILE \
    --out-bam output/$1.$OUT_FILE \
    --in-fq input/unmerged_1.fastq.gz input/unmerged_2.fastq.gz --fix-mate \
    --ref input/$GENOME_FASTA \
    --bwa-options=-Y \
    --tmp-dir tmp \
    --filter-flag 256 \
    --gpuwrite --gpusort \
    --no-markdups

The documentation states:

"The user can turn-off marking of duplicates by adding the --no-markdups option. The BQSR step is only performed if the --knownSites input and --out-recal-file output options are provided"

However, after alignment and sorting completed successfully, the job above printed:

[PB Info 2023-Jul-13 22:14:04] ------------------------------------------------------------------------------
[PB Info 2023-Jul-13 22:14:04] ||                 Parabricks accelerated Genomics Pipeline                 ||
[PB Info 2023-Jul-13 22:14:04] ||                              Version 4.1.1-1                             ||
[PB Info 2023-Jul-13 22:14:04] ||                         Marking Duplicates, BQSR                         ||
[PB Info 2023-Jul-13 22:14:04] ------------------------------------------------------------------------------
[PB Info 2023-Jul-13 22:14:39] progressMeter -  Percentage
[PB Info 2023-Jul-13 22:14:49] 0.2       4.59 GB
[PB Info 2023-Jul-13 22:14:59] 0.6       7.94 GB

Am I misinterpreting this? It appears to be running at least one of Marking Duplicates or BQSR, despite my setting --no-markdups and not setting --knownSites or --out-recal-file.

Is this a bug, my error, or a misunderstanding? It’s an immediate problem for me because the Marking Duplicates/BQSR stage was eventually killed for running out of memory (OOM) (a different issue…)
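For anyone wanting to check whether duplicates were actually marked (as opposed to the banner just being a generic stage title), here is a minimal sketch of how I would test a BAM from a run that completes. It counts records with the SAM duplicate flag (0x400 = 1024) set. The function name count_dup_flags is my own, and the samtools usage shown in the comment assumes samtools is installed; neither comes from Parabricks.

```shell
#!/bin/sh
# Hypothetical helper (not part of Parabricks): counts SAM records on stdin
# whose FLAG field (column 2) has the duplicate bit 0x400 (1024) set.
# Header lines starting with '@' are skipped.
count_dup_flags() {
    awk -F'\t' '!/^@/ && int($2 / 1024) % 2 == 1 { n++ } END { print n + 0 }'
}

# Assumed usage, requires samtools and a finished output BAM:
#   samtools view -h output/sample.bam | count_dup_flags
```

A result of 0 would suggest --no-markdups was honored and no reads were flagged as duplicates; a non-zero count would suggest marking did run, unless the input reads already carried that flag.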
