I’m attempting to align a pair of rather large FASTQ files (960GB in total), and my pipeline is crashing when fq2bam hits the MarkDuplicates stage: excessive memory allocation triggers a SIGKILL that terminates my job.
I’ve attempted a few things to fix this, including:
Running fq2bam with the --low-memory flag
Running fq2bam with the --memory-limit flag set to 90GB when my job requests 100GB
Running fq2bam with the --no-markdups flag
In all cases, fq2bam isn’t respecting the memory limits I’ve set for it, and the --no-markdups flag doesn’t seem to be working at all, as that run still crashed while attempting to mark duplicates.
I’m using Parabricks 4.5.0. I don’t see anything in the release notes for more recent versions addressing any of the above, but I’ll try updating to 4.6.0 and see if that helps.
Update: upgrading to Parabricks 4.6.0 did not fix the problem (although the BWA stage did get faster!). The --no-markdups flag is still being ignored and the job is exceeding its allowed memory:
[PB Info 2026-Jan-07 06:43:36] Sorting and Marking: 1670.335 seconds
[PB Info 2026-Jan-07 06:43:36] ------------------------------------------------------------------------------
[PB Info 2026-Jan-07 06:43:36] || Program: Sorting Phase-II ||
[PB Info 2026-Jan-07 06:43:36] || Version: 4.5.0-1 ||
[PB Info 2026-Jan-07 06:43:36] || Start Time: Wed Jan 7 06:15:45 2026 ||
[PB Info 2026-Jan-07 06:43:36] || End Time: Wed Jan 7 06:43:36 2026 ||
[PB Info 2026-Jan-07 06:43:36] || Total Time: 27 minutes 51 seconds ||
[PB Info 2026-Jan-07 06:43:36] ------------------------------------------------------------------------------
[PB Info 2026-Jan-07 06:43:36] ------------------------------------------------------------------------------
[PB Info 2026-Jan-07 06:43:36] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2026-Jan-07 06:43:36] || Version 4.5.0-1 ||
[PB Info 2026-Jan-07 06:43:36] || Marking Duplicates, BQSR ||
[PB Info 2026-Jan-07 06:43:36] ------------------------------------------------------------------------------
[PB Info 2026-Jan-07 06:43:36] BQSR using CUDA device(s): { 0 }
[PB Info 2026-Jan-07 06:43:37] Using PBBinBamFile for BAM writing
[PB Info 2026-Jan-07 06:43:37] progressMeter - Percentage
[PB Info 2026-Jan-07 06:43:47] 0.0
Process terminated with signal [SIGKILL: 9]. SIGKILL cannot be caught. A common reason for SIGKILL is running out of
host memory. If the user has root access, they may be able to check by running `sudo journalctl -k --since "<#> minutes
ago" | grep "Killed process"` to see the reason why processes were recently killed.
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
Could not run fq2bam
Exiting pbrun ...
INFO: Cleaning up image...
My experience has been that if you don’t have enough RAM to fit the entire library into memory at once it will crash. Did you have any luck with the --align-only flag?
--align-only worked, but I haven’t had time to test whether I can reproduce the issue with bamsort. The strange thing is that I’ve aligned even larger paired-end libraries without any issues.
I was able to get --align-only working as well, at least to produce an initial aligned BAM file. The pipeline still crashes at the next step when I attempt to run a standalone bamsort, which for some reason still reports a marking duplicates/BQSR phase after the initial sort is done:
[PB Info 2026-Feb-19 19:28:53] Sorting and Marking: 240.106 seconds
[PB Info 2026-Feb-19 19:28:53] ------------------------------------------------------------------------------
[PB Info 2026-Feb-19 19:28:53] || Program: Sorting Phase-II ||
[PB Info 2026-Feb-19 19:28:53] || Version: 4.5.0-1 ||
[PB Info 2026-Feb-19 19:28:53] || Start Time: Thu Feb 19 19:24:53 2026 ||
[PB Info 2026-Feb-19 19:28:53] || End Time: Thu Feb 19 19:28:53 2026 ||
[PB Info 2026-Feb-19 19:28:53] || Total Time: 4 minutes 0 seconds ||
[PB Info 2026-Feb-19 19:28:53] ------------------------------------------------------------------------------
[PB Info 2026-Feb-19 19:29:03] ------------------------------------------------------------------------------
[PB Info 2026-Feb-19 19:29:03] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2026-Feb-19 19:29:03] || Version 4.5.0-1 ||
[PB Info 2026-Feb-19 19:29:03] || Marking Duplicates, BQSR ||
[PB Info 2026-Feb-19 19:29:03] ------------------------------------------------------------------------------
[PB Info 2026-Feb-19 19:29:03] Using PBBinBamFile for BAM writing
[PB Info 2026-Feb-19 19:29:03] progressMeter - Percentage
[PB Info 2026-Feb-19 19:29:13] 0.1
[PB Info 2026-Feb-19 19:29:23] 0.4
[PB Info 2026-Feb-19 19:29:33] 0.6
[PB Info 2026-Feb-19 19:29:43] 0.8
[PB Warning 2026-Feb-19 19:29:52][src/PBTempFile.cpp:155] Attempting to allocate host memory above desired limit (144.478831 GB)
[PB Info 2026-Feb-19 19:29:53] 1.1
[PB Info 2026-Feb-19 19:30:03] 1.2
[PB Info 2026-Feb-19 19:30:13] 1.4
[PB Info 2026-Feb-19 19:30:23] 1.7
[PB Info 2026-Feb-19 19:30:33] 2.0
[PB Info 2026-Feb-19 19:30:43] 2.3
[PB Info 2026-Feb-19 19:30:53] 2.3
[PB Info 2026-Feb-19 19:31:03] 2.3
[PB Info 2026-Feb-19 19:31:13] 2.3
[PB Info 2026-Feb-19 19:31:23] 2.3
[PB Info 2026-Feb-19 19:31:33] 2.3
[PB Info 2026-Feb-19 19:31:43] 2.3
[PB Info 2026-Feb-19 19:31:53] 2.3
[PB Info 2026-Feb-19 19:32:03] 2.3
[PB Info 2026-Feb-19 19:32:13] 2.3
[PB Info 2026-Feb-19 19:32:23] 2.3
[PB Info 2026-Feb-19 19:32:33] 2.3
[PB Info 2026-Feb-19 19:32:43] 2.3
[PB Info 2026-Feb-19 19:32:53] 2.3
[PB Info 2026-Feb-19 19:33:03] 2.3
[PB Info 2026-Feb-19 19:33:13] 2.3
[PB Info 2026-Feb-19 19:33:23] 2.3
[PB Info 2026-Feb-19 19:33:33] 2.3
[PB Info 2026-Feb-19 19:33:43] 2.3
[PB Info 2026-Feb-19 19:33:53] 2.3
[PB Info 2026-Feb-19 19:34:03] 2.3
[PB Info 2026-Feb-19 19:34:13] 2.3
[PB Info 2026-Feb-19 19:34:23] 2.3
[PB Info 2026-Feb-19 19:34:33] 2.3
[PB Info 2026-Feb-19 19:34:43] 2.3
[PB Info 2026-Feb-19 19:34:53] 2.4
[PB Info 2026-Feb-19 19:35:03] 2.5
[PB Info 2026-Feb-19 19:35:13] 2.9
Process terminated with signal [SIGKILL: 9]. SIGKILL cannot be caught. A common reason for SIGKILL is running out of
host memory. If the user has root access, they may be able to check by running `sudo journalctl -k --since "<#> minutes
ago" | grep "Killed process"` to see the reason why processes were recently killed.
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
Could not run bamsort
Exiting pbrun ...
I’ve confirmed via subsequent runs with the --verbose flag that CPU memory usage immediately spikes to 97-98% at this stage before eventually being over-allocated and crashing
How much available host memory is there on your system? Both installed and actually available before running anything.
What value are you setting for --memory-limit?
I would be conservative when setting this memory limit. We try to respect it, but as the warning in the log says, it is more of a soft limit that we can exceed slightly in a few scenarios. If you do not provide a value for --memory-limit, we default to half of the installed memory; by installed memory I mean the value of MemTotal shown in /proc/meminfo. I see that you are using Singularity, so you are probably on a shared cluster. If that is the case, --memory-limit should be set judiciously, because there is no easy way for us to know what parameters you provided to Slurm or your job scheduler of choice.
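For reference, the default described above can be reproduced with a quick one-off check on the node. This is just an illustrative sketch of the "half of MemTotal" rule, not Parabricks' actual code; /proc/meminfo reports MemTotal in kB.

```shell
# Sketch: compute the default --memory-limit (half of MemTotal) for this node.
# MemTotal in /proc/meminfo is expressed in kB; convert to GB, then halve.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
default_limit_gb=$(( mem_total_kb / 1024 / 1024 / 2 ))
echo "default --memory-limit: ${default_limit_gb} GB"
```

On a shared node this number reflects the whole machine, not your job's allocation, which is exactly why it can exceed what the scheduler actually granted you.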
What GPU are you using? I see that you are using --low-memory; that parameter refers to low VRAM, i.e. device memory.
Also, the banner for the third stage where it says “Marking Duplicates, BQSR” is a bit of a misnomer. It will always print that message in the banner for the third stage, whether or not you are marking duplicates or doing BQSR. If you are not doing those two operations, the main thing that stage does is finish coordinate sorting and write the final BAM or CRAM file.
You may also want to try using --gpuwrite as that can make the final stage faster by using the GPU to prepare the BAM file.
The shared cluster I’m using has two node types available that I send alignment jobs to: one with 4 A100 80GB GPUs and 500GB of CPU RAM, and one with 4 H100s and 248GB of CPU RAM. Because this is a shared cluster, I’m not generally able to monopolize a whole node, but I’m fairly consistently able to submit jobs with 200GB of RAM and 2 GPUs to either node type (and our H200 nodes are being upgraded to also have 500GB of RAM). However, my current project has a library that contains 2TB of raw FASTQ data and has aligned into a 480GB BAM file. Running the pipeline with 450GB of CPU RAM and two A100s still resulted in the crash.
In my tests with the --memory-limit flag I was setting the limit to 90GB in a job with 100GB of total memory allocated; if it’s a soft limit as you say, I can see how this isn’t enough of a buffer. However, if the default when the flag isn’t specified is half of the installed memory, that should be a default limit of 124GB on a node that has 248GB of RAM installed (confirmed that this is what shows as MemTotal in /proc/meminfo). For a job that has 200GB of RAM allocated, it seems surprising that it would blow so far past the default limit of 124GB and still crash; that’s more than a 60% spike over the limit. I’ll try again with a much more conservative memory limit and see how it does.
The jobs are run with either 2 A100 80GBs or 2 H200s. I’ve since dropped the --low-memory flag, as I realized that property is for GPU RAM.
Great to know re: the misnaming; it would be awesome if the logging could be made clearer there. I’d been doing a separate unmarkduplicates step for non-PCR libraries because it appeared that duplicates were being marked regardless of whether I wanted them to be.
I’ll share my most current run configurations in case something jumps out as obviously wrong. This node has 248GB of RAM available, per /proc/meminfo’s MemTotal, and this job is taking in 2TB of FASTQ data:
#!/bin/bash
#SBATCH -t 24:00:00
#SBATCH --signal=SIGTERM@900  # give time for the cleanup script to avoid draining the node on job termination
#SBATCH -M HPC4
#SBATCH -p gpu-h200
#SBATCH -J parabricks
#SBATCH -o parabricks_%A.log
#SBATCH --gres=gpu:h200:2
#SBATCH --mem=200G
#SBATCH -c 32
Yes, setting a limit of 90GB when only 100GB is available may be too tight to account for overages. A good rule of thumb is to set --memory-limit to half of the memory you wish to allow for the job.
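That rule of thumb can be wired into the job script so the limit tracks the Slurm allocation instead of the node's MemTotal. A hedged sketch, assuming Slurm's `SLURM_MEM_PER_NODE` environment variable (set inside jobs that specify --mem, in MB) is available; the 204800 fallback just mirrors the 200G request above for illustration:

```shell
# Derive a conservative --memory-limit from the Slurm allocation rather than
# from node MemTotal. SLURM_MEM_PER_NODE is in MB; the fallback value is only
# for illustration when run outside a job.
job_mem_mb="${SLURM_MEM_PER_NODE:-204800}"
memory_limit_gb=$(( job_mem_mb / 1024 / 2 ))
echo "would pass: --memory-limit ${memory_limit_gb}"
</antml>```

With --mem=200G this works out to --memory-limit 100, matching the advice above.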
Great to know re: the misnaming, it would be awesome if the logging could be made more clear there. I’d been doing a separate unmarkduplicates step for non-PCR libraries because it appeared that duplicates were being marked regardless of whether I wanted them to be or not
Thanks for the feedback. We will note this for the future.
Two more questions.
Would it be possible to try exclusive jobs on your cluster? In my experience, many clusters do not require users to set a memory limit, so even if you set one, another user could get scheduled on the same node without a memory limit applied to their job.
Do your FASTQs have very deep coverage in certain parts of the genome? To parallelize the coordinate sorting, we break up the alignments by parts of chromosomes, so very deep coverage in certain regions could make that group of alignments very large. We do not have any parameters for you to adjust this behavior, but it would be good to know for the future if this is a dataset that does not work well with our algorithm.
One more comment. Earlier you wrote
I’ve confirmed via subsequent runs with the --verbose flag that CPU memory usage immediately spikes to 97-98% at this stage before eventually being over-allocated and crashing
Did you mean that you were using the --monitor-usage option? If so, it would be interesting if you saw 97% memory usage while not having exclusive access, because that would mean another job scheduled on the same node was using a substantial amount of memory. For that metric we use the available-memory values from /proc/meminfo, which are per-node, not per-job; we do not have visibility into your cluster’s scheduling and how it may partition a node.
Finally, your run command looks good, except I would set --memory-limit 100 so that it is half of what you are asking from Slurm.
Unfortunately, I am not able to run exclusive jobs on my cluster; our QOS limits are such that an exclusive job will never be picked up by the scheduler, as it attempts to allocate more resources than are allowed per job.
The FASTQs for this project are whole-genome sequencing. In theory, they should be fairly uniform in coverage but in practice there are always spikes in certain parts of the genome, particularly for repetitive regions or regions with high similarity to other parts of the genome where you get a lot of lower-fidelity reads mapping onto those areas.
If I’m understanding you correctly, you’re saying that there might be so many reads mapping to the same region that the chunk of reads for that area during coordinate sorting overwhelms the available memory when trying to load it all at once. If that’s the case, then it might require an adjustment at the algorithm level to do chunking dynamically based on the number of reads in a region, rather than just dividing the dataset into X chunks where each chunk covers a certain region; with the latter, inconsistency in chunk sizes can lead to memory spikes.
And yes I did mean --monitor-usage, interesting to note on the spikes coming from multiple jobs. I’m curious if that other job crashed when mine did, though I have no insight there.
The alignment did still crash with --memory-limit 100. I’m trying it again on one of the A100 nodes with 450G of RAM allocated and --memory-limit 100, and we’ll see how that performs. If those configurations aren’t sufficient to get through coordinate sorting, my next options are either:
Coordinate-sort the aligned BAM generated by --align-only using a CPU run of GATK SortSam and then feed that sorted BAM to Parabricks markdups and continue the GPU pipeline from there
Split the unaligned FASTQ files into multiple chunks, run fq2bam --no-markdups on the chunked inputs, merge the coordinate sorted BAM files, and then continue with GPU markdups from there on the merged sorted BAM
Neither option is ideal in terms of automation and infrastructure scalability but they’ll get me through this project at least
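The second option above could be scripted roughly as follows. This is a hedged sketch only: the seqkit split2 and samtools merge invocations, the pbrun flags, and the part-file naming are assumptions to verify against the installed versions, and all file paths are placeholders. The run() wrapper just prints each command so the sketch is safe to dry-run; replace the echo with real execution to use it.

```shell
#!/bin/bash
# Sketch of the chunk-align-merge fallback (option 2 above). Dry-run only:
# run() prints each command instead of executing it.
set -u
run() { echo "+ $*"; }

REF=GRCm38.fa            # placeholder paths
R1=sample_R1.fq.gz
R2=sample_R2.fq.gz
CHUNKS=8

# 1) Split the paired FASTQs into synchronized chunks
run seqkit split2 -1 "$R1" -2 "$R2" -p "$CHUNKS" -O chunks/

# 2) Align each chunk without duplicate marking; each chunk BAM comes out
#    coordinate-sorted. Part naming assumed to follow seqkit's .part_NNN style.
for i in $(seq 1 "$CHUNKS"); do
    part=$(printf 'part_%03d' "$i")
    run pbrun fq2bam --ref "$REF" \
        --in-fq "chunks/sample_R1.${part}.fq.gz" "chunks/sample_R2.${part}.fq.gz" \
        --no-markdups --out-bam "chunk_${i}.bam"
done

# 3) Merge the sorted chunk BAMs; duplicate marking can then run on the merge
run samtools merge -@ 16 merged.bam chunk_*.bam
```

The main caveat with this approach is that duplicate marking must happen after the merge, since duplicates can span chunk boundaries.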
Update: the pipeline still crashed with 450G of RAM and a memory limit of 100G
I ran a CPU queryname sort and attempted to resume the GPU pipeline with a standalone markdups stage, which also crashed due to overallocation of memory
Sorry for the late reply. It looks like your run is violating an assumption in our code if 400GB does not seem to be enough. Is the data you are using public, or is there a similar version of your data that is publicly available? An SRA accession number or EBI dataset would be great so I can reproduce your runs and see in more detail what is happening. Also, for completeness’ sake, can you post your final run command so I can do a run with the same parameters?
I ran a CPU queryname sort and attempted to resume the GPU pipeline with a standalone markdups stage, which also crashed due to overallocation of memory
Which tool did you use for this? How much CPU memory did you allocate to that tool?
The dataset is non-public and I’d be hard-pressed to identify a comparable one, but I can share it privately with you to help with internal testing. It’s very large (2TB of raw FASTQ), so we’ll need to coordinate on data transfer via an AWS bucket or something similar.
The reference genome used here is Mus musculus GRCm38.
You can find my initial Parabricks command and relevant slurm configurations here:
I ultimately had to do sorting, duplicate marking, and BQSR generation/application via CPU workflows for this dataset before resuming GPU processing with the HaplotypeCaller stage, which was successful.
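For anyone landing here later, the CPU fallback described above might look roughly like the following. A hedged sketch, not the poster's exact commands: the GATK and pbrun invocations are standard forms that should be checked against the docs for your versions, and all file names plus the --known-sites resource are placeholders. As before, run() only prints the commands.

```shell
#!/bin/bash
# Sketch of the CPU sort/markdup/BQSR fallback, resuming on GPU at
# HaplotypeCaller. Dry-run only: run() prints each command.
set -u
run() { echo "+ $*"; }

REF=GRCm38.fa   # placeholder reference path

# CPU: coordinate sort, then mark duplicates
run gatk SortSam -I aligned.bam -O sorted.bam -SO coordinate
run gatk MarkDuplicates -I sorted.bam -O marked.bam -M dup_metrics.txt

# CPU: generate and apply base quality score recalibration
run gatk BaseRecalibrator -I marked.bam -R "$REF" \
    --known-sites known_variants.vcf.gz -O recal.table
run gatk ApplyBQSR -I marked.bam -R "$REF" \
    --bqsr-recal-file recal.table -O recal.bam

# GPU: resume the Parabricks pipeline at variant calling
run pbrun haplotypecaller --ref "$REF" --in-bam recal.bam \
    --out-variants sample.vcf
```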
Hi @PaulMatterBioDev , thank you for the confirmation on your workflow. That helps to narrow it down a bit to what part of the pipeline could be causing issues on our end.
Thanks for being willing to share your data. I’ll send you a private message.