Because v4.1.1-1 isn’t handling --align-only and --no-markdups as expected for me (see my other recent posts), fq2bam is running the Marking Duplicates, BQSR stage, and that is where I run into memory problems.
Consistent with the stated hardware requirements, I queued a job using Slurm as follows:
#SBATCH --cpus-per-task=24
#SBATCH --gpus-per-task=2
#SBATCH --mem-per-cpu=5g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
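To make the resulting job memory limit explicit, here is a minimal Python sketch that derives it from the directives above. The helper names `parse_directives` and `job_memory_gb` are my own, and the parsing only handles the `g`/`m` suffixes and the `--option=value` form used in this script.

```python
import re

# The #SBATCH directives from the job script above.
SBATCH_LINES = """\
#SBATCH --cpus-per-task=24
#SBATCH --gpus-per-task=2
#SBATCH --mem-per-cpu=5g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
"""


def parse_directives(text):
    """Return {option: value} for each '#SBATCH --option=value' line."""
    pattern = r"^#SBATCH\s+--([\w-]+)=(\S+)$"
    return dict(re.findall(pattern, text, flags=re.MULTILINE))


def job_memory_gb(opts):
    """Total host memory for the job in GB (assumes a 'g' or 'm' suffix)."""
    value = opts["mem-per-cpu"].lower()
    number, unit = float(value[:-1]), value[-1]
    per_cpu_gb = number if unit == "g" else number / 1024
    # Memory is allocated per CPU, so multiply by every CPU in the job.
    cpus = (
        int(opts["cpus-per-task"])
        * int(opts["ntasks-per-node"])
        * int(opts["nodes"])
    )
    return per_cpu_gb * cpus


opts = parse_directives(SBATCH_LINES)
print(job_memory_gb(opts))  # 120.0
```

With one task on one node, the limit is simply 24 CPUs x 5G per CPU.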
After properly aligning and sorting, it logged:
[PB Info 2023-Jul-14 12:04:31] ------------------------------------------------------------------------------
[PB Info 2023-Jul-14 12:04:31] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2023-Jul-14 12:04:31] || Version 4.1.1-1 ||
[PB Info 2023-Jul-14 12:04:31] || Marking Duplicates, BQSR ||
[PB Info 2023-Jul-14 12:04:31] ------------------------------------------------------------------------------
[PB Info 2023-Jul-14 12:04:57] progressMeter - Percentage
[PB Info 2023-Jul-14 12:05:07] 0.9 14.09 GB
[PB Info 2023-Jul-14 12:05:17] 1.6 26.09 GB
[PB Info 2023-Jul-14 12:05:27] 2.4 37.58 GB
[PB Info 2023-Jul-14 12:05:37] 3.2 48.73 GB
[PB Info 2023-Jul-14 12:05:47] 4.5 60.46 GB
[PB Info 2023-Jul-14 12:05:57] 6.0 70.30 GB
[PB Info 2023-Jul-14 12:06:07] 7.0 80.88 GB
[PB Info 2023-Jul-14 12:06:17] 8.7 91.20 GB
[PB Info 2023-Jul-14 12:06:27] 10.1 100.41 GB
[PB Info 2023-Jul-14 12:06:37] 11.7 111.54 GB
slurmstepd: error: Detected 1 oom_kill event in StepId=55846256.batch. Some of the step tasks have been OOM Killed.
Since the job was queued with a total of 5 x 24 = 120G of CPU RAM, it is apparent that memory usage kept accumulating and the job was killed when it hit the job limit. This raises two issues/questions:
- Why does Marking Duplicates, BQSR use so much memory? Is it expected?
- More importantly, is there a mechanism to tell fq2bam how much memory is available to it?
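As a rough check on the first question, the sketch below estimates how fast the GB column in the log above was growing and how long it would take to reach the 120 GB job limit. It assumes that column reports host memory in use, which the log header does not actually state, and it assumes all lines fall on the same day. The function names are my own.

```python
import re
from datetime import datetime

# Progress lines copied from the log above.
LOG = """\
[PB Info 2023-Jul-14 12:05:07] 0.9 14.09 GB
[PB Info 2023-Jul-14 12:05:17] 1.6 26.09 GB
[PB Info 2023-Jul-14 12:05:27] 2.4 37.58 GB
[PB Info 2023-Jul-14 12:05:37] 3.2 48.73 GB
[PB Info 2023-Jul-14 12:05:47] 4.5 60.46 GB
[PB Info 2023-Jul-14 12:05:57] 6.0 70.30 GB
[PB Info 2023-Jul-14 12:06:07] 7.0 80.88 GB
[PB Info 2023-Jul-14 12:06:17] 8.7 91.20 GB
[PB Info 2023-Jul-14 12:06:27] 10.1 100.41 GB
[PB Info 2023-Jul-14 12:06:37] 11.7 111.54 GB
"""

# Captures: time of day, percentage, and the GB figure.
ROW = re.compile(r"\[PB Info \S+ (\d\d:\d\d:\d\d)\] ([\d.]+) ([\d.]+) GB")


def parse_rows(log):
    """Return a list of (time, percent, gb) tuples from the progress lines."""
    rows = []
    for stamp, pct, gb in ROW.findall(log):
        when = datetime.strptime(stamp, "%H:%M:%S")
        rows.append((when, float(pct), float(gb)))
    return rows


def growth_rate_gb_per_s(rows):
    """Average growth of the GB column between the first and last line."""
    (t0, _, gb0), (t1, _, gb1) = rows[0], rows[-1]
    return (gb1 - gb0) / (t1 - t0).total_seconds()


rows = parse_rows(LOG)
rate = growth_rate_gb_per_s(rows)
limit_gb = 120.0  # 24 CPUs x 5G per CPU
seconds_left = (limit_gb - rows[-1][2]) / rate
print(f"{rate:.2f} GB/s, about {seconds_left:.0f} s to reach {limit_gb:.0f} GB")
# 1.08 GB/s, about 8 s to reach 120 GB
```

At roughly 1 GB/s, the limit would be reached within the 10 s gap before the next progress line, which matches the OOM kill appearing right after the 111.54 GB entry.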
The job is running on a shared node, and I do not necessarily have access to all RAM on the machine. Even if I try to ensure that I am the only job running on a node, I still need to provide a memory request, and thus will have a job memory limit.
I read in other posts about an option called --memory-limit, but I do not see it in the current documentation, so I assume it was dropped in recent versions? It seems like an important option to me.
(As an aside, this issue would become moot for me if --align-only worked; I don’t actually want to do sorting or dup marking.)