Minimap2 Memory Error with Parabricks 4.3.1

Hi, I am getting the following error with the latest GPU version of minimap2:

Run parameters:

        docker run \
            -v "$(dirname $(realpath {input.fq}))":"/mnt/input_fq" \
            -v "$(dirname $(realpath {input.ref}))":"/mnt/input_ref" \
            -v "$(dirname $(realpath {output.bam}))":"/mnt/output" \
            --workdir /tmp \
            --env TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 \
            --user $(id -u):$(id -g) \
            --rm \
            --gpus {params.gpu} \
            {params.container} \
            pbrun minimap2 \
                --ref "/mnt/input_ref/$(basename {input.ref})" \
                --in-fq "/mnt/input_fq/$(basename {input.fq})" \
                --out-bam "/mnt/output/$(basename {output.bam})" \
                --num-chaining-threads 3 \
                --preset map-ont \
                --eqx \
                --num-threads {threads} \
                --alignment-large-pair-size 5000 \
                --process-large-alignments-on-cpu \
                --num-alignment-threads-per-gpu 8 \
                --num-alignment-device-mem-buffers 8 \
                --gpusort \
                --num-gpus {resources.gpu} \
                --gpuwrite

Logfile with Error message:

[PB Info 2024-Jun-05 15:30:46] 	11.1		700000
[PB Info 2024-Jun-05 15:30:52] 	11.2		700000
[PB Info 2024-Jun-05 15:30:58] 	11.3		710000
[PB Info 2024-Jun-05 15:31:04] 	11.4		720000
[PB Info 2024-Jun-05 15:31:10] 	11.5		740000
[PB Info 2024-Jun-05 15:31:16] 	11.6		750000
terminate called after throwing an instance of 'thrust::system::detail::bad_alloc'
  what():  std::bad_alloc: cudaErrorInvalidValue: invalid argument
[PB Error 2024-Jun-05 15:31:21][-unknown-:0] Received signal: 6
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks, exiting.
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation

Our server has two H100 GPUs.

Hello Caspar, during which stage did you see this error? For example, is it under the heading where a box is printed saying “Parabricks accelerated Genomics Pipeline Version 4.3.1-1 minimap2”?

Hi,

It's in the first stage: Reading reference file

[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /mnt/input_fq/TEST_TUE_01.SUP.fastq.gz
[Parabricks Options Mesg]:
@RG\tID:17b03957-a602-44a4-b3af-526a3b\tLB:lib1\tPL:bar\tSM:sample\tPU:17b03957-a602-44a4-b3af-526a3b
[PB Info 2024-Jun-05 15:17:59] ------------------------------------------------------------------------------
[PB Info 2024-Jun-05 15:17:59] ||                 Parabricks accelerated Genomics Pipeline                 ||
[PB Info 2024-Jun-05 15:17:59] ||                              Version 4.3.1-1                             ||
[PB Info 2024-Jun-05 15:17:59] ||                                 minimap2                                 ||
[PB Info 2024-Jun-05 15:17:59] ------------------------------------------------------------------------------
[PB Info 2024-Jun-05 15:17:59] Reading reference file.
[PB Info 2024-Jun-05 15:19:40] -------------------------------------
[PB Info 2024-Jun-05 15:19:40] Elapsed-Minutes	Processed-Reads
[PB Info 2024-Jun-05 15:19:40] -------------------------------------
[PB Info 2024-Jun-05 15:19:40] 	0.0		0
[PB Info 2024-Jun-05 15:19:46] 	0.1		0
[PB Info 2024-Jun-05 15:19:52] 	0.2		0
[PB Info 2024-Jun-05 15:19:58] 	0.3		0
[PB Info 2024-Jun-05 15:20:04] 	0.4		0
[PB Info 2024-Jun-05 15:20:10] 	0.5		0
[PB Info 2024-Jun-05 15:20:16] 	0.6		0
[PB Info 2024-Jun-05 15:20:22] 	0.7		0
[PB Info 2024-Jun-05 15:20:28] 	0.8		0
[PB Info 2024-Jun-05 15:20:34] 	0.9		0
[PB Info 2024-Jun-05 15:20:40] 	1.0		0
[PB Info 2024-Jun-05 15:20:46] 	1.1		0
[PB Info 2024-Jun-05 15:20:52] 	1.2		0
[PB Info 2024-Jun-05 15:20:58] 	1.3		0
[PB Info 2024-Jun-05 15:21:04] 	1.4		0
[PB Info 2024-Jun-05 15:21:10] 	1.5		0
[PB Info 2024-Jun-05 15:21:16] 	1.6		0
[PB Info 2024-Jun-05 15:21:22] 	1.7		0
[PB Info 2024-Jun-05 15:21:28] 	1.8		0
[PB Info 2024-Jun-05 15:21:34] 	1.9		10000
[PB Info 2024-Jun-05 15:21:40] 	2.0		20000
[PB Info 2024-Jun-05 15:21:46] 	2.1		20000
[PB Info 2024-Jun-05 15:21:52] 	2.2		40000
[PB Info 2024-Jun-05 15:21:58] 	2.3		40000
[PB Info 2024-Jun-05 15:22:04] 	2.4		40000
[PB Info 2024-Jun-05 15:22:10] 	2.5		50000
[PB Info 2024-Jun-05 15:22:16] 	2.6		50000
[PB Info 2024-Jun-05 15:22:22] 	2.7		60000
[PB Info 2024-Jun-05 15:22:28] 	2.8		80000
[PB Info 2024-Jun-05 15:22:34] 	2.9		80000
[PB Info 2024-Jun-05 15:22:40] 	3.0		80000
[PB Info 2024-Jun-05 15:22:46] 	3.1		90000
[PB Info 2024-Jun-05 15:22:52] 	3.2		100000
[PB Info 2024-Jun-05 15:22:58] 	3.3		110000
[PB Info 2024-Jun-05 15:23:04] 	3.4		120000
[PB Info 2024-Jun-05 15:23:10] 	3.5		120000
[PB Info 2024-Jun-05 15:23:16] 	3.6		120000
[PB Info 2024-Jun-05 15:23:22] 	3.7		130000
[PB Info 2024-Jun-05 15:23:28] 	3.8		140000
[PB Info 2024-Jun-05 15:23:34] 	3.9		150000
[PB Info 2024-Jun-05 15:23:40] 	4.0		160000
[PB Info 2024-Jun-05 15:23:46] 	4.1		160000
[PB Info 2024-Jun-05 15:23:52] 	4.2		170000
[PB Info 2024-Jun-05 15:23:58] 	4.3		180000
[PB Info 2024-Jun-05 15:24:04] 	4.4		180000
[PB Info 2024-Jun-05 15:24:10] 	4.5		180000
[PB Info 2024-Jun-05 15:24:16] 	4.6		200000
[PB Info 2024-Jun-05 15:24:22] 	4.7		200000
[PB Info 2024-Jun-05 15:24:28] 	4.8		210000
[PB Info 2024-Jun-05 15:24:34] 	4.9		210000
[PB Info 2024-Jun-05 15:24:40] 	5.0		220000
[PB Info 2024-Jun-05 15:24:46] 	5.1		220000
[PB Info 2024-Jun-05 15:24:52] 	5.2		230000
[PB Info 2024-Jun-05 15:24:58] 	5.3		240000
[PB Info 2024-Jun-05 15:25:04] 	5.4		260000
[PB Info 2024-Jun-05 15:25:10] 	5.5		260000
[PB Info 2024-Jun-05 15:25:16] 	5.6		260000
[PB Info 2024-Jun-05 15:25:22] 	5.7		270000
[PB Info 2024-Jun-05 15:25:28] 	5.8		280000
[PB Info 2024-Jun-05 15:25:34] 	5.9		300000
[PB Info 2024-Jun-05 15:25:40] 	6.0		300000
[PB Info 2024-Jun-05 15:25:46] 	6.1		300000
[PB Info 2024-Jun-05 15:25:52] 	6.2		310000
[PB Info 2024-Jun-05 15:25:58] 	6.3		320000
[PB Info 2024-Jun-05 15:26:04] 	6.4		320000
[PB Info 2024-Jun-05 15:26:10] 	6.5		330000
[PB Info 2024-Jun-05 15:26:16] 	6.6		340000
[PB Info 2024-Jun-05 15:26:22] 	6.7		360000
[PB Info 2024-Jun-05 15:26:28] 	6.8		360000
[PB Info 2024-Jun-05 15:26:34] 	6.9		360000
[PB Info 2024-Jun-05 15:26:40] 	7.0		370000
[PB Info 2024-Jun-05 15:26:46] 	7.1		370000
[PB Info 2024-Jun-05 15:26:52] 	7.2		380000
[PB Info 2024-Jun-05 15:26:58] 	7.3		380000
[PB Info 2024-Jun-05 15:27:04] 	7.4		380000
[PB Info 2024-Jun-05 15:27:10] 	7.5		380000
[PB Info 2024-Jun-05 15:27:16] 	7.6		380000
[PB Info 2024-Jun-05 15:27:22] 	7.7		380000
[PB Info 2024-Jun-05 15:27:28] 	7.8		380000
[PB Info 2024-Jun-05 15:27:34] 	7.9		380000
[PB Info 2024-Jun-05 15:27:40] 	8.0		430000
[PB Info 2024-Jun-05 15:27:46] 	8.1		450000
[PB Info 2024-Jun-05 15:27:52] 	8.2		450000
[PB Info 2024-Jun-05 15:27:58] 	8.3		470000
[PB Info 2024-Jun-05 15:28:04] 	8.4		490000
[PB Info 2024-Jun-05 15:28:10] 	8.5		490000
[PB Info 2024-Jun-05 15:28:16] 	8.6		510000
[PB Info 2024-Jun-05 15:28:22] 	8.7		510000
[PB Info 2024-Jun-05 15:28:28] 	8.8		520000
[PB Info 2024-Jun-05 15:28:34] 	8.9		530000
[PB Info 2024-Jun-05 15:28:40] 	9.0		530000
[PB Info 2024-Jun-05 15:28:46] 	9.1		540000
[PB Info 2024-Jun-05 15:28:52] 	9.2		550000
[PB Info 2024-Jun-05 15:28:58] 	9.3		560000
[PB Info 2024-Jun-05 15:29:04] 	9.4		570000
[PB Info 2024-Jun-05 15:29:10] 	9.5		570000
[PB Info 2024-Jun-05 15:29:16] 	9.6		580000
[PB Info 2024-Jun-05 15:29:22] 	9.7		580000
[PB Info 2024-Jun-05 15:29:28] 	9.8		590000
[PB Info 2024-Jun-05 15:29:34] 	9.9		600000
[PB Info 2024-Jun-05 15:29:40] 	10.0		610000
[PB Info 2024-Jun-05 15:29:46] 	10.1		630000
[PB Info 2024-Jun-05 15:29:52] 	10.2		630000
[PB Info 2024-Jun-05 15:29:58] 	10.3		630000
[PB Info 2024-Jun-05 15:30:04] 	10.4		650000
[PB Info 2024-Jun-05 15:30:10] 	10.5		650000
[PB Info 2024-Jun-05 15:30:16] 	10.6		660000
[PB Info 2024-Jun-05 15:30:22] 	10.7		670000
[PB Info 2024-Jun-05 15:30:28] 	10.8		670000
[PB Info 2024-Jun-05 15:30:34] 	10.9		690000
[PB Info 2024-Jun-05 15:30:40] 	11.0		700000
[PB Info 2024-Jun-05 15:30:46] 	11.1		700000
[PB Info 2024-Jun-05 15:30:52] 	11.2		700000
[PB Info 2024-Jun-05 15:30:58] 	11.3		710000
[PB Info 2024-Jun-05 15:31:04] 	11.4		720000
[PB Info 2024-Jun-05 15:31:10] 	11.5		740000
[PB Info 2024-Jun-05 15:31:16] 	11.6		750000
terminate called after throwing an instance of 'thrust::system::detail::bad_alloc'
  what():  std::bad_alloc: cudaErrorInvalidValue: invalid argument
[PB Error 2024-Jun-05 15:31:21][-unknown-:0] Received signal: 6
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks, exiting.
For technical support visit https://docs.nvidia.com/clara/index.html#parabricks
Exiting...
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation



Could not run minimap2
Exiting pbrun ...

Thanks, Caspar

I can confirm that this issue is now resolved. I reran the whole basecalling pipeline for this dataset, so my current assumption is that the original FASTQ input was corrupted.
After first trying different parameter options, I found that the mapping also works with the recommended parameters from the initial post. So it looks like the problem was on our side and not caused by minimap2.
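
A corrupted input of this kind can often be caught before mapping. The following is a minimal sketch, not something that was run in this thread: `check_fastq_gz` is a hypothetical helper name, and it assumes a gzipped FASTQ with strict four-line records (header, sequence, `+` separator, quality), which is a common basecaller output layout but not the only valid one.

```shell
# Hypothetical helper (not part of Parabricks): basic integrity check for a
# gzipped FASTQ file. Assumes strict four-line records.
check_fastq_gz() {
    fq="$1"

    # Step 1: gzip -t tests the compressed stream (CRC and length), which
    # catches truncated or damaged archives.
    if ! gzip -t "$fq" 2>/dev/null; then
        echo "gzip integrity check failed: $fq" >&2
        return 1
    fi

    # Step 2: walk the records and stop at the first structural problem.
    # A bare "exit" in a rule jumps to END, which reports and sets the status.
    gzip -dc "$fq" | awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" { msg = "header does not start with @"; exit }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" { msg = "separator does not start with +"; exit }
        NR % 4 == 0 && length($0) != seqlen    { msg = "sequence and quality lengths differ"; exit }
        END {
            if (msg == "" && NR % 4 != 0) msg = "file ends in the middle of a record"
            if (msg != "") { printf "line %d: %s\n", NR, msg; exit 1 }
        }
    ' >&2
}
```

It could be called as `check_fastq_gz TEST_TUE_01.SUP.fastq.gz && echo OK` before starting `pbrun minimap2`. It only covers damaged archives and malformed records, so whether it would have flagged the input that triggered the `bad_alloc` above is unknown.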

Thanks for your help. I hope our inquiry did not trigger too much investigation.
Caspar
