Hello everyone,
We are encountering some issues with whole-genome sequencing (WGS) analysis using Parabricks Version 4.4.0-1. Has anyone else encountered similar issues? Thank you very much in advance.
Problem Description
When running Parabricks for whole-genome analysis, we observed that some datasets fail with the same error, while others complete normally. The specific error message is as follows:
[PB Info 2025-Jan-11 12:40:48] 2:1416001 0.5 1338291 2676582
terminate called after throwing an instance of 'std::overflow_error'
what(): robin_hood::map overflow
[PB Error 2025-Jan-11 12:40:55][-unknown-:0] Received signal: 6
For technical support visit NVIDIA Clara - NVIDIA Docs, exiting.
Troubleshooting Steps
To diagnose the issue, we first tested the BAM files produced by the fq2bam step and found that GATK could process them normally. This led us to suspect that the error is primarily related to the HTVC (HaplotypeCaller) step.
We then restricted the analysis to specific intervals, which allowed us to reproduce the error. The problematic intervals are primarily located around GL000220.1:143740-143850 and chr2:55884500-55884610. The reference genome sequences in these regions are as follows:
>GL000220.1:143740-143850
CAGTTAGTTTTTGTAATTTTTTTTTTTTTTTTTTTTTTTTGAGACGAGGTTTCACCGTGTTGCCAAGGCTTGGACCGAGGGATCCACCGGCCCTCGGCCTCCCAAAAGTGC
>chr2:55884500-55884610
CAGATTAACAAGAATTTTTTTTTTGTTTTTTCTTTTTTTTTAAGACAGAGTTCTGCTCTTGTTGCCCAGGCTGGCGTGCAATGGTGCAATCTCGGCTCACTGCAACCTCTG
We observed that both of these sequences contain long poly-T stretches, but we do not know whether this is related to the error. A BAM file that reproduces the error in chr2:55884500-55884610 is available at https://pan.quark.cn/s/58f79294bc0b .
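To put a number on the poly-T observation, here is a minimal Python sketch that lists homopolymer runs in the two sequences above. The function name `homopolymer_runs` and the `min_len=8` threshold are illustrative assumptions, not part of Parabricks or GATK; offsets are 0-based within the pasted sequence.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=8):
    """Return (base, offset, length) for every single-base run >= min_len.

    offset is 0-based within seq; min_len is an arbitrary illustrative cutoff.
    """
    runs = []
    offset = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((base, offset, length))
        offset += length
    return runs


# Reference sequences copied from the post.
regions = {
    "GL000220.1:143740-143850": (
        "CAGTTAGTTTTTGTAATTTTTTTTTTTTTTTTTTTTTTTTGAGACGAGGTTTCACCGTGTTGCCAAGG"
        "CTTGGACCGAGGGATCCACCGGCCCTCGGCCTCCCAAAAGTGC"
    ),
    "chr2:55884500-55884610": (
        "CAGATTAACAAGAATTTTTTTTTTGTTTTTTCTTTTTTTTTAAGACAGAGTTCTGCTCTTGTTGCCCA"
        "GGCTGGCGTGCAATGGTGCAATCTCGGCTCACTGCAACCTCTG"
    ),
}

for name, seq in regions.items():
    for base, offset, length in homopolymer_runs(seq):
        print(f"{name}: {base} x {length} at offset {offset}")
```

Adding the offset to the interval start (for example 55884500 for the chr2 region) gives the reference coordinate where a run begins.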
We also tried changing the sequence tags and base quality values in the FASTQ files, but this did not resolve the error.
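For anyone who wants to reproduce the interval narrowing on their own data, the following Python sketch shows one way to automate it by bisection. It assumes that a single locus triggers the crash, that a crash shows up as a non-zero exit code from `pbrun`, and that the `--ref`, `--in-bam`, `--out-variants` and `--interval` flags match your installed Parabricks release (please check `pbrun haplotypecaller --help`). The function names, file paths and `min_size` are hypothetical.

```python
import subprocess


def pbrun_fails(ref, bam, chrom, start, end, out_vcf="bisect_tmp.vcf"):
    """Run HaplotypeCaller on one interval; return True if it exits non-zero.

    Assumption: these flag names match the installed Parabricks release.
    """
    cmd = [
        "pbrun", "haplotypecaller",
        "--ref", ref,
        "--in-bam", bam,
        "--out-variants", out_vcf,
        "--interval", f"{chrom}:{start}-{end}",
    ]
    return subprocess.run(cmd).returncode != 0


def bisect_failing_interval(chrom, start, end, fails, min_size=100):
    """Shrink [start, end] for as long as one half still reproduces the failure.

    fails(chrom, start, end) must return True when the run crashes.
    """
    while end - start + 1 > min_size:
        mid = (start + end) // 2
        if fails(chrom, start, mid):
            end = mid
        elif fails(chrom, mid + 1, end):
            start = mid + 1
        else:
            # Neither half fails on its own, so stop shrinking here.
            break
    return chrom, start, end


# Example call (paths and the starting window are placeholders):
# bisect_failing_interval(
#     "chr2", 55_000_000, 56_000_000,
#     lambda c, s, e: pbrun_fails("ref.fa", "sample.bam", c, s, e),
# )
```

If the caller pads intervals internally, the final boundaries should be treated as approximate.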
Potential Wider Impact
Although we’ve only identified this issue in the GL000220.1:143740-143850 and chr2:55884500-55884610 intervals so far, we suspect that similar problems might exist in other chromosomes or segments.
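If the poly-T stretches do turn out to matter (which is unconfirmed), one way to look for other candidate regions is to scan the reference for similar runs. The sketch below is plain Python with no external libraries; `read_fasta`, `poly_t_intervals`, `min_len=20` and `flank=50` are assumptions chosen for illustration, and it only looks at T runs on the forward strand of an uncompressed FASTA file.

```python
import re


def read_fasta(path):
    """Yield (name, sequence) pairs from an uncompressed FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first word of the header as the contig name.
                name, chunks = (line[1:].split() or [""])[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def poly_t_intervals(fasta_path, min_len=20, flank=50):
    """Return 1-based 'contig:start-end' strings around each T run >= min_len."""
    pattern = re.compile(r"T{%d,}" % min_len, re.IGNORECASE)
    intervals = []
    for name, seq in read_fasta(fasta_path):
        for match in pattern.finditer(seq):
            start = max(1, match.start() + 1 - flank)
            end = min(len(seq), match.end() + flank)
            intervals.append(f"{name}:{start}-{end}")
    return intervals
```

The resulting interval strings could then be tested one at a time to see whether the same overflow appears elsewhere.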
We are reaching out to see if anyone else has encountered similar issues or has suggestions on how to resolve this. Any insights or advice would be greatly appreciated!
Best regards