Parabricks fq2bam CRAM output fails samtools index with CRAM slice offset does not match landmark

Hi NVIDIA Parabricks team,

I’m running Parabricks fq2bam as part of an nf-core/sarek-derived workflow on AWS Batch GPU instances, and I hit a CRAM integrity/indexing issue with one sample. The Parabricks fq2bam task completed successfully with exit code 0, but the downstream samtools index step failed on the CRAM produced by Parabricks.

Environment

  • Parabricks container: nvcr.io/nvidia/clara/clara-parabricks:4.7.0-1
  • Tool: pbrun fq2bam
  • Downstream indexer: samtools 1.21
  • Workflow: nf-core/sarek-derived Nextflow workflow
  • Platform: AWS Batch / Seqera Platform
  • Instance shape: GPU AWS Batch environment, using 4 GPUs
  • Reference: GRCh38 / GATK bundle-style reference files

Parabricks command shape

The task used pbrun fq2bam with paired FASTQs, GRCh38 BWA index/reference, known sites for BQSR, and interval restriction. The relevant options were approximately:

pbrun fq2bam \
  --ref <GRCh38_BWA_index_prefix> \
  --in-fq <sample>_R1.fastq.gz <sample>_R2.fastq.gz \
  --out-bam <sample>.cram \
  --knownSites dbsnp_146.hg38.vcf.gz \
  --knownSites Mills_and_1000G_gold_standard.indels.hg38.vcf.gz \
  --knownSites Homo_sapiens_assembly38.known_indels.vcf.gz \
  --out-recal-file <sample>.table \
  --interval-file wgs_calling_regions_noseconds.hg38.bed \
  --num-gpus 4 \
  --bwa-cpu-thread-pool 48 \
  --monitor-usage \
  --read-group-id-prefix <sample>.L1 \
  --read-group-sm <patient_sample> \
  --read-group-lb <sample> \
  --read-group-pl ILLUMINA \
  --bwa-options='-K 100000000 -Y' \
  --gpuwrite \
  --gpusort \
  --bwa-nstreams auto

The output extension was .cram, so Parabricks produced CRAM output.

Observed failure

Parabricks itself finished successfully, but downstream indexing failed:

samtools index -@ 0 22A0018864.cram

with:

[E::cram_index_container] CRAM slice offset 74642 does not match landmark 1 in container header (202490)
samtools index: failed to create index for "22A0018864.cram"

The failing CRAM was large: 74,939,777,141 bytes (about 75 GB). I was not able to download and inspect the full CRAM locally, but I did inspect the associated .crai and BQSR recalibration table. The BQSR table looked normal and showed that a large number of reads were processed, so this does not appear to be an obvious early task failure. The problem seems specific to the CRAM structure/indexability.

Why I suspect the CRAM output

The pipeline stage immediately upstream was Parabricks fq2bam, which completed successfully. The next stage was a standard samtools index of the Parabricks-produced CRAM. The failure message appears to be about inconsistent internal CRAM container/slice offsets rather than a missing file, truncated file, or reference mismatch.
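
A quick way to test the "not truncated" assumption on a node that can still read the file is samtools quickcheck, which verifies the header and the end-of-file marker. The sketch below is only an illustration: check_cram_eof is a hypothetical helper name, and it assumes samtools is on the PATH and the CRAM is readable from where it runs.

```shell
# Sketch: rule out a truncated file before suspecting the container layout.
# check_cram_eof is a hypothetical helper; `samtools quickcheck -v` is the real
# subcommand and exits non-zero when the header or EOF block is missing.
check_cram_eof() {
  if samtools quickcheck -v "$1"; then
    echo "quickcheck passed: $1"
  else
    echo "quickcheck failed: $1"
    return 1
  fi
}

# Usage (not run here): check_cram_eof 22A0018864.cram
```

A pass here would only show that the file has a valid header and EOF block; it says nothing about whether the slice landmarks inside each container are consistent.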

Questions

  1. Is pbrun fq2bam CRAM output expected to be fully compatible with samtools index from htslib/samtools 1.21?
  2. Are there known issues in Parabricks 4.7.0-1 with CRAM output, especially when using --gpuwrite, --gpusort, and/or --bwa-nstreams auto?
  3. Are there recommended settings for producing CRAM safely from fq2bam at this scale?
  4. Would you recommend avoiding CRAM output from fq2bam and writing BAM directly, then converting/indexing with another tool if CRAM is required?
  5. What additional diagnostics would be most useful if the full CRAM is too large to download? For example, would the .crai, .command.log, BQSR table, or selected byte ranges from the CRAM be useful?
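
As an example of a diagnostic that needs only the .crai, the index can be searched for the slice offset named in the error. This is a sketch under assumptions: it relies on the .crai layout described in the CRAM specification (gzip-compressed, tab-separated, with the container start byte offset in column 4, the slice offset relative to the container data in column 5, and the slice size in column 6), crai_find_slice_offset is a hypothetical helper name, and it is not known whether the .crai here records the offset samtools computed (74642) or the landmark from the container header (202490).

```shell
# Sketch: list .crai entries whose slice offset (column 5) equals a given value.
# Assumes the six-column, gzip-compressed .crai layout from the CRAM specification.
# crai_find_slice_offset is a hypothetical helper name.
crai_find_slice_offset() {
  gzip -dc "$1" | awk -F'\t' -v off="$2" \
    '$5 == off { print "container_start=" $4 " slice_offset=" $5 " slice_size=" $6 }'
}

# Usage (not run here), trying both values from the error message:
#   crai_find_slice_offset 22A0018864.cram.crai 74642
#   crai_find_slice_offset 22A0018864.cram.crai 202490
```

The container_start value of any matching entry gives the absolute byte position of the affected container, which would identify a byte range of the CRAM to extract and share.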

I can provide the Parabricks .command.log, the .crai, the BQSR recalibration table, and exact command/configuration details if useful. I cannot easily share the full CRAM due to size and data restrictions.

Thanks for any guidance on whether this is a known issue or if there are recommended Parabricks settings to avoid generating non-indexable CRAM output.

Hello,

Thanks for reaching out. We can confirm that we are able to reproduce this issue, and we believe it is related to the --gpuwrite flag. We will try to fix it in the next release. Until then, there are two ways to work around the issue:

  1. Convert the output CRAM file to BAM/SAM for downstream processing; the CRAM contents are still valid.
  2. Run fq2bam again without passing --gpuwrite.
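
The first workaround could look roughly like the following. This is a sketch, not the exact commands used by the Parabricks team: cram_to_indexed_bam is a hypothetical helper name, the thread count of 4 is arbitrary, and the reference FASTA passed with -T is assumed to be the same GRCh38 reference used for alignment.

```shell
# Sketch of workaround 1: decode the CRAM to BAM, then index the BAM.
# cram_to_indexed_bam is a hypothetical helper; arguments are
# <reference.fasta> <input.cram> <output.bam>.
cram_to_indexed_bam() {
  # -b writes BAM, -T supplies the reference needed to decode the CRAM,
  # -@ 4 adds worker threads (arbitrary choice for this sketch).
  samtools view -@ 4 -b -T "$1" -o "$3" "$2" &&
    samtools index -@ 4 "$3"
}

# Usage (not run here):
#   cram_to_indexed_bam Homo_sapiens_assembly38.fasta 22A0018864.cram 22A0018864.bam
```

The second workaround needs no extra tooling: it is the original pbrun fq2bam command with the --gpuwrite line removed.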