Very slow mmap on DGX Spark that affects model loading - questions to NVIDIA

sumitg · November 20, 2025, 11:24am

With mmap, loading a huge model can be slower because of the large number of page faults and their overhead: the ~60GB is brought into the page cache lazily, page by page.

Could you run the command below and check whether it helps?

$ sudo bash -c "echo 8192 > /sys/block/nvme0n1/queue/read_ahead_kb"

Increasing ‘read_ahead_kb’ for the NVMe device can help the kernel prefetch bigger chunks, which reduces the number of page faults.
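To make it easier to compare before and after, here is a minimal sketch that records the current value before changing it. The helper name `set_read_ahead` is made up for illustration; the sysfs path and the value 8192 are the ones from the command above, and your device name may differ from nvme0n1.

```shell
#!/usr/bin/env bash
# Sketch: set read_ahead_kb and report the old and new values.
# set_read_ahead is a hypothetical helper name, not an existing tool.
set_read_ahead() {
    local path="$1"    # e.g. /sys/block/nvme0n1/queue/read_ahead_kb
    local new_kb="$2"  # e.g. 8192
    local old_kb
    old_kb="$(cat "$path")" || return 1      # remember the current setting
    echo "$new_kb" > "$path" || return 1     # needs root for the real sysfs file
    echo "read_ahead_kb: ${old_kb} -> $(cat "$path")"
}

# Example (run as root, because the sysfs file is only writable by root):
#   set_read_ahead /sys/block/nvme0n1/queue/read_ahead_kb 8192
```

The setting is written through sysfs, so it does not survive a reboot. To undo it, write the old value printed by the helper back to the same file.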

I tried it and observed that on kernel v6.17, mmap load time dropped by ~50% and no-mmap load time dropped by ~35%.

On kernel v6.14, mmap load time didn’t drop significantly, while no-mmap load time dropped by ~50%.

This may be due to improvements in the kernel’s readahead code between v6.14 and v6.17.
