NVIDIA Developer Forums

Running nvidia/Nemotron-Nano-VL-12B-V2-NVFP4-QAD on your Spark

raphael.amorim November 24, 2025, 5:40pm 4

I’ve actually cut model load time by about 7x by switching to the fastsafetensors library in the latest commit.
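
If you want to check a load-time claim like this on your own machine, here is a minimal timing sketch. It uses only the Python standard library; `best_load_time` and `speedup` are hypothetical helper names, not part of fastsafetensors or of the project's commit, and the loader callables are placeholders you would supply yourself.

```python
import time
from typing import Callable


def best_load_time(load_fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        load_fn()  # hypothetical: whatever function loads your checkpoint
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline time to optimized time (7.0 means 7x faster)."""
    if optimized_s <= 0:
        raise ValueError("optimized_s must be positive")
    return baseline_s / optimized_s


# Usage (placeholders, assumed names):
#   t_old = best_load_time(load_with_default_loader)
#   t_new = best_load_time(load_with_fastsafetensors)
#   print(f"speedup: {speedup(t_old, t_new):.1f}x")
```

Note that repeated runs may be served from the OS page cache, so compare both loaders under the same cache conditions (both cold or both warm).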
