| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| vLLM 0.17.0 wheel for Jetson Orin with Marlin GPTQ (SM 8.7) | 0 | 13 | March 15, 2026 |
| "RTX 5090 + 5070 Ti Multi-GPU Training: CUDA Driver Crash During Backward Pass (sm_120, PyTorch, gradient_checkpointing)" | 0 | 14 | March 15, 2026 |
| RAG Blueprint on DGX Spark (ARM64 / GB10): NIMs & Milvus OK, but ingestor-server / rag-server fail with exec format error | 5 | 401 | March 15, 2026 |
| Vulkan as alternative backend for llama.cpp | 1 | 47 | March 15, 2026 |
| Building Local + Hybrid LLMs on DGX Spark That Outperform Top Cloud Models | 19 | 3031 | March 15, 2026 |
| To NVIDIA Staff: Is This a Hardware Issue Requiring Repeated Shutdowns and RMA Under High Load? | 23 | 529 | March 14, 2026 |
| LLM library recommendations for maximum token speeds | 10 | 246 | March 14, 2026 |
| NVIDIA GreenBoost kernel modules open-sourced | 1 | 672 | March 14, 2026 |
| Nemotron-3-Super 120B on GB10 — llama.cpp sm_121 build + Ollama GGUF incompatibility fix | 0 | 105 | March 14, 2026 |
| Missing vision reasoning with Qwen3.5-122B Q4 on vLLM (works on llama.cpp) | 4 | 300 | March 13, 2026 |
| How do I run Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled on vllm community docker? | 4 | 291 | March 13, 2026 |
| "unable to allocate CUDA0 buffer" after updating Ubuntu packages | 244 | 13922 | March 13, 2026 |
| DGX crashing after ~8 hours: Stability issues switching from GLM 4.6v (FP8) to Qwen 122B (Q6) in llama.cpp | 1 | 114 | March 13, 2026 |
| HOW-TO: Run Qwen3-Coder-Next on Spark | 89 | 6559 | March 12, 2026 |
| New tool: llama-benchy - llama-bench style benchmarking for ANY LLM backend (vLLM, SGLang, llama.cpp, etc.) | 9 | 776 | March 12, 2026 |
| Single node and dual node llama.cpp build flags | 5 | 62 | March 11, 2026 |
| (sparkrun) Qwen3.5 GGUF Benchmarks over llama.cpp RPC | 3 | 443 | March 11, 2026 |
| HOW-TO: setup-dgx-spark docker inference - A "Sane" Inference Stack for GB10 (Need Contributors!) | 30 | 1058 | March 11, 2026 |
| Distributed Spark | 2 | 78 | March 10, 2026 |
| NVIDIA ACE: Model Archive | 1 | 55 | March 10, 2026 |
| Open-Source CLI Agent Framework for NVIDIA AI Endpoints - Seeking Feedback | 2 | 38 | March 10, 2026 |
| CUDA headers in crt/math_functions.h still broken in debian-13 repo | 1 | 31 | March 9, 2026 |
| CUDA headers in crt/math_functions.h still broken in debian-13 repo | 1 | 19 | March 9, 2026 |
| VSS Jetson Thor: GPU memory increase during summarization causes OOM unless VSS is restarted | 2 | 37 | March 9, 2026 |
| MSI EdgeXpert Suddenly Powers Off During llama-benchy – Possible PD Firmware Issue? | 25 | 294 | March 9, 2026 |
| Missing official native ARM64 NIM images for essential AI models | 5 | 429 | March 9, 2026 |
| Home Assistant on DGX? | 5 | 236 | March 9, 2026 |
| Max observed wattage | 4 | 136 | March 8, 2026 |
| DGX Spark crashes when running tensorrt-llm | 3 | 167 | March 7, 2026 |
| TRT LLM inference with NVFP4 safetensors slower than LM Studio GGUF on the Spark | 9 | 1035 | March 6, 2026 |