| Topic | Replies | Views | Activity |
|---|---|---|---|
| Qwen3.5-122B-A10B on single Spark: up to 51 tok/s (v2.1 — patches + quick-start + benchmark) | 398 | 14237 | May 9, 2026 |
| DGX Spark Performance Degradation - GPU Power Draw Issue | 50 | 2444 | May 9, 2026 |
| T5000 - Sample to use full performance | 1 | 25 | May 7, 2026 |
| Slow wi-fi on orin nano devkit (RTL8822CE 802.11ac) | 8 | 46 | May 7, 2026 |
| Dual DGX Spark: NCCL capped at 2.80 GB/s + ib_write_bw crashes at 128KB syndrom 0x88 — matches thread 366266 with additional RoCE degradation | 2 | 131 | April 20, 2026 |
| GPU needs to be "warmed up" to achieve maximum performance | 22 | 683 | May 5, 2026 |
| NeuralForge GPU Native Knowledge Intelligence Platform Built on DGX Spark GB10 | 4 | 294 | April 17, 2026 |
| Just another ASUS GX10 NCCL all_gather_perf thread... mpirun... please read if you have an ASUS model multinode setup | 3 | 388 | April 16, 2026 |
| NCCL bandwidth capped at 3 GB/s, GPU PCIe topology reports Gen1 x1 on DGX Spark FE | 6 | 272 | April 14, 2026 |
| Qwen3.5 Flash Attention performance inconsistencies | 1 | 355 | April 5, 2026 |
| Latest Update (20Mar 2026) on Nvidia Spark FE caps GPU performance | 9 | 564 | April 3, 2026 |
| VK_EXT_descriptor_heap: Uniform buffer loads use global memory loads instead of constant loads | 0 | 143 | March 14, 2026 |
| How to tell core utilization when running headless? | 5 | 143 | March 28, 2026 |
| Degraded performance on L4T 32.X vs L4T 35.X | 2 | 31 | March 2, 2026 |
| OpenGL texture view performance | 0 | 74 | February 2, 2026 |
| High Latency and GPU Contention when running DeepStream (Python) + VSS on DGX Platform | 4 | 224 | January 23, 2026 |
| A new GPU-accelerated prime sieve using constant-cost structural elimination to overcome memory bandwidth limits at massive scales | 5 | 185 | January 21, 2026 |
| Knema – Frame Continuity Engine | 0 | 23 | January 16, 2026 |
| Nvidia Powerd Dynamic boost not increasing the power limit | 2 | 174 | January 13, 2026 |
| Help on llama.cpp command line arguments and compilation settings (performance testing included) | 7 | 2034 | January 9, 2026 |
| `rte_flow_async_create` takes a huge 1 million cycles on ConnectX-6 Dx DX | 2 | 82 | January 7, 2026 |
| How to correlate range profiling metrics with a certain kernel? | 3 | 101 | January 30, 2026 |
| cuDNN Bug Report: Conv3d Performance Regression with bfloat16/float16 on H100 | 2 | 235 | December 31, 2025 |
| What is APP ( Adjusted Peak Performance) of Jetson AGX Orin 64GB? | 2 | 62 | December 18, 2025 |
| DGX Spark Image and Video Generation performance? | 2 | 1179 | December 15, 2025 |
| Unable to run benchmark on Jetson Orin NX 16GB | 7 | 141 | December 4, 2025 |
| Unexpected Performance Behavior with CUDA Software Prefetcher, Warm-Up Kernel and GEMV | 10 | 183 | December 3, 2025 |
| Optimizing PTX mma ops on volta to surpass wmma | 2 | 95 | November 30, 2025 |
| Orin NX: Frame loss when encoding 6 UHP H265 1080p60 streams | 6 | 153 | November 28, 2025 |
| MSI EdgeXpert vs DGX Spark | 2 | 1441 | November 26, 2025 |