We have just published a playbook on running speculative decoding across two DGX Spark systems to serve larger models. You can check it out here: Speculative Decoding | DGX Spark
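For readers new to the technique: speculative decoding pairs a small, fast "draft" model with the large "target" model. The draft proposes several tokens cheaply, the target verifies them in one pass, and only the agreeing prefix is kept. Below is a minimal toy sketch of that draft-and-verify loop; the `draft_model` and `target_model` functions are stand-in placeholders, not real LLMs, and nothing here reflects the playbook's actual implementation.

```python
# Toy sketch of the draft-and-verify loop behind speculative decoding.
# A cheap "draft" model proposes k tokens; the expensive "target" model
# checks them and keeps the longest prefix it agrees with.
# Both models are hypothetical stand-ins that emit integer token IDs.

def draft_model(context, k):
    # Placeholder cheap model: proposes the next k tokens.
    return [(context[-1] + 1 + i) % 100 for i in range(k)]

def target_model(context):
    # Placeholder expensive model: returns its single next-token choice.
    return (context[-1] + 1) % 100

def speculative_step(context, k=4):
    """One round: draft k tokens, verify with the target, and accept the
    longest agreeing prefix plus one token chosen by the target."""
    proposal = draft_model(context, k)
    accepted = []
    for tok in proposal:
        expected = target_model(context + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            # Target disagrees: take its correction and end the round.
            accepted.append(expected)
            break
    else:
        # All k draft tokens accepted; the target adds one bonus token.
        accepted.append(target_model(context + accepted))
    return accepted

tokens = [0]
for _ in range(3):
    tokens += speculative_step(tokens, k=4)
print(tokens)  # → [0, 1, 2, ..., 15]: 5 tokens accepted per round
```

In this toy, draft and target always agree, so every round yields k + 1 = 5 tokens for a single target-model verification pass; in practice the speedup depends on how often the draft model's proposals match the target's.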