If it's not even as good as FP8, you should probably have a look at "Introducing PrismaScout -- PrismaQuant v2!"
norman.2