Introducing PrismaQuant

Excited to release this!

Did you get a chance to look into this potential weight problem with the smaller models, @tenari?

Apparently the Qwen 3.6 27B model has weight issues as well, like the previous generation. Any possibility of integrating those fixes into PrismaQuant, @tenari? See, for example, this GitHub repo: AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash: Lossless abliteration of Qwen3.6-27B with NVFP4 hardware quantization for DGX Spark / Blackwell. BF16 (51 GB) + NVFP4 (26 GB) deployment guide, docker-compose, and QuickStart.

Per FernflowerAI’s empirical discovery, certain late SSM / GatedDeltaNet blocks in Qwen 3.5 / 3.6 hybrids have linear_attn.conv1d.weight σ inflated 50–100 % above the median across all SSM blocks. Left unrepaired, this manifests during long-context inference as coherence collapse and “philosophizing” loops, and it makes the model hypersensitive to downstream abliteration (amplifies the noise).
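For anyone who wants to check a checkpoint for this, here is a rough sketch of the kind of test the description above implies: compute σ for each block's linear_attn.conv1d.weight, take the median across blocks, and flag anything at or above 1.5x the median (the low end of the quoted 50–100% range). The function name, the 1.5 threshold, and the toy numbers are my own assumptions for illustration; this is not the code from the linked repo, and a real run would read the tensors from the checkpoint instead of hard-coded lists.

```python
from statistics import median, pstdev

def flag_inflated_blocks(weights_by_block, ratio_threshold=1.5):
    """Return {name: sigma / median_sigma} for blocks at or above the threshold.

    weights_by_block maps a tensor name to a flat list of its weight values.
    ratio_threshold=1.5 is an assumed cutoff (sigma 50% above the median).
    """
    # Population standard deviation of each block's flattened weights.
    sigmas = {name: pstdev(values) for name, values in weights_by_block.items()}
    med = median(sigmas.values())
    if med == 0:
        return {}  # degenerate case: no meaningful ratio
    return {
        name: sigma / med
        for name, sigma in sigmas.items()
        if sigma / med >= ratio_threshold
    }

# Toy data (made up): three "normal" blocks and one with roughly doubled spread.
toy_weights = {
    "layers.0.linear_attn.conv1d.weight": [-0.10, 0.10, -0.10, 0.10],
    "layers.1.linear_attn.conv1d.weight": [-0.11, 0.11, -0.11, 0.11],
    "layers.2.linear_attn.conv1d.weight": [-0.09, 0.09, -0.09, 0.09],
    "layers.3.linear_attn.conv1d.weight": [-0.20, 0.20, -0.20, 0.20],
}

flagged = flag_inflated_blocks(toy_weights)
for name, ratio in flagged.items():
    print(f"{name}: sigma is {ratio:.2f}x the median")
```

On the toy data only the last block is flagged. Whether a flagged block should be rescaled before quantization, and how, is a separate question that this sketch does not answer.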

I have not looked at it yet, but I definitely will. It’s the biggest annoyance with the Qwen family of models right now.