Is this already supported on Hopper, or is it on the future roadmap?
Thanks
Gino
As I understand it, TE 2.0 supports MX scaling only on Blackwell, not on Hopper. Since the current best practice for FP8 on Hopper is DeepSeek-V3's group-wise scaling, will a future TE release support group-wise scaling on Hopper, instead of per-tensor scaling?
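To make the distinction concrete, here is a rough NumPy sketch of per-tensor versus group-wise scale computation. It is only an illustration and does not use TE's API: the function names are hypothetical, and the group size of 128 along the last axis is an assumption modelled on the 1x128 activation tiles described for DeepSeek-V3.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def per_tensor_scale(x):
    """Hypothetical helper: a single scale shared by the whole tensor."""
    amax = np.abs(x).max()
    return FP8_E4M3_MAX / max(amax, 1e-12)


def group_wise_scales(x, group_size=128):
    """Hypothetical helper: one scale per contiguous group along the last axis.

    group_size=128 is an assumption, not a TE default.
    """
    rows, cols = x.shape
    assert cols % group_size == 0, "last dim must be a multiple of group_size"
    groups = x.reshape(rows, cols // group_size, group_size)
    amax = np.abs(groups).max(axis=-1)  # shape: (rows, cols // group_size)
    return FP8_E4M3_MAX / np.maximum(amax, 1e-12)


# A tensor of ones with a single outlier in the first group of row 0.
x = np.ones((2, 256), dtype=np.float32)
x[0, 0] = 448.0

tensor_scale = per_tensor_scale(x)   # the outlier sets the scale for every element
group_scales = group_wise_scales(x)  # only the outlier's own group is affected
```

With per-tensor scaling, the one outlier forces a small scale on every element, while with group-wise scaling only the group that contains the outlier gets the small scale and the other groups keep the full FP8 range.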