Will the FP8-related scheme in TransformerEngine be upstreamed to PyTorch?

Hi there,

I’m very interested in the FP8 scheme in TransformerEngine, in particular DelayedScaling and fp8_autocast. It adds support for a brand-new data type and uses delayed scaling for calibration, unlike int8 quantization.
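For readers unfamiliar with the recipe mentioned above: the core idea of delayed scaling is that the quantization scale for a tensor is derived from a rolling history of previously observed absolute-maximum (amax) values rather than from the current tensor itself. Below is a minimal NumPy sketch of that idea; the class name, parameters, and the omission of actual FP8 rounding are illustrative assumptions, not TransformerEngine's implementation.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest representable magnitude in the E4M3 format


class DelayedScalingSketch:
    """Illustrative sketch of delayed scaling (not the TransformerEngine API)."""

    def __init__(self, history_len=16):
        self.history_len = history_len
        self.amax_history = []
        self.scale = 1.0

    def update(self, tensor):
        # Record the current amax and keep a bounded history window.
        self.amax_history.append(float(np.max(np.abs(tensor))))
        self.amax_history = self.amax_history[-self.history_len:]
        # The scale comes from the history, not only the current tensor --
        # this is what makes the scaling "delayed".
        amax = max(self.amax_history)
        self.scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0

    def quantize(self, tensor):
        # Scale into FP8 range and clip; rounding to actual FP8 bits is omitted.
        return np.clip(tensor * self.scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)


# Example: the scale tracks the running amax history.
ds = DelayedScalingSketch()
ds.update(np.array([0.0, 2.0, 4.0]))  # amax history: [4.0], scale = 448 / 4 = 112
ds.update(np.array([1.0]))            # history max is still 4.0, so scale stays 112
```

The point of the history window is robustness: a single small batch does not collapse the scale, so quantization stays stable across steps.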

Since it already runs with PyTorch, I’m wondering whether this FP8 scheme will be upstreamed to the PyTorch community in the near future. As far as I know, PyTorch currently only supports the FP8 data types themselves, without scaling, so the FP8 support in TransformerEngine could fill that gap.

Thanks!

You may get better help by asking on discuss.pytorch.org; there are NVIDIA PyTorch experts there.