cublasLT FP8

I am running LtFp8Matmul example of cublasLT

The example uses the e4m3 FP8 format. However, when I change it to e5m2, I hit this error:
cuBLAS API failed with status 15
terminate called after throwing an instance of 'std::logic_error'
  what():  cuBLAS API failed
Aborted (core dumped)

I only changed the data type: I replaced all __nv_fp8_e4m3 with __nv_fp8_e5m2, and in the cublasLtMatrixLayoutCreate calls I replaced CUDA_R_8F_E4M3 with CUDA_R_8F_E5M2.
I am not sure why e5m2 is not working at all; in theory, the flow should be the same as for e4m3.
Any inputs here? Thanks!

Status 15 is CUBLAS_STATUS_NOT_SUPPORTED, and I don't think A * B where both A and B are e5m2 is supported: that combination is not in the table of supported FP8 data types in the docs: cuBLAS
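If you do need e5m2 data, the supported combinations pair e5m2 with e4m3 rather than using it for both operands. A minimal sketch of what the layout setup could look like for that mixed case, assuming the m/n/k and leading-dimension variables from the LtFp8Matmul sample (this is an illustration of the type combination, not the full sample):

```cpp
#include <cublasLt.h>
#include <cstdint>

// Sketch only: one FP8 operand per format, as the cublasLtMatmul
// data-type table allows, instead of e5m2 for both A and B.
// m, n, k, lda, ldb, ldc are assumed to come from the surrounding sample.
void createMixedFp8Layouts(uint64_t m, uint64_t n, uint64_t k,
                           int64_t lda, int64_t ldb, int64_t ldc,
                           cublasLtMatrixLayout_t &Adesc,
                           cublasLtMatrixLayout_t &Bdesc,
                           cublasLtMatrixLayout_t &Cdesc) {
    // A stays e4m3; B switches to e5m2.
    cublasLtMatrixLayoutCreate(&Adesc, CUDA_R_8F_E4M3, m, k, lda);
    cublasLtMatrixLayoutCreate(&Bdesc, CUDA_R_8F_E5M2, k, n, ldb);
    // FP8 matmul accumulates in higher precision; the output type
    // here (FP16) is one of the supported choices.
    cublasLtMatrixLayoutCreate(&Cdesc, CUDA_R_16F, m, n, ldc);
}
```

Note the sample also sets transpose attributes on the matmul descriptor (FP8 requires a TN layout), which is unchanged by the data-type swap.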

You want weights and activations in e4m3 (more mantissa bits, so more precision) and gradients in e5m2 (more exponent bits, so more dynamic range):
https://www.reddit.com/r/MachineLearning/comments/7bi5yd/comment/dpk2ldm/