I am running the LtFp8Matmul example from the cuBLASLt samples.
The example uses the e4m3 FP8 format; however, when I change it to e5m2, I hit this error:
```
cuBLAS API failed with status 15
terminate called after throwing an instance of 'std::logic_error'
  what():  cuBLAS API failed
Aborted (core dumped)
```
(Status 15 is CUBLAS_STATUS_NOT_SUPPORTED.)
All I modified was the data type: I replaced every __nv_fp8_e4m3 with __nv_fp8_e5m2, and in the cublasLtMatrixLayoutCreate calls I replaced CUDA_R_8F_E4M3 with CUDA_R_8F_E5M2, roughly as sketched below.
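For reference, here is the shape of the change (a minimal standalone sketch, not the sample verbatim; the m/k/lda values and the Adesc/Adev names are placeholders I picked for illustration, and error handling is trimmed):

```cpp
// Sketch of the e4m3 -> e5m2 substitution; dimensions and names are
// placeholders, not the exact ones from the LtFp8Matmul sample.
#include <cstdio>
#include <cublasLt.h>
#include <cuda_fp8.h>
#include <cuda_runtime.h>

int main() {
    const uint64_t m = 64, k = 64;
    const int64_t lda = 64;

    // Matrix layout: was CUDA_R_8F_E4M3, now CUDA_R_8F_E5M2.
    cublasLtMatrixLayout_t Adesc = nullptr;
    cublasStatus_t st =
        cublasLtMatrixLayoutCreate(&Adesc, CUDA_R_8F_E5M2, m, k, lda);
    std::printf("cublasLtMatrixLayoutCreate status: %d\n", st);

    // Buffer element type: was __nv_fp8_e4m3, now __nv_fp8_e5m2.
    __nv_fp8_e5m2 *Adev = nullptr;
    cudaMalloc(reinterpret_cast<void **>(&Adev),
               m * k * sizeof(__nv_fp8_e5m2));

    cudaFree(Adev);
    cublasLtMatrixLayoutDestroy(Adesc);
    return 0;
}
```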
I am not sure why e5m2 is not working at all; in theory, the flow should be the same as for e4m3.
Any inputs here? Thanks!