Issue with ONNX inference on GPU

Hi,

We run the AI model “channel_estimator.onnx” from cubb 25-2 on an A100 GPU, and we were surprised to find that only CUDA cores are used. Is this normal?

“channel_estimator.onnx” contains many conv operations. In our understanding, these would normally be dispatched to Tensor Cores.

We call the model from Python and enable TF32 with “torch.backends.cuda.matmul.allow_tf32 = True / torch.backends.cudnn.allow_tf32 = True”.
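For reference, this is the snippet we use to set those flags. Note our assumption that these PyTorch backend flags only affect kernels launched by PyTorch itself; if the ONNX model is actually executed through a different runtime (e.g. ONNX Runtime's CUDA execution provider), that runtime has its own precision settings and would not see these flags.

```python
import torch

# Allow TF32 on Tensor Cores for matmul-like ops in PyTorch kernels
torch.backends.cuda.matmul.allow_tf32 = True
# Allow TF32 in cuDNN convolutions (relevant for the conv-heavy model)
torch.backends.cudnn.allow_tf32 = True
```

If the model runs outside PyTorch, these two lines have no effect on it, which could explain why only CUDA cores are observed.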

Is there anything we have missed? How can we get the model to use Tensor Cores as expected?

Thanks!