Hi,
We run the AI model “channel_estimator.onnx” from cuBB 25-2 on an A100 GPU, and we were surprised to find that only CUDA cores are used. Is this normal?
“channel_estimator.onnx” contains many conv operations. In our understanding, these would normally be executed on Tensor Cores.
We call the model from Python and enable TF32 with “torch.backends.cuda.matmul.allow_tf32 = True” and “torch.backends.cudnn.allow_tf32 = True”.
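For reference, this is essentially how we set the flags before inference (a minimal sketch, not our full script; it assumes PyTorch on an Ampere-class GPU such as the A100):

```python
import torch

# On Ampere GPUs, TF32 allows FP32 matmuls and cuDNN convolutions
# to be executed on Tensor Cores instead of CUDA cores.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# Confirm the flags are actually set in this process.
print(torch.backends.cuda.matmul.allow_tf32)  # True
print(torch.backends.cudnn.allow_tf32)        # True
```

We set these flags in the same process before loading and running the model.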
Is there anything we are missing? How can we get the model to run on Tensor Cores?
Thanks!