Issue with ONNX inference on GPU

Hi,

We run the AI model “channel_estimator.onnx” from cubb 25-2 on an A100 GPU, and we were surprised to find that only CUDA cores are used. Is this normal?

“channel_estimator.onnx” contains many conv operations. In our understanding, these would normally be dispatched to Tensor Cores.

We call the model from Python and enable TF32 with “torch.backends.cuda.matmul.allow_tf32 = True / torch.backends.cudnn.allow_tf32 = True”.
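For reference, this is the snippet we use to set those flags. Note our assumption that these PyTorch backend flags only affect kernels launched by PyTorch itself; if the ONNX model is actually executed through a different runtime (e.g. ONNX Runtime's CUDA execution provider), that runtime has its own precision settings and would not see these flags.

```python
import torch

# Allow TF32 on Tensor Cores for matmul-like ops in PyTorch kernels
torch.backends.cuda.matmul.allow_tf32 = True
# Allow TF32 in cuDNN convolutions (relevant for the conv-heavy model)
torch.backends.cudnn.allow_tf32 = True
```

If the model runs outside PyTorch, these two lines have no effect on it, which could explain why only CUDA cores are observed.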

Is there anything we have missed? How can we get the model to use Tensor Cores as expected?

Thanks!