An error when using trtexec on an RTX 4070
TensorRT Version: TensorRT-22.214.171.124.Windows10.x86_64.cuda-12.0
GPU Type: RTX4070
Nvidia Driver Version: 535
CUDA Version: 12.0.1_528
CUDNN Version: 126.96.36.199
Operating System + Version: Win10
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Baremetal
I used the ultralytics repository (see below) to export yolov8s.onnx:
yolov8s.onnx (42.8 MB)
Steps To Reproduce
Export yolov8s.onnx using
GitHub - ultralytics/ultralytics: NEW - YOLOv8 🚀 in PyTorch > ONNX > CoreML > TFLite with the command
yolo detect export model=yolov8s.pt format=onnx
Then copy the file to the directory containing trtexec.exe and run
.\trtexec.exe --onnx=yolov8s.onnx
This fails with the following error:
Error: Unexpected exception KTM assertion failure: C:\_src\externals\ktm\src\timingModel.cpp:382 smClk > 0
Here is my log:
error.log (687.2 KB)
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:
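1) Validate your model with the snippet below:
import onnx
filename = "yourONNXmodel.onnx"  # placeholder: path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises an exception if the model is invalid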
2) Try running your model with the trtexec command.
If you are still facing issues, please share the trtexec --verbose log for further debugging.
yolov8s.onnx (42.8 MB)
Here is the onnx file
Based on the following logs, it looks like CUDA is reporting the device clock rates incorrectly:
[07/05/2023-09:50:21] [I] === Device Information ===
[07/05/2023-09:50:21] [I] Selected Device: NVIDIA GeForce RTX 4070
[07/05/2023-09:50:21] [I] Compute Capability: 8.9
[07/05/2023-09:50:21] [I] SMs: 46
[07/05/2023-09:50:21] [I] Device Global Memory: 12281 MiB
[07/05/2023-09:50:21] [I] Shared Memory per SM: 100 KiB
[07/05/2023-09:50:21] [I] Memory Bus Width: 192 bits (ECC disabled)
[07/05/2023-09:50:21] [I] Application Compute Clock Rate: 0 GHz
[07/05/2023-09:50:21] [I] Application Memory Clock Rate: 0 GHz
[07/05/2023-09:50:21] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
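If it helps to check other logs for the same symptom, here is a minimal Python sketch that scans trtexec output for "Application ... Clock Rate" lines and lists the clocks reported as 0 GHz. The function names (`find_clock_rates`, `zero_clocks`) and the `SAMPLE_LOG` excerpt are hypothetical helpers for illustration, not part of trtexec, and the regular expression assumes the line format shown in the log above.

```python
import re

# Assumed line format, taken from the log excerpt above, e.g.:
# [07/05/2023-09:50:21] [I] Application Compute Clock Rate: 0 GHz
CLOCK_LINE = re.compile(
    r"\[I\]\s+Application (Compute|Memory) Clock Rate:\s*([0-9]*\.?[0-9]+)\s*GHz"
)


def find_clock_rates(log_text):
    """Return a dict mapping 'Compute'/'Memory' to the reported rate in GHz."""
    rates = {}
    for line in log_text.splitlines():
        match = CLOCK_LINE.search(line)
        if match:
            rates[match.group(1)] = float(match.group(2))
    return rates


def zero_clocks(log_text):
    """Return the names of clocks reported as 0 GHz, sorted alphabetically."""
    rates = find_clock_rates(log_text)
    return sorted(name for name, ghz in rates.items() if ghz == 0.0)


# Hypothetical excerpt copied from the device information block above.
SAMPLE_LOG = """\
[07/05/2023-09:50:21] [I] SMs: 46
[07/05/2023-09:50:21] [I] Application Compute Clock Rate: 0 GHz
[07/05/2023-09:50:21] [I] Application Memory Clock Rate: 0 GHz
"""

if __name__ == "__main__":
    # Lists which application clocks were reported as zero in the sample.
    print(zero_clocks(SAMPLE_LOG))
```

To run it against a full log, read the file and pass its text in, for example `zero_clocks(open("error.log").read())`. A non-empty result would match the zero clock rates seen here, which is consistent with the `smClk > 0` assertion failing.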
Could you please run the following CUDA sample and confirm whether it works correctly?