Description
Converting the ONNX model to TensorRT takes forever. GPU utilisation is at 100% with 550 MB of memory in use. I am able to convert the model on another GPU with other CUDA/TensorRT/Ubuntu versions.
Environment
TensorRT Version: TensorRT-10.1.0.27
GPU Type: NVIDIA GeForce RTX 3090
Nvidia Driver Version: 555.42.06
CUDA Version: 12.5
CUDNN Version: 8902
Operating System + Version: Ubuntu 24.04
Python Version (if applicable): 3.12.3
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.3.1+cu121
Baremetal or Container (if container which image + tag):
Relevant Files
trtexec.log (1.0 MB)
ONNX model attached with a .txt extension in case the download fails
parseq.txt (91.1 MB)
onnx model
Steps To Reproduce
onnxfile=parseq.onnx
trtfile=parseq.engine
trtexec --onnx=$onnxfile --saveEngine=$trtfile --verbose # --fp16 --verbose # --skipInference --buildOnly
The TensorRT conversion gets stuck here:
[08/19/2024-13:59:54] [V] [TRT] Skipping CaskFlattenConvolution: No valid tactics for /encoder/patch_embed/proj/Conv
[08/19/2024-13:59:54] [V] [TRT] >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 0x10383a0781d24dde
[08/19/2024-13:59:54] [V] [TRT] =============== Computing costs for {ForeignNode[(Unnamed Layer* 1145) [Constant]…/head/Add]}
[08/19/2024-13:59:54] [V] [TRT] *************** Autotuning format combination: Float(49152,128,16,1) -> Float(2470,95,1) ***************
[08/19/2024-13:59:54] [V] [TRT] --------------- Timing Runner: {ForeignNode[(Unnamed Layer* 1145) [Constant]…/head/Add]} (Myelin[0x80000023])
[08/19/2024-13:59:54] [V] [TRT] [MemUsageChange] Subgraph create: CPU +13, GPU +0, now: CPU 2023, GPU 1295 (MiB)
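Since the attached trtexec.log is about 1 MB, a small helper can pull out the last node the builder started working on before it stalled. This is only a sketch: `last_timed_node` is a hypothetical function name, and it assumes the verbose log marks each node with "Computing costs for" and "Timing Runner:" lines, as in the excerpt above.

```shell
# Hypothetical helper: print the last "Computing costs for" / "Timing Runner:"
# line from a verbose trtexec log, i.e. the node being tuned when the log ends.
last_timed_node() {
    # $1: path to a log captured from `trtexec --verbose`
    grep -E 'Computing costs for|Timing Runner:' "$1" | tail -n 1
}
```

Running `last_timed_node trtexec.log` on the log above should print the Timing Runner line for the ForeignNode handled by Myelin, which is the last node mentioned before the build stops making progress.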