Description
Hi,
I’ve recently been having trouble building a TRT engine for a tiny YOLOv3 detector model. The original model was trained in TensorFlow (2.3), converted to ONNX (tf2onnx, most recent version 1.8.3), and then I convert the ONNX model to TensorRT.
The exact error is the following:
[TensorRT] ERROR: …/builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)
[TensorRT] VERBOSE: Builder timing cache: created 74 entries, 26 hit(s)
[TensorRT] ERROR: …/builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)
The ONNX-to-TRT conversion code is basically what’s done here:
I set a workspace of 1<<30, which should be more than enough. I also tried other values (1<<34, 2<<23, etc.), but it didn’t help…
I attach the verbose output of the conversion here: error_trtexec_verbose.txt (498.1 KB).
I can’t share the relevant onnx model.
Why do I keep getting an error that seems related to the workspace size, even after increasing it? How can this be solved?
Environment
TensorRT Version : 7.1.2
CUDA Version : 11.0
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) : 3.6
TensorFlow Version (if applicable) : 2.3 (the model was trained on TF 2.3, converted to ONNX, and then converted to a TensorRT engine)
Device for TRT engine builder: Jetson AGX Xavier