TensorRT engine cannot be built due to workspace size even if it's set higher

Description

Hi,
I’ve recently been having trouble building a TRT engine for a YOLOv3 detector model. The original model was trained in TensorFlow (2.3), converted to ONNX (tf2onnx, most recent version, 1.8.3), and then I convert the ONNX model to TensorRT.

The exact error is the following:

[TensorRT] ERROR: …/builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)
[TensorRT] VERBOSE: Builder timing cache: created 74 entries, 26 hit(s)
[TensorRT] ERROR: …/builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)

The conversion code from ONNX to TRT is basically what’s done here:

I set a workspace of 1 << 30, which should be more than enough. I tried setting it differently (1 << 34, 2 << 23, etc.), but it didn’t help…
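(For reference, a quick sketch of what these shift expressions work out to in bytes; note that 2 << 23 is actually far smaller than 1 << 30, so it is not a higher setting at all:)

```python
# Byte values of the workspace sizes tried above.
for expr, nbytes in [("1 << 30", 1 << 30), ("1 << 34", 1 << 34), ("2 << 23", 2 << 23)]:
    print(f"{expr} = {nbytes:>11d} bytes = {nbytes / 2**20:>5.0f} MiB")
# 1 << 30 is 1 GiB and 1 << 34 is 16 GiB, but 2 << 23 is only 16 MiB,
# i.e. a much *smaller* workspace than the original 1 << 30.
```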
I’m attaching the verbose output of the conversion: error_trtexec_verbose.txt (498.1 KB).
I can’t share the relevant onnx model.
Why do I keep getting an error related to the workspace size, even when I set it higher? How can this be solved?

Environment

TensorRT Version : 7.1.2
CUDA Version : 11.0
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) : 3.6
TensorFlow Version (if applicable) : The model was trained on TF 2.3, converted to ONNX, and then converted to a TensorRT engine.
Device for TRT engine builder: Jetson AGX Xavier

Hi @weissrael,

Could you please confirm whether you are using the same Python script (from GitHub) as in the description?
The logs do not exactly match: they show no FP16 at all, but the script on GitHub has FP16 enabled.
You might not be setting the workspace correctly. For example, a common mistake is to call build_engine(network, config) but set the workspace with builder.max_workspace_size instead of on the config.
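A minimal sketch of the correct pattern (TensorRT 7.x Python API; the helper name and its arguments are my own): since build_engine(network, config) reads the workspace from the IBuilderConfig, the size must be set on the config, not on the builder.

```python
def build_engine(onnx_path, workspace_bytes=1 << 30):
    """Sketch: build a TensorRT engine from an ONNX file, setting the
    workspace on the IBuilderConfig (not on the builder)."""
    import tensorrt as trt  # requires a TensorRT install (e.g. via JetPack)

    logger = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    # The crucial line: builder.max_workspace_size is ignored by
    # build_engine(network, config); the config value is what counts.
    config.max_workspace_size = workspace_bytes
    return builder.build_engine(network, config)
```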

From the log, all layers report that the available scratch is 0, and every TopK tactic wants scratch space, so the workspace needs to be increased.

2021-03-14T09:10:21.5757269Z [TensorRT] VERBOSE: --------------- Timing Runner: StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax (TopK)
2021-03-14T09:10:21.5758730Z [TensorRT] VERBOSE: Tactic: 0 skipped. Scratch requested: 147840, available: 0
2021-03-14T09:10:21.5760298Z [TensorRT] VERBOSE: Tactic: 1 skipped. Scratch requested: 147840, available: 0
2021-03-14T09:10:21.5761899Z [TensorRT] VERBOSE: Tactic: 3 skipped. Scratch requested: 147840, available: 0
2021-03-14T09:10:21.5763332Z [TensorRT] VERBOSE: Tactic: 2 skipped. Scratch requested: 147840, available: 0
2021-03-14T09:10:21.5765403Z [TensorRT] VERBOSE: Fastest Tactic: -3360065831133338131 Time: 3.40282e+38
2021-03-14T09:10:21.5766987Z [TensorRT] ERROR: Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
2021-03-14T09:10:21.5769973Z [TensorRT] ERROR: ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)
2021-03-14T09:10:21.5771944Z [TensorRT] VERBOSE: Builder timing cache: created 74 entries, 26 hit(s)
2021-03-14T09:10:21.5775088Z [TensorRT] ERROR: ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node StatefulPartitionedCall/functional_3/tiny_yolov3/tf_op_layer_ArgMax/ArgMax.)

Thank you.

Hi @spolisetty
I fixed the workspace setting so that it is applied to the config instead of the builder:

config.max_workspace_size = 1 << 30

The attached logs describe several exports of TRT models with different precision/modes:

  1. a float32 model without DLA,
  2. a float16 model with DLA enabled.

The workspace-related error, together with the “DLA Node compilation Failed” warning, appears only for the float16 + DLA model.
I have no clue why this happens; I’m following this thread in the meantime:
Trtexec log problem and use DLA error on Jetson Xavier - #4 by disculus2012
But if anyone can advise how to solve this for a float16 model with DLA enabled, that would help, preferably without needing to update JetPack etc. (I use JetPack 4.4 on a Jetson AGX.)
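For what it’s worth, a hedged sketch of the FP16 + DLA builder-config settings (TensorRT 7.x Python API; the function name is my own), including GPU fallback so that layers the DLA cannot run, such as the ArgMax/TopK above, fall back to the GPU instead of failing the build:

```python
def configure_fp16_dla(config, dla_core=0):
    """Sketch: set up an IBuilderConfig for FP16 with DLA offload.

    Assumes TensorRT 7.x; `config` comes from builder.create_builder_config().
    """
    import tensorrt as trt  # requires a TensorRT install (e.g. via JetPack)

    config.max_workspace_size = 1 << 30            # workspace goes on the config
    config.set_flag(trt.BuilderFlag.FP16)          # DLA runs FP16/INT8 only
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # unsupported layers -> GPU
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = dla_core
    return config
```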

Hi @weissrael,

You may need to try this on a future release of JetPack; we have some fixes in the latest TensorRT 7.2 version.
We also recommend sharing the model and the relevant scripts for better debugging.

Thank you.