TensorRT Python API builder build_engine failure - Error Code 2: OutOfMemory (no further information)

Description

We are dealing with a new issue, and any support from you would be much appreciated.
We created two versions of the same model; the topology is identical (both are attached below):

  1. Model_big_sizes.onnx

  2. Model_small_sizes.onnx

The only difference between them is the input sizes.

With model #2 everything works well and a TRT engine file is created successfully.
With model #1, probably because of our model's operators and logic, for some nodes TRT cannot find enough device memory for any of its CUDA kernels while it is testing which of them will best optimize the node(s).
As a result, the TRT build_engine call reports several errors and warnings (please see some examples below).

Despite these errors and warnings, the engine build continues until this last reported error:
4: [optimizer.cpp::nvinfer1::builder::`anonymous-namespace’::LeafCNode::computeCosts::2031] Error Code 4: Internal Error (Could not find any implementation for node {ForeignNode[72…Transpose_1052 + Reshape_1059]} due to insufficient workspace. See verbose log for requested sizes.)
TRT model optimization exception ERROR - ‘NoneType’ object has no attribute ‘create_engine_inspector’
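
The final message is what Python reports when create_engine_inspector is called on None, and build_engine returns None when the build fails. A minimal guard is sketched below; the helper name require_engine is illustrative and not taken from our script:

```python
def require_engine(engine):
    """Return the engine unchanged, or raise if the build produced nothing."""
    if engine is None:
        # builder.build_engine(...) returns None when the build fails
        raise RuntimeError(
            "TensorRT build failed (build_engine returned None); "
            "check the builder log for the first reported error"
        )
    return engine
```

Wrapping the result, for example require_engine(builder.build_engine(network, config)), makes the script stop with the build failure itself rather than with an AttributeError later on.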

I tried to run it on several dGPUs, including a GeForce RTX 3090 with 24 GB of memory, and still got these problems.

My questions are:
• How can I know in advance, by analyzing the ONNX model, how much device memory will be required?
• Why is 24 GB of memory not enough?
• Do I understand correctly that the only root cause is limited memory capacity?
• Can I control how much memory TRT uses? I know there are several ways to control memory, such as max_workspace_size.
• Can memory pool settings (set_memory_pool_limit) help? How?
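
For reference, here is a sketch of how the workspace cap mentioned in the last two questions can be set from Python. TensorRT 8.4 introduces IBuilderConfig.set_memory_pool_limit and deprecates max_workspace_size, so the sketch tries the newer call first. The helper names gib and set_workspace_limit are made up for this example, and any size passed in is an arbitrary choice, not a recommendation:

```python
def gib(n):
    """Convert a size in GiB to bytes."""
    return int(n * (1 << 30))


def set_workspace_limit(config, limit_bytes):
    """Cap the builder workspace on a tensorrt.IBuilderConfig."""
    if hasattr(config, "set_memory_pool_limit"):
        # TensorRT 8.4 and later: memory pool API
        import tensorrt as trt  # imported here so gib() works without TensorRT
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, limit_bytes)
    else:
        # Older releases: single workspace attribute
        config.max_workspace_size = limit_bytes
```

A call such as set_workspace_limit(config, gib(4)) before build_engine would cap the scratch memory offered to each tactic at 4 GiB. Whether any value is enough for model #1 is exactly what we are asking.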

Environment

TensorRT Version : 8.4.0.6
GPU Type : Quadro RTX 3000
Nvidia Driver Version : R471.11
CUDA Version : 11.4
CUDNN Version : 8.1.1
Operating System + Version : Windows 10
Python Version (if applicable) : 3.6.8
TensorFlow Version (if applicable) : NA
PyTorch Version (if applicable) : 1.9
Baremetal or Container (if container which image + tag) : Baremetal

Relevant Files

MVSnetModel_small_sizes.onnx (440.8 KB)
Model_big_sizes.onnx (3.2 MB)

Steps To Reproduce

Run the builder's build_engine call on the attached models.
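
A minimal script for this step might look like the sketch below. It assumes the usual TensorRT Python ONNX workflow and is not our exact script; the function name build_engine_from_onnx is illustrative. The logger is set to VERBOSE because the error above points to the verbose log for the requested sizes:

```python
import os


def build_engine_from_onnx(onnx_path):
    """Parse an ONNX file and try to build a TensorRT engine from it."""
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(onnx_path)

    import tensorrt as trt  # imported lazily, only once the model file is found

    logger = trt.Logger(trt.Logger.VERBOSE)  # verbose output for the build log
    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # Collect every parser error so the failure is self-explanatory
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))

    config = builder.create_builder_config()
    # The result is None when the build fails
    return builder.build_engine(network, config)
```

Calling build_engine_from_onnx("Model_big_sizes.onnx") is expected to reproduce the failure on our side, while the small model builds without errors.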

Thank you.

Hi,

Could you please share the complete error logs and a script that reproduces the issue, so that we can try it on our end for better debugging? We will get back to you on your queries. We are able to build the TRT engine successfully using trtexec.

Thank you.