Description
We are dealing with a new issue, and any support from you would be much appreciated.
We created two versions of the same model with identical topology (attached below):
- Model_big_sizes.onnx (model #1)
- Model_small_sizes.onnx (model #2)
The only difference between them is the input sizes.
With model #2 everything works well and a TRT engine file is created successfully.
With model #1, probably because of our model's operators and logic, TRT cannot find enough device memory for any of the candidate CUDA kernels of some nodes while it tests which kernel best optimizes the node(s).
As a result, the TRT build_engine service reports errors and warnings (see some examples below).
Despite these errors and warnings, the engine build continues until this last reported error:
4: [optimizer.cpp::nvinfer1::builder::`anonymous-namespace’::LeafCNode::computeCosts::2031] Error Code 4: Internal Error (Could not find any implementation for node {ForeignNode[72…Transpose_1052 + Reshape_1059]} due to insufficient workspace. See verbose log for requested sizes.)
TRT model optimization exception ERROR - ‘NoneType’ object has no attribute ‘create_engine_inspector’
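The second message is most likely a follow-on failure rather than a separate problem: in the TensorRT Python API, build_engine returns None when the build fails, so a later call on that result raises this AttributeError. A minimal sketch of a guard that surfaces the build failure directly; the helper name require_engine is hypothetical, and the usage lines assume that builder, network and config already exist:

```python
def require_engine(engine, model_name="model"):
    """Return the engine unchanged, or fail loudly if the build returned None."""
    if engine is None:
        raise RuntimeError(
            "TensorRT build failed for {}; see the builder log above.".format(model_name)
        )
    return engine


# Usage (assumes `builder`, `network` and `config` were created earlier):
# engine = require_engine(builder.build_engine(network, config), "Model_big_sizes.onnx")
# inspector = engine.create_engine_inspector()
```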
I tried running it on several dGPUs, including a GeForce RTX 3090 with 24 GB of memory, and still hit these problems.
My questions are:
• How can I tell in advance, by analyzing the ONNX file, how much device memory will be required?
• Why is 24 GB of memory not enough?
• Do I understand correctly that the only root cause is limited memory capacity?
• Can I control how much memory TRT uses? I know there are several ways to control memory, such as max_workspace_size.
• Can memory pool settings (set_memory_pool_limit) help? How?
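For the last two questions, here is a minimal sketch of how the workspace limit can be set from Python. set_memory_pool_limit with MemoryPoolType.WORKSPACE is available from TensorRT 8.4, and max_workspace_size is the older attribute on the builder config. The helper name set_workspace_limit and the 8 GiB figure are illustrative assumptions, not recommended values:

```python
def set_workspace_limit(config, num_bytes):
    """Cap the builder's scratch (workspace) memory on a builder config."""
    if hasattr(config, "set_memory_pool_limit"):
        # TensorRT 8.4+: the workspace is one of several memory pools.
        import tensorrt as trt
        config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, num_bytes)
    else:
        # Older releases expose the same limit as a plain attribute.
        config.max_workspace_size = num_bytes
    return num_bytes


GIB = 1 << 30

# Usage (assumes `builder` is a trt.Builder):
# config = builder.create_builder_config()
# set_workspace_limit(config, 8 * GIB)
```

The limit is an upper bound on the scratch memory a single tactic may request, so an "insufficient workspace" error suggests trying a larger value rather than a smaller one. It does not reduce the amount a given tactic actually needs.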
Environment
TensorRT Version : 8.4.0.6
GPU Type : Quadro RTX 3000
Nvidia Driver Version : R471.11
CUDA Version : 11.4
CUDNN Version : 8.1.1
Operating System + Version : Windows 10
Python Version (if applicable) : 3.6.8
TensorFlow Version (if applicable) : NA
PyTorch Version (if applicable) : 1.9
Baremetal or Container (if container which image + tag) : Baremetal
Relevant Files
MVSnetModel_small_sizes.onnx (440.8 KB)
Model_big_sizes.onnx (3.2 MB)
Steps To Reproduce
Run the builder's build_engine service on the attached models.
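A minimal sketch of such a build, assuming the standard TensorRT 8.4 Python API. This is not our exact script: the function name build_engine_from_onnx and the default 1 GiB workspace value are placeholders, and only the model file names come from the attachments:

```python
import os


def build_engine_from_onnx(onnx_path, workspace_bytes=1 << 30):
    """Parse an ONNX file and build a TensorRT engine; returns None on failure."""
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(onnx_path)

    import tensorrt as trt  # imported late so the path check fails fast

    # VERBOSE, because the error says to see the verbose log for requested sizes.
    logger = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # Print every parser error before giving up.
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_bytes)
    return builder.build_engine(network, config)


# Usage:
# engine = build_engine_from_onnx("Model_big_sizes.onnx")
```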
Thank you.