Description
I was trying to convert the DELG model provided in the tensorflow-models git repo to TensorRT. While the conversion process itself succeeded, the actual inference failed with error messages like:
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 591, in call
outputs = execute.execute(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: dnn PoolForward launch failed
[[node while/cond/resnet_v1_101/pool1/MaxPool (defined at /workspace/image-inference-pipeline/src/image_inference_pipeline/task_adaptor/housefront_descriptor_extraction/delg/__init__.py:25) ]]
[[PartitionedCall/while/cond/resnet_v1_101/TRTEngineOp_0_16]] [Op:__inference_signature_wrapper_5707]
Function call stack:
signature_wrapper -> while/cond/resnet_v1_101/TRTEngineOp_0_16_native_segment
I have tried using both the ‘convert’ option of the saved_model_cli tool installed with TensorFlow 2.5 and the recipe in this tutorial to do the conversion. The results are basically the same.
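For reference, the Python variant of the conversion looks roughly like this (a minimal sketch; the SavedModel paths and precision mode are placeholders, not the exact values from my pipeline):

    # TF-TRT conversion via the TF 2.x API (the saved_model_cli equivalent is
    # roughly: saved_model_cli convert --dir delg_saved_model
    #          --output_dir delg_saved_model_trt --tag_set serve tensorrt)
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP32)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir="delg_saved_model",   # placeholder path
        conversion_params=params,
    )
    converter.convert()                      # completes without errors
    converter.save("delg_saved_model_trt")   # inference on this output then fails as above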
The DELG model is essentially a ResNet-101 trunk plus a global-pooling layer. One special property, however, is that its input tensor has no batch-size dimension: it expects a tensor of shape [H, W, 3]. I was wondering whether the lack of a batch-size dimension is the cause of the problem.
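The missing batch dimension shows up when inspecting the serving signature, roughly as follows (a sketch; the model path and the "serving_default" signature key are assumptions about the standard export):

    import tensorflow as tf

    # Hypothetical path to the unconverted DELG SavedModel.
    model = tf.saved_model.load("delg_saved_model")

    # "serving_default" is an assumption; the exported key may differ.
    fn = model.signatures["serving_default"]
    print(fn.structured_input_signature)
    # For DELG this reports a rank-3 input TensorSpec ([H, W, 3]),
    # i.e. height, width, channels, with no leading batch dimension.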
Does anyone know whether TensorRT is able to handle models whose input has no batch-size dimension? If it is, what special steps should I take to make this work?
Environment
TensorRT Version: 8.0.1.6
GPU Type: RTX Titan
Nvidia Driver Version: 515.65.01
CUDA Version: 11.7
CUDNN Version: 8.2.2
Operating System + Version: Linux
Python Version (if applicable): 3.8
TensorFlow Version (if applicable): 2.5
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Container nvcr.io/nvidia/tensorflow:21.08-tf2-py3