I successfully converted a segmentation model from PyTorch to ONNX to TensorRT.
I then converted the same ONNX file to TensorFlow, but the TensorFlow model uses more than 10x the GPU memory of the PyTorch or TensorRT version. It essentially allocates 100% of the GPU memory during inference, even though the input is fairly small (1, 1, 512, 512).
Ubuntu 18.04; all creation, conversion, and inference ran on the same machine, inside a venv or containers.
• TensorFlow: 1.14 (1.13 showed the same issue)
• NVIDIA Driver: 418.87.00
• CUDA: 10.1.243
• cuDNN: 7.6.3
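One thing worth noting for anyone hitting this: TensorFlow 1.x by default reserves nearly all GPU memory up front for its allocator, regardless of model size, so "100% allocated" does not necessarily mean 10x is actually *used*. A minimal sketch of how the allocator can be constrained when creating the session (this is the standard `tf.ConfigProto` mechanism; it may not fix genuine over-use by the converted graph, but it separates greedy pre-allocation from real consumption):

```python
import tensorflow as tf

# tf.compat.v1 resolves to the 1.x API on TF >= 1.13 as well as on TF 2.x
tf1 = tf.compat.v1

config = tf1.ConfigProto()
# Grow GPU memory on demand instead of grabbing (almost) all of it at startup
config.gpu_options.allow_growth = True
# Alternatively, hard-cap the allocator at a fraction of total GPU memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.25

# Pass the config wherever the session is created, e.g.:
# sess = tf1.Session(config=config)
# (or hand it to the onnx-tf backend / estimator that builds the session)
```

If memory usage stays 10x higher even with `allow_growth` enabled, then the converted graph itself is the problem (e.g. the ONNX-to-TensorFlow conversion producing unfused or duplicated ops), not just the default allocation policy.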