Mismatch between TensorFlow + ONNX versus TensorRT

Description

I have a Keras/TensorFlow model, and its output does not match when I run it with TensorRT.

To be more specific, since I am working on Windows, I am using the following conversion process (with tf2onnx):
TensorFlow --> TF2ONNX --> ONNX Model --> Import to TensorRT

I have tested the ONNX output using ONNX Runtime and it matches the TensorFlow model. But when I import the ONNX model into the TensorRT C++ API, the output is no longer correct.
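
When comparing the two runtimes, it can help to define "matches" with an explicit tolerance rather than by inspecting values. The following is a minimal sketch, assuming both outputs have already been copied into host-side float buffers of equal length. The names `compareOutputs` and `CompareResult` and the default tolerances are hypothetical and not part of TensorRT or ONNX Runtime.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary of an element-wise comparison between two output buffers.
struct CompareResult {
    float maxAbsDiff = 0.0f;      // largest |ref[i] - cand[i]| seen
    std::size_t worstIndex = 0;   // index where that difference occurs
    bool withinTolerance = true;  // true if every element passes the check
};

// Hypothetical helper: compares a reference output (e.g. from ONNX Runtime)
// against a candidate output (e.g. copied back from TensorRT).
// An element passes if |ref - cand| <= atol + rtol * |ref|.
// The default tolerances are assumptions; tune them for your model.
CompareResult compareOutputs(const std::vector<float>& ref,
                             const std::vector<float>& cand,
                             float atol = 1e-4f, float rtol = 1e-3f) {
    assert(ref.size() == cand.size());
    CompareResult result;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float diff = std::fabs(ref[i] - cand[i]);
        if (diff > result.maxAbsDiff) {
            result.maxAbsDiff = diff;
            result.worstIndex = i;
        }
        // Written as a negation so that a NaN difference counts as a failure.
        if (!(diff <= atol + rtol * std::fabs(ref[i]))) {
            result.withinTolerance = false;
        }
    }
    return result;
}
```

Reporting the worst index alongside the maximum difference makes it easier to see whether the error is spread over the whole tensor or concentrated in one region, such as a border.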

I have been probing different parts of the network and I found the node where the two models start to mismatch.
Essentially I have the following:

Initial part of network -->
Conv2DTranspose --> Concatenate --> Conv2D --> ReLU --> Conv2D --> ReLU --> Add -->
Conv2DTranspose --> Concatenate --> Conv2D --> ReLU --> Conv2D --> ReLU --> Add -->
Conv2DTranspose --> Concatenate --> Conv2D --> ReLU --> Conv2D --> ReLU --> Add

Note that the block from Conv2DTranspose to Add repeats three times.

I noticed the following:
If I make the first Add node an output during the ONNX conversion, the TensorRT output matches the ONNX Runtime output. However, if I make the second Add node the output, the outputs no longer match. I believe this is because TensorRT sees the repeating layers and tries to apply an optimization. Unfortunately, in my case this attempted optimization seems to produce an incorrect output.
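
The same kind of probing can also be done on the TensorRT side after parsing, instead of re-exporting the ONNX model for each probe point. The following is a minimal sketch, assuming the tensor names in the parsed network correspond to the ONNX tensor names (this may not hold for every parser version). The helper name `markTensorAsOutput` is hypothetical. It is written as a template only so that it compiles without the TensorRT headers; in real use `NetworkT` would be `nvinfer1::INetworkDefinition`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper. The methods called here (getNbLayers, getLayer,
// getNbOutputs, getOutput, getName, markOutput) are the ones provided by
// nvinfer1::INetworkDefinition, ILayer and ITensor.
// Returns true if a tensor with the given name was found and marked.
template <typename NetworkT>
bool markTensorAsOutput(NetworkT& network, const std::string& tensorName) {
    for (int i = 0; i < network.getNbLayers(); ++i) {
        auto* layer = network.getLayer(i);
        assert(layer != nullptr);
        for (int j = 0; j < layer->getNbOutputs(); ++j) {
            auto* tensor = layer->getOutput(j);
            if (tensor != nullptr && tensorName == tensor->getName()) {
                network.markOutput(*tensor);  // expose it as an engine output
                return true;
            }
        }
    }
    return false;
}
```

Marking an extra output may itself change which optimizations TensorRT applies around that tensor, so a probed engine is not guaranteed to behave identically to the unprobed one.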

I have a few questions:

  1. Any idea what optimization TensorRT is trying to do and why it is failing?
  2. When I create ONNX models, I use Netron to view the layers and nodes. Is there some way to similarly view TensorRT models? Or at the very least, is there a way to print the model with all the layers and connections using either the command line tool or the C++ API, if I am importing an ONNX model? I would like to see what TensorRT does with the problematic set of layers.
  3. Since I am using Windows, it seems my only way to import TensorFlow models is to convert to ONNX using TF2ONNX and then import this ONNX model into TensorRT. In the end I will do inference on a Windows machine. If, for testing purposes, I were to use a Linux machine to run TF-TRT, could I transfer the output of this tool to my Windows machine? In my case I cannot have the same target GPU on the Linux machine.
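
Regarding question 2, the network as parsed from ONNX (before TensorRT optimizes it) can at least be listed through the C++ API. The following is a minimal sketch; the helper name `describeNetwork` and the output format are hypothetical. It is written as a template only so that it compiles without the TensorRT headers; in real use `NetworkT` would be `nvinfer1::INetworkDefinition`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns one line per layer, followed by the names
// of its input and output tensors. The methods called here are the ones
// provided by nvinfer1::INetworkDefinition, ILayer and ITensor.
template <typename NetworkT>
std::string describeNetwork(const NetworkT& network) {
    std::ostringstream out;
    for (int i = 0; i < network.getNbLayers(); ++i) {
        const auto* layer = network.getLayer(i);
        assert(layer != nullptr);
        out << i << ": " << layer->getName() << "\n";
        for (int j = 0; j < layer->getNbInputs(); ++j) {
            const auto* in = layer->getInput(j);
            // An input slot can be empty, so guard against null.
            out << "    in  " << (in ? in->getName() : "<none>") << "\n";
        }
        for (int j = 0; j < layer->getNbOutputs(); ++j) {
            const auto* o = layer->getOutput(j);
            out << "    out " << (o ? o->getName() : "<none>") << "\n";
        }
    }
    return out.str();
}
```

This only shows the layers and connections that the parser created. It does not show how the builder later fuses or rewrites them in the engine, so it helps confirm that the import is correct rather than revealing the optimization itself.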

Environment

TensorRT Version: 7
CUDA Version: 10.2
CUDNN Version: 7.6
TensorRT API: C++
Operating System + Version: Windows 10 64-bit

Hi @solarflarefx,

There is no tool to visualize a TRT model, at least none that I am aware of.
However, could you share your model and script so that we can help you better?
Thanks!