Breakage in TensorRT 5.1.2

I have a TensorFlow model derived from VGG16 which worked fine when converted with TensorRT 5.0.2 (tensorrt:19.02-py3, served with tensorrtserver:19.02-py3).

TensorRT 5.1.2 is making my life miserable (tensorrt:19.03-py3, served with tensorrtserver:19.03-py3).

In 5.0.2 I specified the parser input as follows:

parser.register_input(tname, (3, 224, 224), trt.UffInputOrder.NHWC)
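
For context, the surrounding plan-creation code is roughly the following (a sketch, not the exact script; the file and node names are placeholders):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
tname = "input"          # placeholder input node name
oname = "prob/softmax"   # placeholder output node name

with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.UffParser() as parser:
    parser.register_input(tname, (3, 224, 224), trt.UffInputOrder.NHWC)
    parser.register_output(oname)
    parser.parse("model.uff", network)   # UFF converted from the SavedModel
    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 30
    engine = builder.build_cuda_engine(network)
    with open("model.plan", "wb") as f:  # the plan served by TRTIS
        f.write(engine.serialize())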

In 5.1.2 I get the following error when creating the plan:

[TensorRT] ERROR: import/conv1/convolution: kernel weights has count 32670 but 2439360 was expected
[TensorRT] ERROR: UffParser: Parser error: import/conv1/BiasAdd: The input to the Scale Layer is required to have a minimum of 3 dimensions.
[TensorRT] ERROR: Network must have at least one output
Traceback (most recent call last):

I can get it to compile if I do either:

  • parser.register_input(tname, (224, 224, 3), trt.UffInputOrder.NHWC)
  • parser.register_input(tname, (3, 224, 224), trt.UffInputOrder.NCHW)

But then my softmax outputs come out as garbage, regardless of whether I provide input images in HWC or CHW layout.

Other TensorFlow architectures that I use are working fine with 5.1.2.

Any advice on debugging this problem? Until I fix this, I can’t use the 1.0.0 TRT Inference Server because it is not backwards compatible with 5.0 TRT plan files.

Hello,

To help us debug, can you please share a small repro that contains the model and the plan-creation code that exhibits the errors you are seeing?

Sure. I can get a repro case with the model and my conversion code together on Monday.

What is the easiest way to share the savedmodel.pb with you?

You can DM me and share a Google Drive or Dropbox link, or if you'd like to attach it to the post, please reference https://devtalk.nvidia.com/default/topic/1043356/tensorrt/attaching-files-to-forum-topics-posts/

I just sent a repro case through DM.

Thanks!

Thanks for the repro. We will triage and keep you updated.

Hello,

Per engineering,

With 5.0.2, there was a UFF bug where input order was completely ignored, and the inputs/outputs were always treated as CHW. So even though parsing would often work, inference results were not always correct for HWC cases.

In 5.1.2+, the input order is no longer ignored, so CHW fails because the model is actually HWC. The correct usage as of 5.1 is to specify the same input order and shape as the original TensorFlow model, and to pass input data in that format as well.

After changing your input shape to

INPUT_SHAPE = (224, 224, 3)

we were able to parse successfully with 5.1.2+.
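
In terms of your snippet, that corresponds to:

parser.register_input(tname, (224, 224, 3), trt.UffInputOrder.NHWC)

with the input data fed to the engine in HWC order as well.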

Howdy,

Using (224, 224, 3) does compile a plan with TRT 5.1.2, but the softmax outputs are completely bogus.

Specifically, if I shove in all-ones as input to the model I sent over…

TensorFlow served with TRTIS 19.03:
  input:  ones(224, 224, 3)
  output: [0.1936914473772049, 0.07535892724990845, 0.09353204816579819, 0.05095922574400902, 0.5572561621665955, 0.0292021706700325]

TRT 5.0.2 built with (3, 224, 224), trt.UffInputOrder.NHWC, served with TRTIS 19.02:
  input:  ones(3, 224, 224)
  output: [0.1936902105808258, 0.07535859197378159, 0.0935320109128952, 0.0509592704474926, 0.5572577118873596, 0.02920212782919407]

TRT 5.1.2 built with (224, 224, 3), trt.UffInputOrder.NHWC, served with TRTIS 19.03:
  input:  ones(224, 224, 3)
  output: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

So:

  • TRT 5.0.2 gives answers consistent with TF
  • TRT 5.1.2 can generate a plan, but the output is broken

At this point I’ve tried every combination of shape, UffInputOrder, and actual input data shape against TRT 5.1.2. All lead to sadness. :(

Sadly this is still broken on today’s 19.04 release of TRT and TRTIS.

I am also getting the same issue. Please suggest a solution for TensorRT 5.1.2.

We did find a workaround, but it has been quite a while so the details are fuzzy.

I believe the root cause was that TensorRT used a different axis than TensorFlow when doing the final softmax calculation, so the output was nonsense (probably a bug in TensorRT's HWC vs. CHW propagation).
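
As a toy illustration of why the wrong axis yields exactly that all-ones vector (this is my guess at the mechanism, not confirmed TensorRT internals), softmax taken over an axis of size 1 is identically 1.0:

import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Six class scores laid out as (C, H, W) = (6, 1, 1).
scores = np.random.randn(6, 1, 1)
print(softmax(scores, axis=0).ravel())  # proper probabilities, sum to 1
print(softmax(scores, axis=2).ravel())  # size-1 axis: [1. 1. 1. 1. 1. 1.]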

The workaround was rejiggering the graph so that conv2d and its activation are separate layers when the activation is softmax.
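
Roughly, the change looked like this (a from-memory TF 1.x sketch; the layer names and sizes are placeholders, not our actual model):

import tensorflow as tf

def classifier_head(x):
    # Before (tripped the TensorRT softmax-axis bug):
    #   return tf.layers.conv2d(x, filters=6, kernel_size=1,
    #                           activation=tf.nn.softmax)
    # After: keep the conv linear and make softmax its own graph node.
    logits = tf.layers.conv2d(x, filters=6, kernel_size=1, activation=None)
    return tf.nn.softmax(logits)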

Newer versions of TensorRT have come out since. The TensorRT bug may be fixed in those.

That’s all the help I can give on the topic.