Converting a TensorFlow UFF model to a TensorRT engine on the Drive PX2 platform using TensorRT 4.0 in the latest PX2 SDK

Hi, I'm trying to port a network from the x86 platform to the Drive PX2 platform.
My net is implemented in TensorFlow. When I tried to convert the TensorFlow model to a UFF model, I got warnings indicating unsupported layers in my net.
They are summarized in the list below:

Warning: No conversion function registered for layer: Merge yet.
Warning: No conversion function registered for layer: Switch yet.
Warning: No conversion function registered for layer: Equal yet.
Warning: No conversion function registered for layer: RefSwitch yet.
Warning: No conversion function registered for layer: AssignSub yet.

I want to confirm the following:

  1. Does the latest Drive PX2 SDK update from October 3 support TensorRT 4.0, so that I can use plugins for the unsupported layers of my TensorFlow model?
  2. Does a UFF model converted with TensorRT 4.0 on the x86 platform also work on the Drive PX2 platform?

Update:
By removing the TensorFlow operations that TensorRT does not support, I have resolved the warnings,
and I can now convert my TensorFlow model to a UFF model successfully.
The log looks like this (a sketch of the conversion call follows the log):

zyf@zyf-HP-Z4-G4-Workstation:~/tf_to_uff$ source activate tf
(tf) zyf@zyf-HP-Z4-G4-Workstation:~/tf_to_uff$ python '/home/zyf/tf_to_uff/tf_to_uff.py' 
2018-10-19 18:04:01.418676: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2018-10-19 18:04:02.479061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties: 
name: TITAN V major: 7 minor: 0 memoryClockRate(GHz): 1.455
pciBusID: 0000:15:00.0
totalMemory: 11.78GiB freeMemory: 11.00GiB
2018-10-19 18:04:02.479491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-10-19 18:04:04.834618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-19 18:04:04.835263: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0 
2018-10-19 18:04:04.835380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N 
2018-10-19 18:04:04.836994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10580 MB memory) -> physical GPU (device: 0, name: TITAN V, pci bus id: 0000:15:00.0, compute capability: 7.0)
Automatically deduced output nodes: lanenet_loss/Softmax, lanenet_loss/instance_seg
Using output node lanenet_loss/Softmax
Using output node lanenet_loss/instance_seg
Converting to UFF graph
No. nodes: 189
UFF Output written to model/px2HRYT_lanenet_pb.uff
UFF Text Output written to model/px2HRYT_lanenet_pb.uff.pbtxt
[HRYT_lanenet] Successfully transfer to UFF model
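For reference, the conversion step in my tf_to_uff.py is roughly equivalent to the following sketch using the uff Python package. The frozen-graph path is a placeholder, and the output nodes were actually deduced automatically as shown in the log; passing them explicitly here is just for illustration.

import uff

# Convert a frozen TensorFlow graph (.pb) to UFF; text=True also writes a
# human-readable .uff.pbtxt next to the .uff file, as in the log above.
uff.from_tensorflow_frozen_model(
    frozen_file='model/px2HRYT_lanenet_pb.pb',   # placeholder path to the frozen graph
    output_nodes=['lanenet_loss/Softmax', 'lanenet_loss/instance_seg'],
    output_filename='model/px2HRYT_lanenet_pb.uff',
    text=True)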

However, when I tried the converted UFF model with the trtexec tool provided in the TensorRT 4.0 samples, something went wrong.
As you can see above, the outputs of my network are named lanenet_loss/Softmax and lanenet_loss/instance_seg.
When I pass either of them as the --output parameter of trtexec, I get errors like:

zyf@zyf-HP-Z4-G4-Workstation:~/TensorRT-4.0.1.6/bin$ ./trtexec --uff='/home/zyf/tf_to_uff/model/px2HRYT_lanenet_pb.uff' --output=lanenet_loss/Softmax, --engine=px2lanenet
uff: /home/zyf/tf_to_uff/model/px2HRYT_lanenet_pb.uff
output: lanenet_loss/Softmax,
engine: px2lanenet
lanenet_loss/inference/encode/conv1_1/conv/Conv2D: kernel weights has count 1728 but 147456 was expected
UFFParser: Parser error: lanenet_loss/inference/encode/conv1_1/bn/FusedBatchNorm: The input to the Scale Layer is required to have a minimum of 3 dimensions.
Engine could not be created
Engine could not be created

zyf@zyf-HP-Z4-G4-Workstation:~/TensorRT-4.0.1.6/bin$ ./trtexec --uff='/home/zyf/tf_to_uff/model/px2HRYT_lanenet_pb.uff' --output=lanenet_loss/instance_seg --engine=px2lanenet
uff: /home/zyf/tf_to_uff/model/px2HRYT_lanenet_pb.uff
output: lanenet_loss/instance_seg
engine: px2lanenet
lanenet_loss/inference/encode/conv1_1/conv/Conv2D: kernel weights has count 1728 but 147456 was expected
UFFParser: Parser error: lanenet_loss/inference/encode/conv1_1/bn/FusedBatchNorm: The input to the Scale Layer is required to have a minimum of 3 dimensions.
Engine could not be created
Engine could not be created

My environment:
OS: Ubuntu 16.04
CUDA 9.0, cuDNN 7.0
TensorFlow 1.10.1
Python 3.5
TensorRT 4.0

Hi, I have solved my problems, and I'm posting my solution here.

First, the warnings mentioned above:

Warning: No conversion function registered for layer: Merge yet.
Warning: No conversion function registered for layer: Switch yet.
Warning: No conversion function registered for layer: Equal yet.
Warning: No conversion function registered for layer: RefSwitch yet.
Warning: No conversion function registered for layer: AssignSub yet.

were caused by the TensorFlow functions tf.cond() and tf.equal().
tf.equal() caused "Warning: No conversion function registered for layer: Equal yet."
tf.cond() caused all the others.
Since tf.cond() and tf.equal() are not essential for my net, I removed them.
tf.cond() was used to switch batch normalization between training and test mode; because it is not supported by TensorRT 4.0, I had to give it up. For now I have removed the batch normalization layers, until I find a way to use batch normalization without tf.cond(); see the sketch below.
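For reference, here is a minimal sketch of one way to write batch normalization without tf.cond(): pass a plain Python bool as the training flag instead of a boolean tensor, so no Switch/Merge nodes are created in the graph. The layer names and shapes here are illustrative, not the actual ones from my net.

import tensorflow as tf

def conv_bn_relu(x, filters, is_training=False):
    # is_training is a plain Python bool, not a placeholder, so
    # tf.layers.batch_normalization builds only one branch statically
    # and does not emit Switch/Merge/cond nodes into the graph.
    x = tf.layers.conv2d(x, filters, kernel_size=3, padding='same', use_bias=False)
    x = tf.layers.batch_normalization(x, training=is_training, fused=True)
    return tf.nn.relu(x)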

Second, the two errors below:
lanenet_loss/inference/encode/conv1_1/conv/Conv2D: kernel weights has count 1728 but 147456 was expected
UFFParser: Parser error: lanenet_loss/inference/encode/conv1_1/bn/FusedBatchNorm: The input to the Scale Layer is required to have a minimum of 3 dimensions.

The parser error can be removed by removing batch normalization.
The other error, "lanenet_loss/inference/encode/conv1_1/conv/Conv2D: kernel weights has count 1728 but 147456 was expected",
was caused by an NHWC vs. NCHW mismatch.
My input shape is (1, 256, 512, 3) in NHWC format, and the corresponding weight shape is (3, 3, 3, 64).
You can see that 3x3x3x64 = 1728, so the error message indicates that something is wrong with the weight shape TensorRT expects.
Because 147456 = 3x3x256x64, the expected weight shape was (3, 3, 256, 64). In other words, TensorRT mistakenly treated 256 as the channel dimension C; that is, it interpreted my input shape (1, 256, 512, 3) as NCHW.
So the solution is to pass the "--uffInput" parameter explicitly when using trtexec, even though "--uffInput" is optional. I would suggest making "--uffInput" a required parameter of trtexec, because TensorFlow models usually use the NHWC format.
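For example, the invocation could look roughly like the sketch below; the input node name "input_tensor" is a placeholder for the actual input node of your graph, and this assumes --uffInput takes the dimensions in C,H,W order:

./trtexec --uff='/home/zyf/tf_to_uff/model/px2HRYT_lanenet_pb.uff' \
          --uffInput=input_tensor,3,256,512 \
          --output=lanenet_loss/instance_seg \
          --engine=px2lanenet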