UFFParser: Validator error: Identity: Unsupported operation Identity

I have a trained network (the NVIDIA PilotNet CNN architecture) that I developed in PyTorch and exported to ONNX. I want to deploy this model on an NVIDIA Drive PX2, and because its DriveOS only supports TensorRT 4, I had to convert the ONNX model to a TensorFlow frozen graph and then to UFF format so that I can build the optimized TensorRT plan with the tensorRT_optimization tool. However, the frozen graph contains Identity nodes that have been added to bypass the Dropout layers during inference (according to this post), and the TensorRT 4 UFF parser doesn’t support Identity operations.

I know that removing the Dropout layers and retraining might be a solution. But my question is: Is there a way to remove these Identity nodes from the TensorFlow frozen graph without removing the Dropout layers from the network architecture?

Here is my PyTorch CNN architecture:

class NVIDIACNN(nn.Module):
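    def __init__(self, mean, std, keep_prob=0.6):
        super().__init__()
        # Note: nn.Dropout takes the probability of zeroing an element, so despite
        # its name, keep_prob is used as the drop probability below.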
        self.conv1 = nn.Sequential(
            # If a nn.Conv2d layer is directly followed by a nn.BatchNorm2d layer, then the bias in the convolution is not needed.
            # Bias is not needed because in the first step BatchNorm subtracts the mean, which effectively cancels out the effect of bias.
            nn.Conv2d(in_channels=3, out_channels=24, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(num_features=24),
            nn.Dropout(keep_prob)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(in_channels=24, out_channels=36, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(num_features=36),
            nn.Dropout(keep_prob)
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(in_channels=36, out_channels=48, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(num_features=48),
            nn.Dropout(keep_prob)
        )
        self.conv4 = nn.Sequential(
            nn.Conv2d(in_channels=48, out_channels=64, kernel_size=3),
            nn.ReLU(),
            nn.BatchNorm2d(num_features=64),
            nn.Dropout(keep_prob)
        )
        self.conv5 = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3),
            nn.ReLU(),
            nn.BatchNorm2d(num_features=64),
            nn.Dropout(keep_prob)
        )
        self.fc1 = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=64 * 18, out_features=200),
            nn.ReLU(),
            nn.BatchNorm1d(num_features=200),
            nn.Dropout(0.25)
        )
        self.fc2 = nn.Sequential(
            nn.Linear(in_features=200, out_features=50),
            nn.ReLU(),
            nn.BatchNorm1d(num_features=50),
            nn.Dropout(0.25)
        )
        self.fc3 = nn.Sequential(
            nn.Linear(in_features=50, out_features=10),
            nn.ReLU(),
            nn.BatchNorm1d(num_features=10),
            nn.Dropout(0.25)
        )
        self.fc4 = nn.Linear(in_features=10, out_features=1)

        self.mean = mean
        self.std = std

    @torch.jit.script
    def fused_normalize(x, mean, std):
        return (x - mean) / std

    def forward(self, x):
        # Normalize the input (standardizing it into [-1, 1]).
        x = self.fused_normalize(x, self.mean, self.std)
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.fc4(x)
        return x
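
For anyone checking the `in_features=64 * 18` value in fc1, the flattened size can be reproduced with plain arithmetic. This is only a sketch: it assumes the usual PilotNet input resolution of 66x200 (height x width), which is not stated above, and the helper names `conv_out` and `pilotnet_flat_features` are made up for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension after an unpadded/padded convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pilotnet_flat_features(height=66, width=200, out_channels=64):
    """Flattened feature count after conv1..conv5 (input size is an assumption)."""
    # (kernel_size, stride) for conv1..conv5, matching the class definition.
    layers = [(5, 2), (5, 2), (5, 2), (3, 1), (3, 1)]
    for kernel, stride in layers:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
    return out_channels * height * width


flat = pilotnet_flat_features()
print(flat)
```

With that assumed input size the feature map after conv5 is 1x18 with 64 channels, which matches the 64 * 18 used in fc1. A different input resolution would need a different `in_features`.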

How I converted the ONNX model to a TensorFlow frozen graph:

import onnx
import tensorflow as tf
from onnx import version_converter
from onnx_tf.backend import prepare
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

onnx_model = onnx.load("outputs/nvidiacnn_ep83_converted.onnx")
onnx.checker.check_model(onnx_model)

model = prepare(onnx_model)
model.export_graph('outputs/frozen_model')

model = tf.saved_model.load('outputs/frozen_model')
# Convert TensorFlow model to ConcreteFunction.
full_model = tf.function(lambda x: model(input=x))
input_spec = model.signatures['serving_default'].inputs[0]
full_model = full_model.get_concrete_function(
    tf.TensorSpec(input_spec.shape.as_list(), input_spec.dtype.name))
# Get frozen ConcreteFunction.
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

# Save frozen graph to disk
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir='outputs',
                  name="nvidiacnn_ep83.pb",
                  as_text=False)

Converting the TensorFlow frozen graph to UFF:

import uff

uff.from_tensorflow_frozen_model(frozen_file='outputs/nvidiacnn_ep83.pb', output_filename='outputs/nvidiacnn_ep83.uff')

EDIT: I tried removing the Dropout layers, but the Identity nodes are still there.
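
To make the question concrete, here is a rough sketch of what "removing Identity nodes" would mean: drop each Identity node and rewire its consumers to the tensor it forwards. It works on plain Python dicts standing in for GraphDef nodes (`name`, `op`, `input`), not on a real TensorFlow graph, and the function name `strip_identity_nodes` is hypothetical. It also ignores control dependencies (`^name`) and output indices (`name:0`) that real graphs can contain. One possible reason the nodes survive after dropping Dropout is that TF2 concrete functions typically wrap their outputs in an Identity op, but I have not confirmed that this is the cause here.

```python
def strip_identity_nodes(nodes, keep=()):
    """Return a copy of `nodes` with Identity nodes bypassed.

    nodes: list of dicts with keys 'name', 'op', 'input' (list of node names).
    keep:  names of Identity nodes that must be preserved (e.g. a named output).
    """
    # Map each removable Identity node to the node it simply forwards.
    forward = {
        n["name"]: n["input"][0]
        for n in nodes
        if n["op"] == "Identity" and n["name"] not in keep
    }

    def resolve(name):
        # Follow chains of Identity nodes until a real producer is reached.
        while name in forward:
            name = forward[name]
        return name

    stripped = []
    for n in nodes:
        if n["name"] in forward:
            continue  # drop the Identity node itself
        stripped.append({**n, "input": [resolve(i) for i in n["input"]]})
    return stripped


# Toy graph: conv -> Identity -> Identity -> matmul
toy_graph = [
    {"name": "input", "op": "Placeholder", "input": []},
    {"name": "conv", "op": "Conv2D", "input": ["input"]},
    {"name": "dropout", "op": "Identity", "input": ["conv"]},
    {"name": "dropout_1", "op": "Identity", "input": ["dropout"]},
    {"name": "fc", "op": "MatMul", "input": ["dropout_1"]},
]
cleaned = strip_identity_nodes(toy_graph)
print([n["name"] for n in cleaned])
```

On a real GraphDef, `tf.compat.v1.graph_util.remove_training_nodes` performs a similar pruning. I have not verified whether its output is accepted by the TensorRT 4 UFF parser.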

If this is a version mismatch between the UFFParser and the converted UFF file to be parsed, which UFF version is DriveWorks using?

Hi,
The UFF and Caffe parsers have been deprecated from TensorRT 7 onwards, so we request that you try the ONNX parser instead.
Please check the link below for details.

Thanks!

Hi,
Thanks for the reply. I know that most of the software for the NVIDIA Drive PX2 is now outdated. However, since we only own the PX2, there’s no option other than using the TensorRT version that’s built into DriveOS 5.0.10.3. I was able to convert an ONNX model (with opset v1, which TensorRT 4 supports) into a TRT engine using the TRT sample code. But the DriveWorks samples require TRT plans that are generated by the tensorRT_optimization tool. I’ve tried the TRT engine files that I converted from ONNX models, but the DriveWorks code complains that the TensorRT model has the wrong magic number and asks me to use the tensorRT_optimization tool.

Could you please let me know what I should do in order to use a custom-trained model for inference with DriveOS 5.0.10.3 (with DriveWorks SDK 1.2)?

EDIT [21/01/2023]: Also, the tensorRT_optimization tool in DriveWorks SDK 1.2 doesn’t have an option to take ONNX models as input. It only accepts UFF and Caffe models.