Optimize/Prune the Graph to prevent UFF conversion failure

yaduvir.singh · June 3, 2020, 10:04pm

Description

I am getting a TensorRT conversion failure because some ops are not supported in TensorRT. So far I have concluded that I need to optimize/prune the original TensorFlow graph so that the frozen graph contains only ops that TensorRT supports and then converts cleanly to UFF.

To do this, I want to use TransformGraph to remove the training nodes, which I think are the nodes that the UFF conversion does not support. The options I used are below. Other than the input shape change and the "Identity" node, no other nodes were removed, e.g. 'Switch', 'Exit', 'Add'.

1. Which options should I choose in TransformGraph() to remove the 'Switch', 'Exit' and 'Add' nodes?
2. We use 'from tensorflow.python.tools import freeze_graph' to freeze our graph. Is there any direct way to generate an optimized graph from the SavedModel format? I see that OpenVINO (Intel GPU toolkit) has an optimize_for_inferencing() method; is there a similar function in NVIDIA's tools?
3. How do I remove training nodes from a frozen graph?
4. How do I remove the prefix "layer6" from all nodes (e.g. "layer6" from "layer6/Conv2D" and "layer6/final_dense")?
5. Is there any TF API to prune the graph directly?
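
On the question about stripping a scope prefix such as "layer6/": a frozen GraphDef stores each node's name and also the names of its inputs, so a rename has to touch both. The sketch below shows only that bookkeeping, on a plain-Python stand-in. The `Node` class and the `strip_prefix` helper are hypothetical, not TensorFlow APIs; with a real graph the same loop would run over `graph_def.node`, rewriting `node.name` and each entry of `node.input`. Renaming alone would not change which ops the UFF converter supports.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a NodeDef: a name, an op type, and input names."""
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)


def _rename(ref: str, prefix: str) -> str:
    # Input references may carry a leading '^' (control dependency)
    # and a trailing ':N' (output index); keep both intact.
    control = "^" if ref.startswith("^") else ""
    body = ref[len(control):]
    if body.startswith(prefix):
        body = body[len(prefix):]
    return control + body


def strip_prefix(nodes: List[Node], prefix: str) -> List[Node]:
    """Return copies of the nodes with `prefix` removed from names and inputs."""
    renamed = [
        Node(_rename(n.name, prefix), n.op, [_rename(i, prefix) for i in n.inputs])
        for n in nodes
    ]
    names = [n.name for n in renamed]
    if len(names) != len(set(names)):
        raise ValueError("stripping the prefix would create duplicate node names")
    return renamed


# Toy graph using node names in the style of this thread.
graph = [
    Node("layer5/conv2d/Relu", "Relu"),
    Node("layer6/dense/MatMul", "MatMul", ["layer5/conv2d/Relu", "layer6/dense/kernel:0"]),
    Node("layer6/dense/kernel", "Const"),
    Node("layer6/logits/BiasAdd", "BiasAdd", ["layer6/dense/MatMul", "^layer6/dense/kernel"]),
]
for node in strip_prefix(graph, "layer6/"):
    print(node.name, node.inputs)
```

The duplicate-name check is there because removing a prefix can make two previously distinct nodes collide, which a GraphDef does not allow.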

[Image: frozen_graph.pb, 2976×7078, 524 KB]

import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

def optimize_graph(model_dir, graph_filename, transforms, output_node):
    with tf.compat.v1.Session() as sess:
        # shape=[1, ?, ?, 3] -> shape=[1, 128, 128, 3]
        # 'input_image_tensor' is the placeholder name in the converted model
        inputs = tf.compat.v1.placeholder(tf.uint8, shape=[1, 128, 128, 3], name='input_image_tensor')
        with tf.io.gfile.GFile(graph_filename, 'rb') as f:
            graph_def = tf.compat.v1.GraphDef()
            # Read while the file is still open
            graph_def.ParseFromString(f.read())

        # 'image_tensor:0' is the placeholder name of the model before conversion
        tf.graph_util.import_graph_def(graph_def, input_map={'image_tensor:0': inputs}, name='')
        # Note: remove_training_nodes returns a new GraphDef; the result is discarded here
        tf.graph_util.remove_training_nodes(graph_def, protected_nodes=None)
        print([n for n in tf.compat.v1.get_default_graph().as_graph_def().node if n.name == 'input_image_tensor'])

        # TransformGraph expects lists of input and output node names
        optimized_graph_def = TransformGraph(
            tf.compat.v1.get_default_graph().as_graph_def(),
            ['input_image_tensor'],
            [output_node],
            transforms)
        # describe_graph is a helper that is not shown in this post
        describe_graph(optimized_graph_def)
        tf.io.write_graph(optimized_graph_def, model_dir, 'optimized_model.pb', as_text=False)

transforms = [
    'remove_nodes(op=Identity, op=CheckNumerics, op=TensorArray)',
    'merge_duplicate_nodes',
    'strip_unused_nodes(type=uint8, shape="1,128,128,3")',
    'fold_constants(ignore_errors=true)',
    'fold_batch_norms',
    'fold_old_batch_norms',
    'obfuscate_names',
    'quantize_weights',
    'quantize_nodes',
    'strip_unused_nodes',
    'sort_by_execution_order'
]

optimize_graph("./", "freeze_graph.pb", transforms, "layer6/final_dense")

==============================================================================
import os
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.tools import freeze_graph

def freeze_model(saved_model_dir, output_node_names, output_filename):
    # Write the frozen graph next to the SavedModel
    output_graph_filename = os.path.join(saved_model_dir, output_filename)
    initializer_nodes = ''
    freeze_graph.freeze_graph(
        input_saved_model_dir=saved_model_dir,
        output_graph=output_graph_filename,
        saved_model_tags=tag_constants.SERVING,
        output_node_names=output_node_names,
        initializer_nodes=initializer_nodes,
        input_graph=None,
        input_saver=False,
        input_binary=False,
        input_checkpoint=None,
        restore_op_name=None,
        filename_tensor_name=None,
        clear_devices=False,
        input_meta_graph=False,
    )
    print('graph freezed!')

Environment

TensorRT Version : 7.0.0.11
GPU Type : Tesla K80
Nvidia Driver Version : 440.64.00
CUDA Version : 10.2
CUDNN Version : 7.6
Operating System + Version : Ubuntu x86_64
Python Version (if applicable) : 3.6
TensorFlow Version (if applicable) : 1.15.2
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) : Baremetal

Relevant Files

https://drive.google.com/file/d/1HHLcNsCccdQCz98ErE0ci1zkhbdXYGiw/view?usp=sharing

Steps To Reproduce

~/notebook/head-pose-estimation$ convert-to-uff frozen_graph.pb
Warning: No conversion function registered for layer: TensorArrayGatherV3 yet.
Converting map/TensorArrayStack/TensorArrayGatherV3 as custom op: TensorArrayGatherV3
W0529 23:11:49.407045 139740564793152 module_wrapper.py:139] From /home/ubuntu/anaconda3/lib/python3.6/site-packages/uff/converters/tensorflow/converter.py:179: The name tf.AttrValue is deprecated. Please use tf.compat.v1.AttrValue instead.

Warning: No conversion function registered for layer: Exit yet.
Converting map/while/Exit_2 as custom op: Exit
Warning: No conversion function registered for layer: Switch yet.
Converting map/while/Switch_2 as custom op: Switch
Warning: No conversion function registered for layer: LoopCond yet.
Converting map/while/LoopCond as custom op: LoopCond
Warning: No conversion function registered for layer: LogicalAnd yet.
Converting map/while/LogicalAnd as custom op: LogicalAnd
Warning: No conversion function registered for layer: Less yet.
Converting map/while/Less_1 as custom op: Less
Warning: No conversion function registered for layer: Enter yet.
Converting map/while/Less/Enter as custom op: Enter
Warning: No conversion function registered for layer: Merge yet.
Converting map/while/Merge_1 as custom op: Merge
Warning: No conversion function registered for layer: NextIteration yet.
Converting map/while/NextIteration_1 as custom op: NextIteration
Warning: No conversion function registered for layer: Switch yet.
Converting map/while/Switch as custom op: Switch
Warning: No conversion function registered for layer: Merge yet.
Converting map/while/Merge as custom op: Merge
Warning: No conversion function registered for layer: NextIteration yet.
Converting map/while/NextIteration as custom op: NextIteration
Warning: No conversion function registered for layer: Enter yet.
Converting map/while/Enter as custom op: Enter
Warning: No conversion function registered for layer: Switch yet.
Converting map/while/Switch_1 as custom op: Switch
Warning: No conversion function registered for layer: Enter yet.
Converting map/while/Enter_1 as custom op: Enter
Warning: No conversion function registered for layer: Less yet.
Converting map/while/Less as custom op: Less
Warning: No conversion function registered for layer: Merge yet.
Converting map/while/Merge_2 as custom op: Merge
Warning: No conversion function registered for layer: NextIteration yet.
Converting map/while/NextIteration_2 as custom op: NextIteration
Warning: No conversion function registered for layer: TensorArrayWriteV3 yet.
Converting map/while/TensorArrayWrite/TensorArrayWriteV3 as custom op: TensorArrayWriteV3
Warning: No conversion function registered for layer: ResizeBilinear yet.
Converting map/while/resize/ResizeBilinear as custom op: ResizeBilinear
Warning: No conversion function registered for layer: TensorArrayReadV3 yet.
Converting map/while/TensorArrayReadV3 as custom op: TensorArrayReadV3
Warning: No conversion function registered for layer: Enter yet.
Converting map/while/TensorArrayReadV3/Enter_1 as custom op: Enter
Warning: No conversion function registered for layer: TensorArrayScatterV3 yet.
Converting map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 as custom op: TensorArrayScatterV3
Warning: No conversion function registered for layer: TensorArrayV3 yet.
Converting map/TensorArray as custom op: TensorArrayV3
Warning: No conversion function registered for layer: Range yet.
Converting map/TensorArrayUnstack/range as custom op: Range
Warning: No conversion function registered for layer: Enter yet.
Converting map/while/TensorArrayReadV3/Enter as custom op: Enter
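
The log above is long but repetitive. A minimal sketch for condensing it, assuming the converter output has been saved as text: the regular expression matches the "No conversion function registered for layer" warning format shown above, and `summarize_unsupported` is a hypothetical helper name, not part of the uff package.

```python
import re
from collections import Counter

# Matches the warning format printed by convert-to-uff in the log above.
WARNING_RE = re.compile(r"No conversion function registered for layer: (\w+) yet\.")


def summarize_unsupported(log_text: str) -> Counter:
    """Count how often each unsupported op type is reported in the log."""
    return Counter(WARNING_RE.findall(log_text))


# A few lines copied from the log above, used as sample input.
sample_log = """\
Warning: No conversion function registered for layer: Exit yet.
Converting map/while/Exit_2 as custom op: Exit
Warning: No conversion function registered for layer: Switch yet.
Converting map/while/Switch_2 as custom op: Switch
Warning: No conversion function registered for layer: Switch yet.
Converting map/while/Switch as custom op: Switch
Warning: No conversion function registered for layer: ResizeBilinear yet.
Converting map/while/resize/ResizeBilinear as custom op: ResizeBilinear
"""

for op, count in summarize_unsupported(sample_log).most_common():
    print(f"{op}: {count}")
```

In the posted log, every node converted as a custom op sits under the map/ scope, so a per-op summary like this mainly helps confirm that the unsupported ops are confined to that one subgraph.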


SunilJB · June 4, 2020, 7:43am

You can use graphsurgeon; please refer to the link below:

https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/samples/opensource/sampleUffFasterRCNN/config.py#L34

    "crop_and_resize_1/Reshape" : CropAndResize,
    'crop_and_resize_1/CropAndResize' : CropAndResize,
    "crop_and_resize_1/transpose" : CropAndResize,
    "crop_and_resize_1/transpose_1" : CropAndResize
    }


    def preprocess(dynamic_graph):
        # Now create a new graph by collapsing namespaces
        dynamic_graph.append(Proposal)
        dynamic_graph.remove('input_2')
        dynamic_graph.collapse_namespaces(namespace_plugin_map)

You can try TF-TRT to optimize the nodes.

Please refer to the API below in case it helps:
https://www.tensorflow.org/api_docs/python

I am not sure why this is required for TRT conversion; could you please elaborate?

I found the link below, which might be useful. For more details or issues, you can raise an issue in the TensorFlow GitHub repo:

Thanks

yaduvir.singh · June 6, 2020, 11:25am

@SunilJB I want to remove the training "map/*" nodes (control ops) from frozen_graph.pb and replace them with Reshape/Shape and Cast (uint8 to float32). As you said in another post, convert-to-uff is deprecated and ONNX should be used instead (which fails at engine load). graphsurgeon is only used in UFF conversion, so it is no longer a valid solution. I also tried tf2onnx.convert, but the engine load failed. I want to remove the training "map/*" nodes (20+ nodes, as you can see) from the original TF frozen_graph.pb because they are not required at inference time. As you suggested using the TF APIs, I used "optimize_for_inference", "remove_training_nodes" and "TransformGraph", but the training "map/*" nodes are still not removed.

============================================================================
import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph
from tensorflow.python.tools.optimize_for_inference_lib import optimize_for_inference

def optimize_graph(model_dir, graph_filename, transforms, output_node):
    input_node = 'image_tensor'
    with tf.compat.v1.Session() as sess:
        # shape=[1, ?, ?, 3] -> shape=[1, 128, 128, 3]
        # input_node is the placeholder name in the converted model
        inputs = tf.compat.v1.placeholder(tf.uint8, shape=[1, 128, 128, 3], name=input_node)
        with tf.io.gfile.GFile(graph_filename, 'rb') as f:
            graph_def = tf.compat.v1.GraphDef()
            # Read while the file is still open
            graph_def.ParseFromString(f.read())

        input_nodes = [inputs]
        # _get_node_name is a helper that is not shown in this post
        graph_def = optimize_for_inference(
            input_graph_def=graph_def,
            input_node_names=[_get_node_name(t) for t in input_nodes],
            output_node_names=[output_node],
            placeholder_type_enum=[node.dtype.as_datatype_enum for node in input_nodes]
        )

        print("Before:")
        print([n.name for n in graph_def.node])

        # 'image_tensor:0' is the placeholder name of the model before conversion
        tf.graph_util.import_graph_def(graph_def, input_map={'image_tensor:0': inputs}, name='')
        graph_def = tf.graph_util.remove_training_nodes(graph_def, protected_nodes=None)
        print([n for n in tf.compat.v1.get_default_graph().as_graph_def().node if n.name == input_node])

        optimized_graph_def = TransformGraph(
            graph_def,
            [input_node],
            [output_node],
            transforms)

        # describe_graph is a helper that is not shown in this post
        describe_graph(graph_def)
        describe_graph(optimized_graph_def)
        tf.io.write_graph(optimized_graph_def, model_dir, 'optimized_model.pb', as_text=False)

transforms = [
    'remove_nodes(op=Identity, op=CheckNumerics)',
    'merge_duplicate_nodes',
    'rename_attribute(old_attribute_name="layer6/logits/BiasAdd",new_attribute_name="logits/BiasAdd")',
    'strip_unused_nodes(type=uint8, shape="1,128,128,3")',
    'fold_constants(ignore_errors=true)',
    'fold_batch_norms',
    'fold_old_batch_norms',
    'obfuscate_names',
    'quantize_weights',
    'quantize_nodes',
    'strip_unused_nodes',
    'sort_by_execution_order'
]

optimize_graph("/home/ubuntu/notebook/head-pose-estimation/", "frozen_graph.pb", transforms, "layer6/logits/BiasAdd")

Output:

After:
[‘map/Shape’, ‘map/strided_slice/stack’, ‘map/strided_slice/stack_1’, ‘map/strided_slice/stack_2’, ‘map/strided_slice’, ‘map/TensorArray’, ‘map/TensorArrayUnstack/Shape’, ‘map/TensorArrayUnstack/strided_slice/stack’, ‘map/TensorArrayUnstack/strided_slice/stack_1’, ‘map/TensorArrayUnstack/strided_slice/stack_2’, ‘map/TensorArrayUnstack/strided_slice’, ‘map/TensorArrayUnstack/range/start’, ‘map/TensorArrayUnstack/range/delta’, ‘map/TensorArrayUnstack/range’, ‘map/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3’, ‘map/Const’, ‘map/TensorArray_1’, ‘map/while/iteration_counter’, ‘map/while/Enter’, ‘map/while/Enter_1’, ‘map/while/Enter_2’, ‘map/while/Merge’, ‘map/while/Merge_1’, ‘map/while/Merge_2’, ‘map/while/Less/Enter’, ‘map/while/Less’, ‘map/while/Less_1’, ‘map/while/LogicalAnd’, ‘map/while/LoopCond’, ‘map/while/Switch_1’, ‘map/while/Switch_2’, ‘map/while/Switch’, ‘map/while/add/y’, ‘map/while/add’, ‘map/while/TensorArrayReadV3/Enter’, ‘map/while/TensorArrayReadV3/Enter_1’, ‘map/while/TensorArrayReadV3’, ‘map/while/resize/ExpandDims/dim’, ‘map/while/resize/ExpandDims’, ‘map/while/resize/size’, ‘map/while/resize/ResizeBilinear’, ‘map/while/resize/Squeeze’, ‘map/while/TensorArrayWrite/TensorArrayWriteV3/Enter’, ‘map/while/TensorArrayWrite/TensorArrayWriteV3’, ‘map/while/add_1/y’, ‘map/while/add_1’, ‘map/while/NextIteration’, ‘map/while/NextIteration_1’, ‘map/while/NextIteration_2’, ‘map/while/Exit_2’, ‘map/TensorArrayStack/TensorArraySizeV3’, ‘map/TensorArrayStack/range/start’, ‘map/TensorArrayStack/range/delta’, ‘map/TensorArrayStack/range’, ‘map/TensorArrayStack/TensorArrayGatherV3’, ‘layer1/conv2d/kernel’, ‘layer1/conv2d/bias’, ‘layer1/conv2d/Conv2D’, ‘layer1/conv2d/BiasAdd’, ‘layer1/conv2d/Relu’, ‘layer1/max_pooling2d/MaxPool’, ‘layer2/conv2d/kernel’, ‘layer2/conv2d/bias’, ‘layer2/conv2d/Conv2D’, ‘layer2/conv2d/BiasAdd’, ‘layer2/conv2d/Relu’, ‘layer2/conv2d_1/kernel’, ‘layer2/conv2d_1/bias’, ‘layer2/conv2d_1/Conv2D’, ‘layer2/conv2d_1/BiasAdd’, 
‘layer2/conv2d_1/Relu’, ‘layer2/max_pooling2d/MaxPool’, ‘layer3/conv2d/kernel’, ‘layer3/conv2d/bias’, ‘layer3/conv2d/Conv2D’, ‘layer3/conv2d/BiasAdd’, ‘layer3/conv2d/Relu’, ‘layer3/conv2d_1/kernel’, ‘layer3/conv2d_1/bias’, ‘layer3/conv2d_1/Conv2D’, ‘layer3/conv2d_1/BiasAdd’, ‘layer3/conv2d_1/Relu’, ‘layer3/max_pooling2d/MaxPool’, ‘layer4/conv2d/kernel’, ‘layer4/conv2d/bias’, ‘layer4/conv2d/Conv2D’, ‘layer4/conv2d/BiasAdd’, ‘layer4/conv2d/Relu’, ‘layer4/conv2d_1/kernel’, ‘layer4/conv2d_1/bias’, ‘layer4/conv2d_1/Conv2D’, ‘layer4/conv2d_1/BiasAdd’, ‘layer4/conv2d_1/Relu’, ‘layer4/max_pooling2d/MaxPool’, ‘layer5/conv2d/kernel’, ‘layer5/conv2d/bias’, ‘layer5/conv2d/Conv2D’, ‘layer5/conv2d/BiasAdd’, ‘layer5/conv2d/Relu’, ‘layer6/flatten/Shape’, ‘layer6/flatten/strided_slice/stack’, ‘layer6/flatten/strided_slice/stack_1’, ‘layer6/flatten/strided_slice/stack_2’, ‘layer6/flatten/strided_slice’, ‘layer6/flatten/Reshape/shape/1’, ‘layer6/flatten/Reshape/shape’, ‘layer6/flatten/Reshape’, ‘layer6/dense/kernel’, ‘layer6/dense/bias’, ‘layer6/dense/MatMul’, ‘layer6/dense/BiasAdd’, ‘layer6/dense/Relu’, ‘layer6/logits/kernel’, ‘layer6/logits/bias’, ‘layer6/logits/MatMul’, ‘layer6/logits/BiasAdd’, ‘image_tensor’]
Input Feature Nodes: ['image_tensor']
Unused Nodes:
Output Nodes:
Quantization Nodes:
Constant Count: 40
Variable Count: 0
Identity Count: 1
Total nodes: 118

Input Feature Nodes: ['image_tensor']
Unused Nodes:
Output Nodes:
Quantization Nodes:
Constant Count: 40
Variable Count: 0
Identity Count: 0
Total nodes: 117

=========================================================================

Please let me know if any step is missing in my code for removing the training "map/*" nodes.
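
One possible reading of the output above: pruning passes keep every node that the requested output depends on, and here the `map/*` nodes sit between `image_tensor` and the first layer, so they stay reachable and are kept. The node names (for example `map/while/resize/ResizeBilinear`) suggest a per-image resize loop rather than training-only ops, though that is an inference from the names and is not confirmed in the thread. The sketch below illustrates the idea on a simplified plain-Python stand-in graph: rewire the consumer of the `map/` output to a new input first, then prune by reachability. `Node`, `rewire` and `prune` are hypothetical helpers, not TensorFlow or graphsurgeon APIs, and both the toy wiring and the new `input` node name are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a NodeDef (name, op type, input names)."""
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)


def base_name(ref: str) -> str:
    # Drop a leading '^' (control edge) and a trailing ':N' (output index).
    return ref.lstrip("^").split(":")[0]


def prune(nodes: List[Node], outputs: List[str]) -> List[Node]:
    """Keep only the nodes that the requested outputs depend on."""
    by_name: Dict[str, Node] = {n.name: n for n in nodes}
    keep, stack = set(), list(outputs)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(base_name(i) for i in by_name[name].inputs)
    return [n for n in nodes if n.name in keep]


def rewire(nodes: List[Node], old_name: str, new_node: Node) -> List[Node]:
    """Point every consumer of `old_name` at `new_node` and add `new_node`."""
    def swap(ref: str) -> str:
        if base_name(ref) != old_name:
            return ref
        # Preserve a control-edge marker if the original reference had one.
        return ("^" if ref.startswith("^") else "") + new_node.name

    rewired = [Node(n.name, n.op, [swap(i) for i in n.inputs]) for n in nodes]
    return rewired + [new_node]


# Simplified toy graph; the real wiring in frozen_graph.pb may differ.
graph = [
    Node("image_tensor", "Placeholder"),
    Node("map/while/Switch", "Switch", ["image_tensor"]),
    Node("map/while/resize/ResizeBilinear", "ResizeBilinear", ["map/while/Switch"]),
    Node("map/TensorArrayStack/TensorArrayGatherV3", "TensorArrayGatherV3",
         ["map/while/resize/ResizeBilinear"]),
    Node("layer1/conv2d/Conv2D", "Conv2D", ["map/TensorArrayStack/TensorArrayGatherV3"]),
]
output = "layer1/conv2d/Conv2D"

# Pruning alone removes nothing: every map/ node feeds the output.
print(len(prune(graph, [output])))

# After rewiring the first conv layer to a new input, the map/ nodes
# become unreachable and pruning drops them.
new_input = Node("input", "Placeholder")  # assumed name for the new input
rewired = rewire(graph, "map/TensorArrayStack/TensorArrayGatherV3", new_input)
print([n.name for n in prune(rewired, [output])])
```

With a real frozen graph, the pruning step corresponds to `tf.compat.v1.graph_util.extract_sub_graph(graph_def, [output_node])`; the rewiring step would edit the `input` field of the consuming node. If the bypassed subgraph really is a resize, the new input would presumably have to be fed images that are already resized and converted to float.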

yaduvir.singh · June 6, 2020, 1:42pm

@SunilJB Here is the link to the Jupyter notebook.

TransformGraph notebook:

I want to remove the training nodes (map/*) using TransformGraph. The options I used are below. Other than the input shape change and the "Identity" node, no other nodes were removed, e.g. 'Switch', 'Exit', 'Add'.

Which options should I choose in TransformGraph() to remove the 'Switch', 'Exit' and 'Add' nodes?
I used optimize_for_inference() and remove_training_nodes() without success.

Saved_model:

Freeze_graph.pb:

SunilJB · June 9, 2020, 12:16pm

This issue doesn't seem to be related to TensorRT.
I would request that you raise an issue in the TensorFlow GitHub issues section:

Thanks
