I use this code:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
from tensorflow.python.compiler.tensorrt.trt_convert import (
    TrtPrecisionMode, TrtGraphConverter)
from tensorflow.python.compiler.tensorrt import trt_convert

config = tf.ConfigProto(
    gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
)
ig = trt_convert.create_inference_graph(
    input_graph_def=None,
    outputs=[],
    input_saved_model_dir='/home/eric/cr2/output/export/1554095533/',
    precision_mode='FP32',
    max_workspace_size_bytes=8 * 1024 * 1024 * 1024,
    session_config=config,
    minimum_segment_size=1,
    max_batch_size=1,
    is_dynamic_op=True,
    output_saved_model_dir='trt13',
)
I get this error:
ValueError: Input 0 of node read_225/RefEnter was passed float from transformer/body/decoder/layer_0/self_attention/layer_prepostprocess/layer_norm/layer_norm_scale:0 incompatible with expected float_ref.
Also, this error mentions a layer norm node, but I get the same error about a different node after removing all normalization layers.
I reduced the code to the following:
import tensorflow.contrib.tensorrt as trt

trt.create_inference_graph(
    None,
    None,
    input_saved_model_dir='/home/eric/cr2/output/export/1554095533/',
    output_saved_model_dir='trt23')
Still getting the same error. Tested with TensorRT 5.0.2.6.
This error comes from a bug in TensorFlow's graph-freezing path: after `convert_variables_to_constants` turns variable reads into plain `float` tensors, the graph can still contain ref-typed control-flow ops (such as `RefEnter`) that expect a `float_ref` input, which produces exactly this type mismatch. It most often shows up when converting a model that was created with a different TensorFlow version, but it can also appear seemingly at random.

You can work around it by manually rewriting the offending nodes in the frozen graph:
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert
from tensorflow.python.saved_model import tag_constants

# saved_model_dir and out_names (comma-separated output node names)
# are assumed to be defined for your model

# Load saved model and freeze graph
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tag_constants.SERVING], saved_model_dir)
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, out_names.split(','))

# Change all RefEnter ops to Enter so the frozen graph type-checks
for n in frozen_graph.node:
    if n.op == 'RefEnter':
        n.op = 'Enter'

# Convert using TF-TRT
trt_convert.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=out_names.split(','),
    max_batch_size=1,
    max_workspace_size_bytes=8 * 1024 * 1024 * 1024,
    precision_mode='FP32',
    minimum_segment_size=1,
    is_dynamic_op=True,
    output_saved_model_dir='trt13')
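Depending on the model, `RefEnter` may not be the only ref-typed op left behind by freezing; ops like `RefSwitch` or `RefIdentity` can trigger the same "passed float ... expected float_ref" error. A minimal sketch of a more general cleanup pass, assuming the standard TF 1.x ref op names and their value-typed counterparts (the helper name `deref_graph_nodes` is my own, not a TensorFlow API):

```python
# Map ref-typed control-flow ops to their value-typed counterparts.
# These op names exist in TF 1.x graphs; extend the map if your graph
# contains other ref ops.
REF_OP_FIXES = {
    'RefEnter': 'Enter',
    'RefSwitch': 'Switch',
    'RefExit': 'Exit',
    'RefNextIteration': 'NextIteration',
    'RefMerge': 'Merge',
    'RefIdentity': 'Identity',
}

def deref_graph_nodes(nodes):
    """Rewrite ref-typed ops in-place to their value-typed versions.

    `nodes` is any iterable of objects with a mutable `op` attribute
    (e.g. the `node` field of a frozen GraphDef). Returns how many
    nodes were rewritten.
    """
    fixed = 0
    for n in nodes:
        if n.op in REF_OP_FIXES:
            n.op = REF_OP_FIXES[n.op]
            fixed += 1
    return fixed
```

Call it as `deref_graph_nodes(frozen_graph.node)` right after freezing, in place of the single `RefEnter` loop above.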