Hi,
I’m using TF-Slim to build an InceptionV3 model and want to convert it into a TF-TRT graph.
If I build the model with is_training=True, the conversion to a TF-TRT graph fails with a "Not supported constant type" error.
When I set is_training=False instead, I get no error message and the conversion goes smoothly.
PC: Ubuntu 16.04
GPU: 1080Ti
TF: 1.7.0
CUDA: 9.0
TensorRT: 4.1.2
import sys
import os
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
import numpy as np
import tensorflow.contrib.slim as slim

# make the TF-Slim model definitions importable
sys.path.append('/home/user/Desktop/python_code/model_test/models/research/slim')
from nets import inception, inception_v3

### critical setting: True reproduces the error, False converts without error
is_training = True

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # the original line had no assignment, so it had no effect

input_name = 'input'
num_classes = 1001
output_name = 'prediction'
checkpoint = 'inception_v3.ckpt'

with tf.Graph().as_default() as tf_graph:
    with tf.Session(config=config) as tf_sess:
        tf_input = tf.placeholder(tf.float32, [None, 299, 299, 3],
                                  name=input_name)
        with slim.arg_scope(inception.inception_v3_arg_scope()):
            with slim.arg_scope([slim.batch_norm], is_training=is_training):
                tf_net, tf_end_points = inception.inception_v3(
                    tf_input, is_training=is_training,
                    num_classes=num_classes)
        tf_output = tf.nn.softmax(tf_net, name=output_name)

        # load checkpoint
        tf_saver = tf.train.Saver()
        tf_saver.restore(save_path=checkpoint, sess=tf_sess)

        # freeze graph
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            tf_sess,
            tf_sess.graph_def,
            output_node_names=[output_name]
        )

# convert the frozen graph to a TF-TRT graph
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=[output_name],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP32',
    minimum_segment_size=50
)
Error message:
2019-01-17 17:41:31.726424: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-01-17 17:41:31.804534: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-01-17 17:41:31.804836: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 10.91GiB freeMemory: 10.55GiB
2019-01-17 17:41:31.804864: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2019-01-17 17:41:32.063939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-17 17:41:32.063971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2019-01-17 17:41:32.063977: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2019-01-17 17:41:32.064162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10187 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-01-17 17:41:34.839673: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2019-01-17 17:41:35.341416: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2624] Max batch size= 1 max workspace size= 33554432
2019-01-17 17:41:35.341461: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2630] starting build engine
2019-01-17 17:42:06.791967: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2635] Built network
2019-01-17 17:42:07.663609: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2640] Serialized engine
2019-01-17 17:42:07.703502: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2648] finished engine InceptionV3/my_trt_op0 containing 801 nodes
2019-01-17 17:42:07.703563: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2668] Finished op preparation
2019-01-17 17:42:07.738476: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2676] OK finished op building
2019-01-17 17:42:53.465747: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2019-01-17 17:42:53.465796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-17 17:42:53.465816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2019-01-17 17:42:53.465820: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2019-01-17 17:42:53.465920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10187 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-01-17 17:42:57.789125: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2019-01-17 17:42:58.150488: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:412] subgraph conversion error for subgraph_index:0 due to: "Unimplemented: Not supported constant type, at InceptionV3/InceptionV3/Mixed_7c/Branch_3/Conv2d_0b_1x1/BatchNorm/Const_2" SKIPPING......( 796 nodes)
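To narrow this down, the node named in the warning could be looked up in the frozen graph to see what kind of constant the converter rejects. Below is a minimal sketch of that idea. The helper names failing_node_name and describe_node are made up for illustration and are not part of TensorFlow or TensorRT; the only assumption about the graph is that it is a GraphDef-like object whose nodes have name, op, attr and input fields.

```python
import re

# Matches the TF-TRT warning text seen in the log above and captures the node name.
_WARNING_RE = re.compile(r'Not supported constant type, at (?P<node>[^"\s]+)')


def failing_node_name(log_line):
    """Return the node name from a 'Not supported constant type' warning, or None."""
    match = _WARNING_RE.search(log_line)
    return match.group("node") if match else None


def describe_node(graph_def, node_name):
    """Return (op, dtype enum value or None, inputs) for the named node, or None if absent.

    graph_def is assumed to be GraphDef-like: an object with a .node sequence
    whose items expose .name, .op, .attr (a mapping) and .input.
    """
    for node in graph_def.node:
        if node.name == node_name:
            dtype = node.attr["dtype"].type if "dtype" in node.attr else None
            return node.op, dtype, list(node.input)
    return None


# Hypothetical usage with the script above (not run here):
#   name = failing_node_name(warning_line)
#   print(describe_node(frozen_graph, name))
```

The dtype value returned is the raw DataType enum number stored in the node's "dtype" attribute, so it would still need to be mapped to a type name (for example with tf.as_dtype) to be readable.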