Environment:
Ubuntu 16.04
tensorflow-gpu 1.10.0
python 3.6.6
TensorRT 4 installed
I am trying to convert my Keras model into a TensorRT-optimized graph. I believe I have correctly frozen the Keras model, and I can see that the frozen graph holds numerous nodes. However, when it goes through the create_inference_graph function, the returned GraphDef is empty.
The function call takes about 15-20 seconds, so I am guessing that something is running, but it still returns an empty GraphDef.
Can anyone suggest ideas for tackling this problem?
Here is the code that I ran, and the result.
import tensorflow as tf
from tensorflow.contrib import tensorrt as tftrt
from keras import backend as K  # needed for K.get_session() below

model = build_model()
model.load_weights(weight_file_path)

# extract the tf.GraphDef from the Keras model's session
with K.get_session() as sess:
    graph = sess.graph.as_graph_def()
    input_node_name = "input_1"
    output_node_name = "predictions/concat"

    # freeze the graph: convert variables to constants
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess,
        graph,
        output_node_names=[input_node_name, output_node_name])

print("frozen graph node size:{}".format(len(frozen_graph.node)))
print("creating trt graph...")
tftrt_graph = tftrt.create_inference_graph(
    frozen_graph,
    outputs=[output_node_name],
    max_workspace_size_bytes=1 << 25,
    precision_mode="FP32")
print("tftrt_graph: {}".format(tftrt_graph))
print("tftrt_graph nodes:{}".format(tftrt_graph.node))
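One sanity check I also ran was to verify that the output node name I pass to the converter really exists in the frozen graph, since asking the converter to keep a node that is not there could plausibly leave nothing in the result. A minimal, framework-free sketch (the helper name and the sample node list below are made up for illustration):

```python
# Hypothetical helper: report which requested output names are
# missing from a list of graph node names.
def find_missing_outputs(node_names, requested_outputs):
    present = set(node_names)
    return [name for name in requested_outputs if name not in present]

# With a real frozen graph the list would come from:
#   node_names = [n.name for n in frozen_graph.node]
# Here a stand-in list is used for illustration.
node_names = ["input_1", "conv1/kernel", "predictions/concat"]

print(find_missing_outputs(node_names, ["predictions/concat"]))   # []
print(find_missing_outputs(node_names, ["predictions/Softmax"]))  # ['predictions/Softmax']
```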
output:
Using TensorFlow backend.
trying to load model
2019-03-18 16:00:26.273710: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-18 16:00:27.793146: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 11.90GiB freeMemory: 11.74GiB
2019-03-18 16:00:27.793181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2019-03-18 16:00:28.114254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-18 16:00:28.114293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0
2019-03-18 16:00:28.114299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0: N
2019-03-18 16:00:28.114574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11334 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:09:00.0, compute capability: 6.1)
WARNING:tensorflow:From /data/chadrick/penv/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3148: calling l2_normalize (from tensorflow.python.ops.nn_impl) with dim is deprecated and will be removed in a future version.
Instructions for updating:
dim is deprecated, use axis instead
frozen graph node size:4030
creating trt graph...
WARNING:tensorflow:TensorRT mismatch. Compiled against version 3.0.4, but loaded 4.0.1. Things may not work
2019-03-18 16:00:44.182601: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
tftrt_graph: library {
}
versions {
}
tftrt_graph nodes:[]