My current setup is:
CUDA 10.0
cuDNN 7.6
TensorFlow 1.14
TensorRT 5.0
Every time I try to create an inference engine with TensorRT, nothing changes: no TensorRT engine nodes are created in the converted graph.
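# imports the snippet relies on (tf, trt, gfile)
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
from tensorflow.python.platform import gfile

# function to read a ".pb" model
# (can be used to read a frozen model or a TensorRT model)
def read_pb_graph(model):
    with gfile.FastGFile(model, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def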
frozen_graph = read_pb_graph("./model/YOLOv3/yolov3_gpu_nms.pb")
your_outputs = ["Placeholder:0", "concat_9:0", "mul_9:0"]
# convert (optimize) frozen model to TensorRT model
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,  # frozen model
    outputs=your_outputs,
    max_batch_size=1,  # specify your max batch size
    max_workspace_size_bytes=2*(10**9),  # specify the max workspace
    precision_mode="FP16")  # precision, can be "FP32" (32-bit floating point) or "FP16"
# write the TensorRT model to be used later for inference
with gfile.FastGFile("./model/YOLOv3/TensorRT_YOLOv3_2.pb", 'wb') as f:
    f.write(trt_graph.SerializeToString())
print("TensorRT model is successfully stored!")
# count the ops in the original frozen model
all_nodes = len(frozen_graph.node)
print("numb. of all_nodes in frozen graph:", all_nodes)
# count the ops that were converted to TensorRT engines
trt_engine_nodes = len([1 for n in trt_graph.node if str(n.op) == 'TRTEngineOp'])
print("numb. of trt_engine_nodes in TensorRT graph:", trt_engine_nodes)
# count all ops in the converted graph
all_nodes = len(trt_graph.node)
print("numb. of all_nodes in TensorRT graph:", all_nodes)
The above code is taken from https://github.com/ardianumam/Tensorflow-TensorRT
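To see what the converted graph actually contains, the single TRTEngineOp count in the code above can be extended to a per-op histogram. This is only a minimal sketch: `count_ops` is a hypothetical helper (not part of TensorFlow), and it assumes nothing beyond each node exposing an `op` attribute, as the nodes in `trt_graph.node` do. The `FakeNode` entries are stand-ins so the snippet runs without a model.

```python
from collections import Counter, namedtuple


def count_ops(nodes):
    """Return a Counter mapping op type -> number of nodes with that op."""
    return Counter(str(n.op) for n in nodes)


# Stand-in nodes for illustration only; with a real graph, pass trt_graph.node.
FakeNode = namedtuple("FakeNode", ["name", "op"])
demo_nodes = [
    FakeNode("conv_a", "Conv2D"),
    FakeNode("conv_b", "Conv2D"),
    FakeNode("act_a", "Relu"),
]

demo_counts = count_ops(demo_nodes)
# Most frequent op types first.
print(demo_counts.most_common())
# A missing op type simply counts as 0.
print("TRTEngineOp nodes:", demo_counts.get("TRTEngineOp", 0))
```

Running `count_ops(frozen_graph.node)` and `count_ops(trt_graph.node)` side by side would show whether the two graphs differ at all after conversion.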
I have seen the same code optimize the graph on a different machine, but on the new machine, which has the packages listed above, the number of TRTEngineOp nodes I get is 0.
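Since the difference is between machines, one possible check is whether the TensorRT runtime library is discoverable on the new one. The sketch below uses only the Python standard library. It assumes the runtime library is named `nvinfer` (i.e. `libnvinfer` on Linux), and `find_tensorrt_library` is a hypothetical helper name. A `None` result would only be a hint, not proof of the cause.

```python
import ctypes.util


def find_tensorrt_library(name="nvinfer"):
    """Return the library file name the system linker resolves, or None."""
    return ctypes.util.find_library(name)


lib = find_tensorrt_library()
if lib is None:
    print("libnvinfer was not found on the default library search path")
else:
    print("libnvinfer resolved to:", lib)
```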
Does anyone have any idea why the optimization fails? Any help would be appreciated, thanks.