No speedup on TX2 when using TensorRT to accelerate my trained model

Hello,

Linux distro and version: Ubuntu 16.04
GPU type: NVIDIA TX2
NVIDIA driver version: JetPack 3.2
CUDA version: 9.0
cuDNN version: 7.0.5
Python version [if using python]: 2.7
TensorFlow version: 1.8
TensorRT version: TensorRT 3.0 GA

Describe the problem
I trained a TensorFlow model on a server and deployed it to the GPU of a TX2, where it takes about 0.14 s to process one frame. I then converted the model with TensorRT following this guide: https://github.com/NVIDIA-AI-IOT/tf_trt_models, but there is no speedup: it still takes 0.14 s per frame. Before measuring, I set the board to maximum performance:
sudo nvpmodel -m 0
sudo ~/jetson_clocks.sh
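One thing worth ruling out (an assumption on my part, not confirmed from the post): TF-TRT builds its TensorRT engines lazily on the first `sess.run`, so a measurement that includes the first few runs absorbs that one-time cost. A minimal timing sketch with warm-up iterations (`time_inference` and its parameters are illustrative names, not part of any API):

```python
import time

def time_inference(run_once, warmup=10, iters=50):
    # The first runs include one-time costs (graph optimization, and with
    # TF-TRT the TensorRT engine build happens on the first sess.run), so
    # discard them before measuring steady-state latency.
    for _ in range(warmup):
        run_once()
    start = time.time()
    for _ in range(iters):
        run_once()
    return (time.time() - start) / iters

# With the real model this would be something like:
#   avg = time_inference(lambda: sess.run(logits, feed_dict={placeholder: frame}))
```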
This is the relevant part of my code:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF-TRT integration in TF 1.x

config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# Freeze the trained checkpoint into a constant GraphDef
with tf.Graph().as_default() as tf_graph:
    with tf.Session(config=config) as sess:
        saver = tf.train.import_meta_graph('model/model.meta')
        saver.restore(sess, tf.train.latest_checkpoint('model/'))

        frozen_graph = tf.graph_util.convert_variables_to_constants(
            sess,
            sess.graph_def,
            output_node_names=['network/Squeeze']
        )

# Convert the frozen graph with TensorRT
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=['network/Squeeze'],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)

with open('trt_graph/trt_graph.pb', 'wb') as writer:
    writer.write(trt_graph.SerializeToString())

# Load the converted graph for inference
self.sess = tf.Session(config=config)
with tf.gfile.GFile('trt_graph/trt_graph.pb', 'rb') as f:
    self.graph_def = tf.GraphDef()
    self.graph_def.ParseFromString(f.read())
with self.sess.graph.as_default():
    tf.import_graph_def(self.graph_def)
self.placeholder = self.sess.graph.get_tensor_by_name('import/input/input:0')
self.logits = self.sess.graph.get_tensor_by_name('import/network/Squeeze:0')
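One likely cause of the unchanged latency: with `minimum_segment_size=50`, TF-TRT only converts subgraphs of at least 50 nodes, and a small network may contain no such segment, in which case nothing is converted at all. A quick sanity check is to count the `TRTEngineOp` nodes in the returned `trt_graph`. A sketch (the stand-in `namedtuple` graph below is only for illustration; a real `tf.GraphDef` exposes the same `.node`/`.op` shape):

```python
from collections import namedtuple

def count_trt_engines(graph_def):
    # TF-TRT replaces each converted subgraph with a single node whose op
    # is 'TRTEngineOp'. If this count is 0, nothing was converted and the
    # model still runs entirely on plain TensorFlow, so identical timings
    # are expected.
    return sum(1 for node in graph_def.node if node.op == 'TRTEngineOp')

# Stand-in for a real tf.GraphDef, for illustration only:
Node = namedtuple('Node', ['op'])
Graph = namedtuple('Graph', ['node'])
demo = Graph(node=[Node('Conv2D'), Node('TRTEngineOp'), Node('Relu')])
print(count_trt_engines(demo))  # -> 1
```

If the count on the real `trt_graph` is 0, lowering `minimum_segment_size` (e.g. to 3-5) should let TF-TRT convert smaller segments.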