TensorRT optimization for object detection: no improvement in inference speed

Running https://github.com/NVIDIA-AI-IOT/tf_trt_models/blob/master/examples/detection/detection.ipynb in the nvidia/tensorflow:19.01-py3 container on a Tesla V100 (16 GB):

Running the notebook with

`MODEL = 'ssd_mobilenet_v2_coco'`

and evaluating with both `precision_mode='FP32'` and `precision_mode='FP16'`:
in both cases the reported "Average runtime" sits at around 14-15 ms, showing no significant improvement.
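
For reference, this is roughly the conversion step I'm running, condensed from the notebook (a sketch; `download_detection_model` and `build_detection_graph` are this repo's helpers, and the only thing changed between the two measurements is the `precision_mode` value):

```python
import tensorflow.contrib.tensorrt as trt  # TF 1.x contrib API in the 19.01 container

from tf_trt_models.detection import download_detection_model, build_detection_graph

# Download the SSD model and rebuild its frozen graph for inference.
config_path, checkpoint_path = download_detection_model('ssd_mobilenet_v2_coco', 'data')
frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path,
)

# Convert with TF-TRT; only precision_mode differs between the two runs.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',  # or 'FP32' -- both end up at ~14-15 ms average runtime
    minimum_segment_size=50,
)
```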

Running the provided TensorRT optimization samples for image classification works just fine, improving inference time by ~3-4x.
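
In both experiments the "Average runtime" is measured with a plain timing loop over session runs, roughly like the sketch below (assuming the `input_names`/`output_names` returned by `build_detection_graph`; warm-up runs are excluded so lazy TRT engine building doesn't skew the average):

```python
import time

import numpy as np
import tensorflow as tf


def average_runtime_ms(graph_def, input_name, output_names, runs=50, warmup=10):
    """Load a frozen GraphDef and report the mean per-image inference time in ms."""
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name='')
    # SSD Mobilenet V2 takes a uint8 image batch; 300x300 is its native input size.
    image = np.random.randint(0, 256, size=(1, 300, 300, 3), dtype=np.uint8)
    with tf.Session(graph=graph) as sess:
        fetches = [graph.get_tensor_by_name(n + ':0') for n in output_names]
        feed = {graph.get_tensor_by_name(input_name + ':0'): image}
        for _ in range(warmup):  # discard warm-up runs (TRT engines build lazily)
            sess.run(fetches, feed_dict=feed)
        start = time.time()
        for _ in range(runs):
            sess.run(fetches, feed_dict=feed)
        return (time.time() - start) / runs * 1000.0


# Usage: average_runtime_ms(trt_graph, input_names[0], output_names)
```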

Does TensorRT have some issues with object detection that are not apparent in plain image classification?