I converted a MobileNet v1/v2 TensorFlow model with trt.create_inference_graph(), then ran inference through TensorFlow-TensorRT.
However, the performance improvement is small: only ~10% faster inference time (FLOPs/second: 104.72B).
(For other networks such as Inception v2, the improvement is higher.)
So I did some investigation:
Most of the computation in MobileNet v1/v2 is 1x1 convolution, and 1x1 convolution is memory-friendly in the NHWC data format, but TensorRT supports the NCHW data format.
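To illustrate why NHWC suits 1x1 conv: under NHWC, each pixel's C_in channel values are contiguous in memory, so a 1x1 convolution is just a matrix multiply of each pixel's channel vector by a [C_in, C_out] weight matrix, with fully sequential reads. The sketch below is a plain-Python toy (not TensorRT or TensorFlow code) with made-up shapes, purely to show the equivalence:

```python
# Toy sketch: a 1x1 convolution over an NHWC tensor reduces to a per-pixel
# matmul over the channel vector, which NHWC stores contiguously.
# Shapes and values are invented for illustration only.

def conv1x1_nhwc(x, w):
    """x: [H][W][C_in] nested lists, w: [C_in][C_out]. Returns [H][W][C_out]."""
    out = []
    for row in x:
        out_row = []
        for pixel in row:  # `pixel` is the contiguous C_in channel vector
            out_row.append([
                sum(pixel[c] * w[c][k] for c in range(len(pixel)))
                for k in range(len(w[0]))
            ])
        out.append(out_row)
    return out

# Tiny example: H=W=2, C_in=3, C_out=2
x = [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [1, 0, 1]]]
w = [[1, 0],
     [0, 1],
     [1, 1]]
print(conv1x1_nhwc(x, w))
# → [[[4, 5], [10, 11]], [[16, 17], [2, 1]]]
```

Under NCHW, the same per-pixel channel vector is strided by H*W elements, so the equivalent matmul reads scattered memory, which is the locality cost I suspect is hurting MobileNet here.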
In <<TensorRT-Developer-Guide 5.pdf>>, the supported ops include Conv2D and DepthwiseConv2dNative. I guess a 1x1 conv is treated as a generic Conv2D and is not optimized specifically.
Since 1x1 conv is widely used in current network models, could TensorRT add an optimization for it?
For example, it might benefit from supporting the NHWC data format.