I’m trying to use TF-TRT to run inference on object detection networks on the Jetson AGX Xavier Developer Kit. To understand how different models perform on the Xavier, I tried to benchmark all of the models from the Object Detection Model Zoo: I downloaded each model and converted it using the code linked below. However, the performance numbers I get do not seem consistent with the DLA inference benchmarks I’ve seen (also linked below). The numbers were measured by timing the session.run() call in TensorFlow, running on the Xavier in MAX_N mode with all clocks maxed (after running jetson_clocks.sh). My hypothesis is that either the conversion step or the inference step is not making use of the DLA chips, and is instead utilizing the GPU for the entire inference stage. Based on that, I have a few questions:
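For reference, the timing measurement is roughly the following (a minimal sketch; `run_fn` is a hypothetical stand-in for the actual `session.run(outputs, feed_dict=...)` call):

```python
import time

def benchmark(run_fn, warmup=10, iters=50):
    """Time repeated calls to run_fn; return (mean latency in seconds, FPS).

    run_fn is a placeholder for the real inference call; in the actual
    setup it would be a closure over session.run().
    """
    # Discard warm-up iterations (graph initialization, engine build, etc.)
    for _ in range(warmup):
        run_fn()
    start = time.perf_counter()
    for _ in range(iters):
        run_fn()
    mean_latency = (time.perf_counter() - start) / iters
    return mean_latency, 1.0 / mean_latency
```

The warm-up runs matter here: the first few session.run() calls include one-time setup cost and would otherwise skew the average.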
- Is there a way to verify my hypothesis? That is, determine whether tf-trt is executing the model on the DLA or on the GPU?
- Assuming the GPU is being used, is there a way to convert models such that the DLA is used as much as possible?
- If it is not possible to use the DLA with tf-trt yet, is there an ETA on when support may be enabled?
- If it is not possible to use the DLA with TF-TRT yet, what is the recommended way of running TensorFlow object detection models with the DLA? As this is still early-access hardware, with lots of things in flux, I’m not sure what the current best practice is.
- Another thing to note is that only some of the models could be converted with the conversion script (which just calls trt.create_inference_graph internally). Batch sizes greater than 1 only seem to be supported on SSD topologies. Is this expected?
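For context, the conversion step boils down to a call like the one below (a minimal sketch, assuming the TF 1.x tensorflow.contrib.tensorrt interface; the pick_max_batch_size helper is hypothetical and just encodes the batch-size behavior I observed, and the workspace size and segment size values are illustrative):

```python
def pick_max_batch_size(model_name, requested):
    """Hypothetical helper reflecting what I observed: batch size > 1
    only converted successfully on SSD topologies."""
    if "ssd" in model_name.lower():
        return requested
    return 1  # all other topologies only converted with batch size 1

def convert(frozen_graph_def, output_names, model_name, batch_size=8):
    """Sketch of the TF-TRT conversion call made by the script."""
    import tensorflow.contrib.tensorrt as trt  # TF 1.x TF-TRT module
    return trt.create_inference_graph(
        input_graph_def=frozen_graph_def,
        outputs=output_names,
        max_batch_size=pick_max_batch_size(model_name, batch_size),
        max_workspace_size_bytes=1 << 30,  # 1 GiB, illustrative
        precision_mode="FP16",             # half precision on Xavier
        minimum_segment_size=3)            # illustrative value
```

Nothing in this call mentions the DLA, which is part of why I suspect the converted engines run entirely on the GPU.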
Object Detection Model Zoo - https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Conversion code - http://github.com/NVIDIA-AI-IOT/tf_trt_models
Performance numbers - https://docs.google.com/spreadsheets/d/1GFCVk90xP1oYUKLy_ESeW46RkXGARG7U_25zokN7K7A/edit?usp=sharing
Benchmark 1 - https://developer.nvidia.com/embedded/jetson-agx-xavier-dl-inference-benchmarks
Benchmark 2 - https://www.phoronix.com/scan.php?page=article&item=nvidia-jetson-xavier&num=2