Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): Jetson AGX Xavier
• DeepStream Version: 5.0
• JetPack Version (valid for Jetson only): 4.4
• TensorRT Version: 10.2
• Issue Type (questions, new requirements, bugs): question
I trained two YOLOv3 models using NVIDIA's TLT; both are pruned and retrained. The main difference between them is the backbone network. I then checked the performance of the engines created by DeepStream using trtexec; these are the results:
model: yolo_mobilenet_v2_epoch_052.etlt (3.9 MB)
engine: yolo_mobilenet_v2_epoch_052.etlt_b2_gpu0_fp16.engine (14.8 MB)
batch size: 2
input size: 1152x1440
Jetson AGX (boosted clocks): mean: 23.2028 ms (~86.1388 FPS) | FP16
model: yolo_resnet10_epoch_033.etlt (4.5 MB)
engine: yolo_resnet10_epoch_033.etlt_b2_gpu0_fp16.engine (8.0 MB)
batch size: 2
input size: 1152x1440
Jetson AGX (boosted clocks): mean: 17.6101 ms (~113.512 FPS) | FP16
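For reference, the FPS figures above are consistent with throughput ≈ batch size × 1000 / mean latency (ms). A small sanity check, using the latencies from the measurements above (the helper function name is my own):

```python
def fps_from_latency(batch_size, mean_latency_ms):
    """Images per second implied by a mean per-batch latency in milliseconds."""
    return batch_size * 1000.0 / mean_latency_ms

# mobilenet_v2 backbone: mean 23.2028 ms at batch size 2
fps_mobilenet = fps_from_latency(2, 23.2028)

# resnet10 backbone: mean 17.6101 ms at batch size 2
fps_resnet = fps_from_latency(2, 17.6101)

print(f"mobilenet_v2: {fps_mobilenet:.1f} FPS")
print(f"resnet10:     {fps_resnet:.1f} FPS")
```

Both computed values land within about 0.1 FPS of the numbers trtexec reported, so the throughputs above really do reflect the batch-2 latencies.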
As expected, the model with the mobilenet_v2 backbone yields a smaller .etlt file. Why, then, is the engine generated by DeepStream smaller for the model with the resnet10 backbone, and why does that model also run inference faster?