I checked the results of face and people detection in NGC, and the results are strange to me.
FaceDetectIR on Jetson Nano is 103 FPS while PeopleNet on Jetson Nano is 10 FPS. I accept that the input resolution and feature extraction are different, but is that really the whole difference?
It is reasonable due to the items below:
PeopleNet: 960x544 input, ResNet34, FP16 precision
FaceDetectIR: 384x240 input, ResNet18, INT8 precision
Jetson Nano doesn't support INT8.
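For what it's worth, a back-of-envelope check suggests these two factors alone roughly account for the gap. This is only a sketch: it assumes inference cost scales with input pixels, and the ~2x FLOPs factor for ResNet34 vs ResNet18 is an approximation, not a published number. Since the Nano runs both models in FP16, precision drops out of this particular comparison.

```python
# Rough sanity check of the 103 vs 10 FPS gap on Jetson Nano.
# Assumptions (not official numbers): cost scales with input pixels,
# and ResNet34 costs roughly 2x the FLOPs of ResNet18.

peoplenet_pixels = 960 * 544      # PeopleNet input resolution
facedetectir_pixels = 384 * 240   # FaceDetectIR input resolution

pixel_ratio = peoplenet_pixels / facedetectir_pixels   # ~5.7x more pixels
backbone_ratio = 2.0                                   # ResNet34 vs ResNet18 (rough)

predicted_gap = pixel_ratio * backbone_ratio           # ~11.3x
measured_gap = 103 / 10                                # 10.3x from the NGC figures

print(f"predicted ~{predicted_gap:.1f}x vs measured {measured_gap:.1f}x")
```

The predicted ~11x is close to the measured 10.3x, so resolution plus backbone depth plausibly explains most of the difference.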
I mean for all the results shown in the graph. The figure shows several edge devices.
I have some questions about DetectNet-v2:
1- Since both FaceDetectIR and PeopleNet use the DetectNet-v2 architecture, why doesn't PeopleNet use INT8 to get better performance? Is it possible to convert PeopleNet to INT8?
2- Since Jetson Nano doesn't support INT8, if we deploy an INT8 model like FaceDetectIR, is it still possible to run it on the Nano? If so, because the Nano doesn't support INT8, in my opinion this should have no effect on latency, right?
3- The figure shows FaceDetectIR achieving 103 FPS on Jetson Nano. Is it possible to get this FPS with FP16 on Jetson Nano?
- Sure, PeopleNet can run inference in INT8 precision, but the page https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplenet does not mention it. It only shows FP16 results for several edge devices.
Yes, you can use tlt-export to generate an INT8 TRT engine.
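As a rough sketch of that flow (file names and key are placeholders, and exact flags vary by TLT version, so check `tlt-export --help` for your release): export the model with INT8 calibration data to produce an `.etlt` file plus a calibration cache, then build the engine on the target with tlt-converter.

```shell
# Sketch only: paths, key, and some flags are assumptions; verify against your TLT version.
# 1) Export with INT8 calibration (produces model.etlt + calibration.bin):
tlt-export detectnet_v2 \
    -m resnet34_peoplenet.tlt \
    -k $NGC_API_KEY \
    -o peoplenet_int8.etlt \
    --data_type int8 \
    --cal_cache_file calibration.bin

# 2) On the deployment device, build the INT8 TensorRT engine:
tlt-converter peoplenet_int8.etlt \
    -k $NGC_API_KEY \
    -t int8 \
    -c calibration.bin \
    -e peoplenet_int8.engine
```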
- On the Nano, it cannot run inference with INT8 precision.
- Please note that 103 FPS is for FP16. See below:
"The inference performance is run using trtexec on Jetson Nano, AGX Xavier, Xavier NX and NVIDIA T4 GPU. On the Jetson Nano, FP16 inference is run."
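So yes, you can reproduce this kind of FP16 measurement yourself on the Nano. A hedged sketch (engine name is a placeholder; tlt-converter flags and the output-node names may differ by model and TLT version):

```shell
# Sketch only: build an FP16 engine from the downloaded .etlt on the Nano,
# then benchmark it with trtexec. Verify flags for your TensorRT/TLT versions.
tlt-converter facedetectir.etlt \
    -k $NGC_API_KEY \
    -t fp16 \
    -e facedetectir_fp16.engine

# Time the engine; trtexec reports latency and throughput (qps).
trtexec --loadEngine=facedetectir_fp16.engine --iterations=100
```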