Variation in FPS with GPU/DLA

I have built a face detection and recognition model. When enabling “FaceDetect” to run on the DLA, it used to give slightly lower FPS compared to running it on the GPU, and I justified this by the fact that not all layers run in INT8 on the DLA, unlike on the GPU.
However, after I modified the code as suggested here: Multi streaming resulted to accuracy drop, running the FaceDetect model on the DLA gives 2x the FPS compared to the GPU. I’m a bit confused why this happened in the multi-input code, just from modifying the nvstreammux height and width, although FaceDetect is still running in FP16, not INT8, on the DLA.
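For reference, the change in question is the muxer output resolution in the deepstream-app configuration (a sketch; the width/height are the values mentioned later in this thread, and the other keys are illustrative defaults, not taken from the actual config):

```ini
[streammux]
# Muxer output resolution; frames are scaled to this size before inference
width=1920
height=1080
batch-size=1
batched-push-timeout=40000
```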

Hi,

I suppose it changes the input size that feeds into the model.
With a smaller input resolution, it’s expected to be faster since there are fewer computations.

Thanks.

I have enlarged the resolution from 1020x920 to 1920x1080. But would this change make the DLA performance 3x what it was in the previous case, and let it exceed the GPU performance in processing the model?
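As a rough sanity check, assuming the per-frame workload scales with the muxed frame’s pixel count, going from 1020x920 to 1920x1080 is about 2.2x more pixels, so a resolution change alone would, if anything, increase the work per frame rather than explain a speedup:

```python
# Pixel-count ratio between the old and new nvstreammux resolutions
old_w, old_h = 1020, 920
new_w, new_h = 1920, 1080
ratio = (new_w * new_h) / (old_w * old_h)
print(f"{ratio:.2f}x")  # roughly 2.21x more pixels at 1920x1080
```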

Hi,

Is the FPS you are indicating here the pure TensorRT performance or the whole pipeline FPS?
Could you share the reproducible step with us so we can check it further?
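One way to isolate the pure TensorRT numbers from pipeline effects is to benchmark the model directly with trtexec on both devices (a sketch; `facedetect.onnx` is a placeholder for the actual FaceDetect model file):

```shell
# GPU baseline in FP16
trtexec --onnx=facedetect.onnx --fp16

# Same model on DLA core 0; layers the DLA cannot run fall back to the GPU
trtexec --onnx=facedetect.onnx --fp16 --useDLACore=0 --allowGPUFallback
```

Comparing the reported throughput from the two runs separates raw inference speed from nvstreammux/preprocessing effects in the pipeline.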

Thanks.

Hello, I have followed the below for FPS and plug-in measurement