DLA FP16 FPS drop in JetPack 4.3 DP

Hello there,

I generated FPS numbers for MobilenetV1 using JetPack 4.2.0 earlier.
Configuration:
MAXN mode
jetson_clocks enabled
trtexec used for measurement
DLA0
FP16
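
For reference, an invocation along these lines should reproduce the setup above (the file name, input dimensions, and output-node name below are illustrative placeholders, not copied from my run):

```shell
# Benchmark MobilenetV1 on DLA core 0 in FP16 via trtexec
# (illustrative paths/names; layers unsupported by the DLA fall back to the GPU)
./trtexec --uff=mobilenet_v1.uff \
          --uffInput=input,3,224,224 \
          --output=MobilenetV1/Predictions/Reshape_1 \
          --fp16 --useDLACore=0 --allowGPUFallback \
          --batch=1 --avgRuns=100
```

Repeating the run with --batch=2/4/8/16 gives the per-batch numbers.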

The FPS numbers were:
batch 1: 348 FPS
batch 2: 442 FPS
batch 4: 509 FPS
batch 8: 550 FPS
batch 16: 574 FPS

Now, after moving to JetPack 4.3 DP with the same configuration, I found that DLA FPS dropped significantly, particularly at higher batch sizes. With JP4.3 DP I got these numbers:

batch 1: 297 FPS
batch 2: 292 FPS
batch 4: 310 FPS
batch 8: 320 FPS
batch 16: 325 FPS
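
For clarity, the FPS values here follow from trtexec's reported average per-batch latency as batch/latency; a minimal sketch (the ~49.2 ms figure is back-computed from the batch-16 result above, not taken from a log):

```python
def fps_from_latency(batch_size, avg_latency_ms):
    """Throughput in frames/s given the average per-batch latency in milliseconds."""
    return batch_size * 1000.0 / avg_latency_ms

# A batch-16 run averaging ~49.2 ms per batch corresponds to the ~325 FPS above.
print(round(fps_from_latency(16, 49.2)))
```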

I am 100% sure MAXN mode and jetson_clocks were applied.
Since the latest JP4.3 provides almost a 2x improvement in GPU performance with TensorRT, I expected something similar on the DLAs as well.

May I know why this drop is observed?
Has anybody seen similar behavior?

Thanks in advance.

Hi,

We want to reproduce this issue in our environment.
Could you share the detailed steps to reproduce it with us?

1. Where can we find the MobilenetV1 model you used?
We assume it is TensorFlow-based; is that correct?
If so, could you share the UFF file with us?

2. Did you measure the FPS with trtexec?
If not, could you share your measurement sample with us?

Thanks.

Hello AastaLLL,
Here are my details:

  1. I used MobilenetV1 from the official TensorFlow repo. The converted UFF file is attached.
  2. I used trtexec for the FPS measurement.

Precise logs for batch size 1 are attached.
The difference in latency is clearly visible.
Please find the files here:

https://drive.google.com/drive/folders/1qf8j3FhImDWbb1bRVpV5nSU14kn73kIp?usp=sharing

Hi,

Thanks for your reply.

We are going to reproduce this issue in our environment.
Will update here once we find further information.

Thanks.

Hi,

Thanks for your patience.

This is not a bug; it is caused by expanded layer support on the DLA.
In our latest release, we have enabled more layers to run on the DLA, which offloads DL compute from the GPU. As a result, the network now runs almost entirely on the DLA instead of being partially executed by the faster GPU, so the measured FPS is lower.

In JetPack 4.3, only three layers fall back to the GPU:

[11/19/2019-15:47:49] [I] [TRT] (Unnamed Layer* 222) [Shuffle] + MobilenetV1/Predictions/Reshape, 
MobilenetV1/Predictions/Softmax, 
(Unnamed Layer* 225) [Shuffle] + MobilenetV1/Predictions/Reshape_1,

However, a large number of layers run on the GPU in JetPack 4.2:

MobilenetV1/Conv2d_0/weights, 
(Unnamed Layer* 1) [Padding], 
MobilenetV1/Conv2d_0/BatchNorm/gamma,
MobilenetV1/Conv2d_0/BatchNorm/beta,
MobilenetV1/Conv2d_0/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_0/BatchNorm/moving_variance, 
(Unnamed Layer* 9) [Constant], 
MobilenetV1/Conv2d_1_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_1_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_1_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_1_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_1_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 19) [Constant], 
MobilenetV1/Conv2d_1_pointwise/weights, 
MobilenetV1/Conv2d_1_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_1_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_1_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_1_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 29) [Constant], 
MobilenetV1/Conv2d_2_depthwise/depthwise_weights, 
(Unnamed Layer* 32) [Padding], 
MobilenetV1/Conv2d_2_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_2_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_2_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_2_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 40) [Constant], 
MobilenetV1/Conv2d_2_pointwise/weights, 
MobilenetV1/Conv2d_2_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_2_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_2_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_2_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 50) [Constant], 
MobilenetV1/Conv2d_3_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_3_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_3_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_3_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_3_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 60) [Constant], 
MobilenetV1/Conv2d_3_pointwise/weights, 
MobilenetV1/Conv2d_3_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_3_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_3_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_3_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 70) [Constant], 
MobilenetV1/Conv2d_4_depthwise/depthwise_weights, 
(Unnamed Layer* 73) [Padding], 
MobilenetV1/Conv2d_4_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_4_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_4_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_4_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 81) [Constant], 
MobilenetV1/Conv2d_4_pointwise/weights, 
MobilenetV1/Conv2d_4_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_4_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_4_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_4_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 91) [Constant], 
MobilenetV1/Conv2d_5_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_5_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_5_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_5_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_5_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 101) [Constant], 
MobilenetV1/Conv2d_5_pointwise/weights, 
MobilenetV1/Conv2d_5_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_5_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_5_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_5_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 111) [Constant], 
MobilenetV1/Conv2d_6_depthwise/depthwise_weights, 
(Unnamed Layer* 114) [Padding], 
MobilenetV1/Conv2d_6_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_6_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_6_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_6_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 122) [Constant], 
MobilenetV1/Conv2d_6_pointwise/weights, 
MobilenetV1/Conv2d_6_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_6_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_6_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_6_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 132) [Constant], 
MobilenetV1/Conv2d_7_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_7_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_7_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_7_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_7_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 142) [Constant], 
MobilenetV1/Conv2d_7_pointwise/weights, 
MobilenetV1/Conv2d_7_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_7_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_7_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_7_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 152) [Constant], 
MobilenetV1/Conv2d_8_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_8_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_8_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_8_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_8_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 162) [Constant], 
MobilenetV1/Conv2d_8_pointwise/weights, 
MobilenetV1/Conv2d_8_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_8_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_8_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_8_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 172) [Constant], 
MobilenetV1/Conv2d_9_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_9_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_9_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_9_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_9_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 182) [Constant], 
MobilenetV1/Conv2d_9_pointwise/weights, 
MobilenetV1/Conv2d_9_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_9_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_9_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_9_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 192) [Constant], 
MobilenetV1/Conv2d_10_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_10_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_10_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_10_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_10_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 202) [Constant], 
MobilenetV1/Conv2d_10_pointwise/weights, 
MobilenetV1/Conv2d_10_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_10_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_10_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_10_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 212) [Constant], 
MobilenetV1/Conv2d_11_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_11_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_11_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_11_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_11_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 222) [Constant], 
MobilenetV1/Conv2d_11_pointwise/weights, 
MobilenetV1/Conv2d_11_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_11_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_11_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_11_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 232) [Constant], 
MobilenetV1/Conv2d_12_depthwise/depthwise_weights, 
(Unnamed Layer* 235) [Padding], 
MobilenetV1/Conv2d_12_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_12_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_12_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_12_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 243) [Constant], 
MobilenetV1/Conv2d_12_pointwise/weights,
MobilenetV1/Conv2d_12_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_12_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_12_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_12_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 253) [Constant], 
MobilenetV1/Conv2d_13_depthwise/depthwise_weights, 
MobilenetV1/Conv2d_13_depthwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_13_depthwise/BatchNorm/beta, 
MobilenetV1/Conv2d_13_depthwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_13_depthwise/BatchNorm/moving_variance, 
(Unnamed Layer* 263) [Constant], 
MobilenetV1/Conv2d_13_pointwise/weights, 
MobilenetV1/Conv2d_13_pointwise/BatchNorm/gamma, 
MobilenetV1/Conv2d_13_pointwise/BatchNorm/beta, 
MobilenetV1/Conv2d_13_pointwise/BatchNorm/moving_mean, 
MobilenetV1/Conv2d_13_pointwise/BatchNorm/moving_variance, 
(Unnamed Layer* 273) [Constant], 
MobilenetV1/Logits/Conv2d_1c_1x1/weights, 
MobilenetV1/Logits/Conv2d_1c_1x1/biases, 
MobilenetV1/Predictions/Reshape/shape, 
(Unnamed Layer* 281) [Shuffle], 
MobilenetV1/Predictions/Reshape, 
MobilenetV1/Predictions/Softmax, 
(Unnamed Layer* 284) [Shuffle], 
MobilenetV1/Predictions/Reshape_1,
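
For anyone who wants to check this placement on their own build, trtexec prints the per-layer device assignment during engine building when GPU fallback and verbose logging are enabled; a command sketch (the file and node names are illustrative placeholders):

```shell
# List which layers run on the DLA and which fall back to the GPU
# (illustrative paths/names)
./trtexec --uff=mobilenet_v1.uff \
          --uffInput=input,3,224,224 \
          --output=MobilenetV1/Predictions/Reshape_1 \
          --fp16 --useDLACore=0 --allowGPUFallback --verbose
```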

Thanks.

Thanks a ton, AastaLLL!
From now on, I will keep an eye on layer placement as well.