Environment
TensorRT Version: 7.2.3
GPU Type: RTX 3090 24 GB
Nvidia Driver Version: 460.73.01
CUDA Version: 11.1
CUDNN Version: 8.1.1
Operating System + Version: Ubuntu 18.04
Description
I have two yolov4-csp models. The first is the official model from the darknet project, trained on 80 classes. The second is one I trained myself on 1 class (person only). I successfully built FP16 engines with batch=8 from both of them.
I tested both models on the following videos (partial ffprobe output):
Video 1:
Video: h264 (High), yuv420p, 1920x1080, 30 fps, 30 tbr, 1k tbn, 60 tbc (default)
Video 2:
Video: h264 (Main), yuvj420p(pc, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 30.30 fps, 29.97 tbr, 1k tbn, 59.94 tbc (default)
Video 3:
Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 2499 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
The detections from my model (trained on 1 class) are very good for all videos. My issue is with my model's performance on video3: the FPS is very unstable, swinging from about 40 FPS to 85 FPS, down to 33 FPS, and back up to 84 FPS:
**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg) FPS 4 (Avg) FPS 5 (Avg) FPS 6 (Avg) FPS 7 (Avg)
**PERF: 43.29 (43.20) 43.29 (43.20) 43.29 (43.20) 43.29 (43.20) 43.92 (43.82) 43.29 (43.20) 43.29 (43.20) 43.29 (43.20)
**PERF: 38.37 (40.04) 38.37 (40.04) 38.37 (40.04) 38.37 (40.04) 38.37 (40.20) 38.37 (40.04) 38.37 (40.04) 38.37 (40.04)
**PERF: 37.21 (38.91) 37.21 (38.91) 37.21 (38.91) 37.21 (38.91) 37.21 (39.00) 37.21 (38.91) 37.21 (38.91) 37.21 (38.91)
**PERF: 55.03 (43.48) 55.03 (43.48) 55.03 (43.48) 55.03 (43.48) 55.03 (43.57) 55.03 (43.48) 55.03 (43.48) 55.03 (43.48)
**PERF: 84.25 (52.49) 84.25 (52.49) 84.25 (52.49) 84.25 (52.49) 84.25 (52.60) 84.25 (52.49) 84.25 (52.49) 84.25 (52.49)
**PERF: 85.92 (58.56) 85.92 (58.56) 85.92 (58.56) 85.92 (58.56) 85.92 (58.67) 85.92 (58.56) 85.92 (58.56) 85.92 (58.56)
**PERF: 59.41 (58.69) 59.41 (58.69) 59.41 (58.69) 59.41 (58.69) 59.41 (58.79) 59.41 (58.69) 59.41 (58.69) 59.41 (58.69)
**PERF: 45.51 (56.92) 45.51 (56.92) 45.51 (56.92) 45.51 (56.92) 45.51 (57.00) 45.51 (56.92) 45.51 (56.92) 45.51 (56.92)
**PERF: 39.51 (54.89) 39.51 (54.89) 39.51 (54.89) 39.51 (54.89) 39.51 (54.95) 39.51 (54.89) 39.51 (54.89) 39.51 (54.89)
**PERF: 33.49 (52.63) 33.49 (52.63) 33.49 (52.63) 33.49 (52.63) 33.49 (52.68) 33.49 (52.63) 33.49 (52.63) 33.49 (52.63)
**PERF: 35.73 (51.03) 35.73 (51.03) 35.73 (51.03) 35.73 (51.03) 35.73 (51.07) 35.73 (51.03) 35.73 (51.03) 35.73 (51.03)
**PERF: 48.86 (50.85) 48.86 (50.85) 48.86 (50.85) 48.86 (50.85) 48.86 (50.89) 48.86 (50.85) 48.86 (50.85) 48.86 (50.85)
**PERF: 84.23 (53.52) 84.23 (53.52) 84.23 (53.52) 84.23 (53.52) 84.23 (53.56) 84.23 (53.52) 84.23 (53.52) 84.23 (53.52)
**PERF: 83.24 (55.71) 83.24 (55.71) 83.24 (55.71) 83.24 (55.71) 83.24 (55.75) 83.24 (55.71) 83.24 (55.71) 83.24 (55.71)
**PERF: 70.73 (56.75) 70.73 (56.75) 70.73 (56.75) 70.73 (56.75) 70.73 (56.79) 70.73 (56.75) 70.73 (56.75) 70.73 (56.75)
**PERF: 51.72 (56.42) 51.72 (56.42) 51.72 (56.42) 51.72 (56.42) 51.72 (56.46) 51.72 (56.42) 51.72 (56.42) 51.72 (56.42)
**PERF: 67.37 (57.08) 67.37 (57.08) 67.37 (57.08) 67.37 (57.08) 67.37 (57.12) 67.37 (57.08) 67.37 (57.08) 67.37 (57.08)
**PERF: 58.60 (57.17) 58.60 (57.17) 58.60 (57.17) 58.60 (57.17) 58.60 (57.20) 58.60 (57.17) 58.60 (57.17) 58.60 (57.17)
**PERF: 58.07 (57.22) 58.07 (57.22) 58.07 (57.22) 58.07 (57.22) 58.07 (57.26) 58.07 (57.22) 58.07 (57.22) 58.07 (57.22)
**PERF: 79.39 (58.36) 79.39 (58.36) 79.39 (58.36) 79.39 (58.36) 79.39 (58.39) 79.39 (58.36) 79.39 (58.36) 79.39 (58.36)
For video1 and video2, the FPS is much more stable. This is the output of my model for video1; the output for video2 is very similar:
**PERF: FPS 0 (Avg) FPS 1 (Avg) FPS 2 (Avg) FPS 3 (Avg) FPS 4 (Avg) FPS 5 (Avg) FPS 6 (Avg) FPS 7 (Avg)
**PERF: 78.59 (78.07) 79.26 (78.72) 78.59 (78.07) 80.45 (79.90) 78.59 (78.07) 79.01 (78.49) 78.21 (77.70) 78.21 (77.70)
**PERF: 81.93 (80.77) 81.93 (81.02) 81.93 (80.77) 81.93 (81.42) 81.93 (80.77) 81.93 (80.92) 81.93 (80.64) 81.93 (80.64)
**PERF: 82.30 (81.33) 82.30 (81.49) 82.30 (81.33) 82.30 (81.73) 82.30 (81.33) 82.30 (81.42) 82.30 (81.25) 82.30 (81.25)
**PERF: 81.11 (81.29) 81.11 (81.41) 81.11 (81.29) 81.11 (81.58) 81.11 (81.29) 81.11 (81.36) 81.11 (81.24) 81.11 (81.24)
**PERF: 81.38 (81.27) 81.38 (81.36) 81.38 (81.27) 81.38 (81.49) 81.38 (81.27) 81.38 (81.32) 81.38 (81.22) 81.38 (81.22)
**PERF: 78.73 (80.82) 78.73 (80.89) 78.73 (80.82) 78.73 (81.00) 78.73 (80.82) 78.73 (80.86) 78.73 (80.79) 78.73 (80.79)
**PERF: 81.35 (80.91) 81.35 (80.97) 81.35 (80.91) 81.35 (81.06) 81.35 (80.91) 81.35 (80.95) 81.35 (80.88) 81.35 (80.88)
**PERF: 77.70 (80.50) 77.70 (80.55) 77.70 (80.50) 77.70 (80.63) 77.70 (80.50) 77.70 (80.53) 77.70 (80.47) 77.70 (80.47)
**PERF: 80.28 (80.46) 80.28 (80.50) 80.28 (80.46) 80.28 (80.58) 80.28 (80.46) 80.28 (80.49) 80.28 (80.44) 80.28 (80.44)
**PERF: 78.61 (80.26) 78.21 (80.26) 78.61 (80.26) 78.61 (80.37) 78.61 (80.26) 78.61 (80.29) 78.61 (80.24) 78.61 (80.24)
**PERF: 78.71 (80.12) 78.71 (80.12) 78.71 (80.12) 78.71 (80.22) 78.71 (80.12) 78.71 (80.15) 78.71 (80.11) 78.71 (80.11)
**PERF: 76.29 (79.78) 76.49 (79.80) 76.49 (79.80) 76.49 (79.88) 76.49 (79.80) 76.49 (79.82) 76.49 (79.78) 76.49 (79.78)
**PERF: 79.85 (79.80) 79.85 (79.81) 79.65 (79.80) 79.85 (79.89) 79.85 (79.82) 79.85 (79.83) 79.85 (79.80) 79.85 (79.80)
**PERF: 77.77 (79.64) 77.77 (79.65) 77.77 (79.64) 77.77 (79.72) 77.77 (79.65) 77.77 (79.67) 77.77 (79.64) 77.77 (79.64)
**PERF: 77.42 (79.50) 77.42 (79.51) 77.42 (79.50) 77.42 (79.58) 77.42 (79.51) 77.42 (79.52) 77.42 (79.50) 77.42 (79.50)
**PERF: 76.37 (79.28) 76.37 (79.29) 76.37 (79.28) 76.37 (79.36) 76.17 (79.28) 76.37 (79.31) 76.17 (79.27) 76.37 (79.28)
**PERF: 77.61 (79.18) 77.41 (79.18) 77.61 (79.18) 77.61 (79.25) 77.61 (79.18) 77.61 (79.21) 77.61 (79.17) 77.41 (79.17)
**PERF: 76.94 (79.06) 77.14 (79.06) 77.14 (79.07) 77.14 (79.13) 77.14 (79.07) 77.14 (79.09) 77.14 (79.06) 77.14 (79.06)
**PERF: 80.58 (79.14) 80.58 (79.15) 80.58 (79.15) 80.58 (79.21) 80.58 (79.15) 80.58 (79.17) 80.58 (79.14) 80.58 (79.14)
**PERF: 80.15 (79.19) 80.15 (79.20) 80.15 (79.20) 80.15 (79.26) 80.15 (79.20) 79.95 (79.21) 80.15 (79.19) 80.15 (79.19)
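One way to put numbers on the difference between the two logs above is to parse the `**PERF:` lines and compare the spread of the per-interval FPS. The following is only a minimal sketch, assuming the console output has been saved as text; the function names `parse_perf_line` and `summarize` are hypothetical helpers and not part of DeepStream.

```python
import re
import statistics

# Matches one "current (average)" pair, e.g. "43.29 (43.20)".
PAIR = re.compile(r"(\d+(?:\.\d+)?)\s+\((\d+(?:\.\d+)?)\)")


def parse_perf_line(line):
    """Return the per-stream current-FPS values from one **PERF line.

    The header line ("FPS 0 (Avg) ...") contains no numeric pairs,
    so it yields an empty list.
    """
    if not line.startswith("**PERF:"):
        return []
    return [float(cur) for cur, _avg in PAIR.findall(line)]


def summarize(log_text, stream=0):
    """Summarize one stream's per-interval FPS: min, max, mean, sample stdev."""
    samples = []
    for line in log_text.splitlines():
        values = parse_perf_line(line.strip())
        if len(values) > stream:
            samples.append(values[stream])
    if not samples:
        raise ValueError("no **PERF samples found")
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Three lines copied from the video3 log, truncated to two streams.
    sample_log = """\
**PERF: FPS 0 (Avg) FPS 1 (Avg)
**PERF: 43.29 (43.20) 43.29 (43.20)
**PERF: 84.25 (52.49) 84.25 (52.49)
**PERF: 33.49 (52.63) 33.49 (52.63)
"""
    print(summarize(sample_log))
```

Running the same summary over the video1 and video3 logs should make the gap in standard deviation explicit, which is easier to compare across runs than eyeballing the raw lines.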
However, when I run the official darknet model (trained on 80 classes), the performance is stable for all three videos, which is very strange.
Do you have any idea why my model is unstable for video3? The main difference between video3 and the other two videos is the resolution: video3 is only 1280x720.
deepstream_app_config:
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
[source0]
enable=1
type=3
uri=file:///home/video3.mp4
num-sources=8
gpu-id=0
cudadec-memtype=0
[sink0]
enable=1
#1: Fakesink 2: EGL based windowed sink (nveglglessink)
type=1
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
[streammux]
gpu-id=0
live-source=0
batch-size=8
batched-push-timeout=33333
#video1 and video2
#width=1920
#height=1080
#video3
width=1280
height=720
enable-padding=0
nvbuf-memory-type=0
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=/opt/nvidia/deepstream/deepstream-5.1/sources/yolov4-csp/config_infer_primary.txt
[tests]
file-loop=0
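Since the streammux width and height are switched by hand between runs, it can help to print the settings that were actually in effect for a given test. The sketch below is illustrative only: it assumes the config above parses as a plain INI-style file with Python's standard `configparser`, and `check_batch_config` is a hypothetical helper name. It reports the values and whether the streammux batch size equals the number of sources; it does not diagnose the FPS issue.

```python
import configparser


def check_batch_config(config_text):
    """Report source count, streammux batch size, and streammux resolution."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)  # lines starting with '#' are comments

    num_sources = parser.getint("source0", "num-sources")
    mux_batch = parser.getint("streammux", "batch-size")
    mux_size = (
        parser.getint("streammux", "width"),
        parser.getint("streammux", "height"),
    )
    return {
        "num_sources": num_sources,
        "mux_batch_size": mux_batch,
        "mux_resolution": mux_size,
        "batch_matches_sources": num_sources == mux_batch,
    }


if __name__ == "__main__":
    # Fragment of the config above, reduced to the keys that are read.
    fragment = """\
[source0]
enable=1
type=3
num-sources=8

[streammux]
batch-size=8
#video3
width=1280
height=720
"""
    print(check_batch_config(fragment))
```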