Please provide complete information as applicable to your setup.
• Hardware Platform (GPU)
• DeepStream Version: 5.1
• TensorRT Version: 7.2
• NVIDIA GPU Driver Version (valid for GPU only): 460.91.03
• Issue Type (questions, new requirements, bugs): questions
I have a custom plugin whose input caps are video/x-raw(memory:NVMM), format: { (string)I420 }. This plugin sits after nvdsosd (nvdsosd requires input format video/x-raw(memory:NVMM), format: { (string)RGBA }). The input source is an appsrc whose output caps are video/x-raw, format NV12, which I convert with nvvideoconvert to video/x-raw(memory:NVMM), format NV12.
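For reference, here is a hedged gst-launch-1.0 sketch of the pipeline as described. The DeepStream element names and caps are standard; the resolution, config path, the videotestsrc stand-in for the appsrc, and the `mycustomplugin` name are placeholders for illustration only:

```shell
# Hypothetical sketch of the pipeline described above (DeepStream 5.x).
# videotestsrc stands in for the appsrc that emits video/x-raw,format=NV12;
# "mycustomplugin" stands in for the custom plugin that consumes I420.
gst-launch-1.0 \
  videotestsrc num-buffers=300 ! 'video/x-raw,format=NV12,width=1280,height=720' ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! \
  mux.sink_0 nvstreammux name=mux batch-size=1 width=1280 height=720 ! \
  nvinfer config-file-path=config_infer_primary.txt ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! \
  nvdsosd ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=I420' ! \
  mycustomplugin ! fakesink sync=false
```

Note the two format hops after nvinfer: NV12 → RGBA for nvdsosd, then RGBA → I420 for the custom plugin.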
The problem is that the FPS is really slow when nvinfer (the detector engine) is providing objects: about 10 FPS. Without nvdsosd in the pipeline, the FPS is about 100.
I have some questions; could someone on the NVIDIA team reply?
Does nvdsosd only do work when there are objects? When I keep nvdsosd but remove nvinfer from the pipeline, the FPS is also 100. In that case, does nvdsosd do nothing because there are no objects to draw?
I wonder whether the bottleneck happens because I run nvvideoconvert too many times, especially converting NV12 to RGBA and then RGBA to I420.
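One way to isolate the cost of the extra colour conversions is to benchmark them alone, without nvinfer or nvdsosd. This is a hypothetical A/B test; the source file, codec, and resolution are placeholders:

```shell
# A: NV12 passthrough only (baseline).
gst-launch-1.0 filesrc location=sample.h264 ! h264parse ! nvv4l2decoder ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! fakesink sync=false

# B: the same source put through the NV12 -> RGBA -> I420 conversion chain.
gst-launch-1.0 filesrc location=sample.h264 ! h264parse ! nvv4l2decoder ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! \
  nvvideoconvert ! 'video/x-raw(memory:NVMM),format=I420' ! fakesink sync=false
```

If B's "Execution ended after …" time is close to A's, the conversions themselves are not the bottleneck.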
Below are three pipelines measuring latency, with and without nvinfer (i.e., with and without objects for the OSD to draw).
The OSD does something that slows down the pipeline when I add it, especially when objects are present!
In your first case, the primary GIE takes around 250 ms, while in the third case it takes around 11 ms. Did you use the same batch size and the same stream? Is there any difference between the two besides the absence of nvdsosd in the third case?
Hi amycao. All three tests use the same app config file and the same engine configs. The only difference is that I enable/disable the OSD and the primary engine in the main config file via the enable=0/1 field. The test input source is also the same.
We cannot reproduce your issue. In our environment, the FPS with and without nvdsosd differs by only about 2-3. Can you provide the configuration you used and an extract of your app, so that we can run it in an NVIDIA environment to reproduce the issue?
I tried these gst-launch commands and checked the performance. The source is a file source reading from a video file, nothing more. With the OSD in the pipeline, with or without the I420 colour conversion (tests 0-2), execution still takes 28-29 s. Meanwhile, without the OSD, with or without my plugin (tests 3-4), it takes only 2 s. Do you have any idea about this?
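When comparing runs like these, the total "Execution ended after …" time that gst-launch-1.0 prints is already a usable metric; for per-element numbers, the stock GStreamer latency tracer can be enabled. The pipeline below is a placeholder, and the `flags=element` option of the tracer requires GStreamer >= 1.18 (plain `latency` works on older versions such as the one shipped with DeepStream 5.1):

```shell
# Per-element latency via the standard GStreamer latency tracer.
# Replace the placeholder pipeline with the real one under test.
GST_DEBUG="GST_TRACER:7" GST_TRACERS="latency" \
  gst-launch-1.0 filesrc location=sample.mp4 ! decodebin ! fakesink sync=false
```

Separately, when running the reference deepstream-app, DeepStream's own frame/component latency logging can be switched on with the NVDS_ENABLE_LATENCY_MEASUREMENT=1 and NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1 environment variables (check the docs for your DeepStream version).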
I ran it on my side, but it only takes around 2.47 s. I used a T4 card and boosted the GPU frequency to the maximum. Which GPU are you using, and did you boost the GPU frequency? In the nvinfer config file, did you use the built-in model or your own model? And what is your CPU model? This is ours: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 24 cores.
0:00:12.310309306 1012 0x564a95af3870 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1806> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
0:00:12.311520438 1012 0x564a95af3870 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/config_infer_primary.txt sucessfully
Pipeline is PREROLLING …
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock
Got EOS from element “pipeline0”.
Execution ended after 0:00:02.473655020
Setting pipeline to PAUSED …
Setting pipeline to READY …
Setting pipeline to NULL …
Freeing pipeline …
Thank you amycao! The model config I used is the default one in the samples folder. I use a 1080 Ti GPU. How can I boost the GPU frequency?
Could you run a test pipeline on the same 1080 Ti GPU? I want to check whether the problem is with my GPU card or whether nvdsosd just does not perform well on this card.
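For reference, GPU clocks can typically be queried and raised with nvidia-smi. Exact support varies by card and driver generation, and GeForce cards such as the 1080 Ti may not support every option (clock locking was originally limited to newer architectures), so treat this as a sketch and query your own card first:

```shell
# Enable persistence mode so clock settings stick (needs root).
sudo nvidia-smi -pm 1

# List the graphics/memory clock pairs the card actually supports.
nvidia-smi -q -d SUPPORTED_CLOCKS

# Lock the graphics clock to a range from the supported list
# (replace <min>,<max> with real values; not supported on all GPUs).
sudo nvidia-smi -lgc <min>,<max>

# Revert to default clock behaviour afterwards.
sudo nvidia-smi -rgc
```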
0:00:00.753143039 75 0x55e75f1ae240 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1908> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
0:00:00.755295270 75 0x55e75f1ae240 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus: [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/samples/configs/deepstream-app/config_infer_primary.txt sucessfully
Pipeline is PREROLLING …
Pipeline is PREROLLED …
Setting pipeline to PLAYING …
New clock: GstSystemClock
Got EOS from element “pipeline0”.
Execution ended after 0:00:02.059874230
Setting pipeline to PAUSED …
Setting pipeline to READY …
Setting pipeline to NULL …
Freeing pipeline …