How to improve the performance of dsexample


• Jetson TX2
• DeepStream 5.0
• Jetpack 4.4
• TensorRT 7
With dsexample disabled, my model runs at 100 fps;
after enabling dsexample, performance drops to 42 fps.
I have not modified gst-dsexample, and I also tried USE_OPTIMIZED_DSEXAMPLE, with the same result.
I need dsexample to send the detection results, but it is now the performance bottleneck in my application.

The dsexample plugin is a sample modeled on nvinfer, but with no inference implemented. Everything except network inference, such as format conversion, scaling, and blurring, is implemented, so it will not be fast with all of this processing.
Is it possible for you to use a pad probe in the application to send the detection results? What kind of results will you get and send?
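
For the bounding-box part, the data a pad probe has to forward is small. The following is only a rough sketch of turning one detection into a text payload. `BBox` and `format_bbox_json` are hypothetical names for illustration, not DeepStream types or functions; the fields simply mirror the left/top/width/height of an object rectangle, so the snippet compiles without the SDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an object's rectangle (pixels).
 * This is NOT a DeepStream type; it only mirrors the usual
 * left/top/width/height layout of a detection box. */
typedef struct {
    float left;
    float top;
    float width;
    float height;
} BBox;

/* Write one detection as a small JSON object into out.
 * Returns the number of characters written (excluding the
 * terminating NUL), or -1 if out is too small.
 * Note: label is copied as-is and is not JSON-escaped. */
int format_bbox_json(const BBox *box, int frame_num, const char *label,
                     char *out, size_t out_size)
{
    assert(box != NULL && label != NULL && out != NULL);

    int n = snprintf(out, out_size,
                     "{\"frame\":%d,\"label\":\"%s\",\"left\":%.0f,"
                     "\"top\":%.0f,\"width\":%.0f,\"height\":%.0f}",
                     frame_num, label,
                     box->left, box->top, box->width, box->height);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

In a real probe callback, the values would come from the object metadata attached to each buffer, and the resulting string would be handed to whatever transport the application uses. None of this touches the frame pixels, which is why it is cheap compared with the image path.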

I want to send the bounding box and the image inside the bounding box to a server.

After testing, I found that most of the time is spent in NvBufSurfaceMemSet and NvBufSurfaceMap, but I have to access the detected image. Do you know how to optimize this?

The code:
  NvBufSurfaceMemSet (dsexample->inter_buf, 0, 0, 0);
  // GST_DEBUG_OBJECT (dsexample, "Scaling and converting input buffer\n");
  /* Transformation scaling+format conversion if any. */
  err = NvBufSurfTransform (&ip_surf, dsexample->inter_buf, &transform_params);
  if (err != NvBufSurfTransformError_Success) {
    GST_ELEMENT_ERROR (dsexample, STREAM, FAILED,
        ("NvBufSurfTransform failed with error %d while converting buffer", err),
        (NULL));
    goto error;
  }
  /* Map the buffer so that it can be accessed by CPU */
  if (NvBufSurfaceMap (dsexample->inter_buf, 0, 0, NVBUF_MAP_READ) != 0) {
    goto error;
  }

If the transformation is necessary for you, it cannot be improved at the application level.

Our deepstream-test4 sample shows how to send bbox information to a cloud server. Will this mechanism work for your case?
The deepstream-image-meta-test sample shows how to get object images and encode them as JPEG files, so you can combine these two samples to send both bbox information and object images to a cloud server.
Image cropping and format conversion need far more resources and time than text and numbers. In particular, the memory in an NvBufSurface is special memory intended for the GPU rather than the CPU, so it takes extra time for the CPU to access it.
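
To illustrate why the image path costs more than the metadata path, here is a minimal sketch of copying only the bounding-box region out of a frame that is already CPU-accessible. It assumes a packed RGBA, pitch-linear layout in which each row may be padded; `crop_rgba` and `BYTES_PER_PIXEL` are hypothetical names, and the sketch deliberately uses a plain byte pointer rather than any DeepStream structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_PIXEL 4  /* assumption: packed RGBA, 8 bits per channel */

/* Copy a rectangle out of a pitch-linear RGBA frame into a tightly
 * packed destination buffer.
 *   src        first byte of the frame
 *   src_pitch  bytes per source row (may exceed src_w * 4 due to padding)
 *   dst        must hold at least w * h * 4 bytes
 * The rectangle is clamped to the frame; the clamped size is returned
 * through out_w / out_h. Returns the number of bytes copied. */
size_t crop_rgba(const uint8_t *src, int src_w, int src_h, size_t src_pitch,
                 int left, int top, int w, int h,
                 uint8_t *dst, int *out_w, int *out_h)
{
    assert(src != NULL && dst != NULL && out_w != NULL && out_h != NULL);

    /* Clamp the rectangle so we never read outside the frame. */
    if (left < 0) { w += left; left = 0; }
    if (top < 0)  { h += top;  top = 0; }
    if (left + w > src_w) w = src_w - left;
    if (top + h > src_h)  h = src_h - top;
    if (w <= 0 || h <= 0) {
        *out_w = 0;
        *out_h = 0;
        return 0;
    }

    size_t row_bytes = (size_t)w * BYTES_PER_PIXEL;
    for (int y = 0; y < h; y++) {
        /* Source rows are src_pitch apart; destination rows are packed. */
        const uint8_t *row = src + (size_t)(top + y) * src_pitch
                                 + (size_t)left * BYTES_PER_PIXEL;
        memcpy(dst + (size_t)y * row_bytes, row, row_bytes);
    }

    *out_w = w;
    *out_h = h;
    return row_bytes * (size_t)h;
}
```

Copying only the rows inside the box keeps the CPU work proportional to the object size rather than the frame size. It does not remove the cost of making the surface CPU-accessible in the first place, which is the part the reply above says cannot be avoided.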