DeepStream nvinfer input tensor contains incorrect image

sergyashin · July 12, 2022, 1:35pm

Hi
We’ve found that the DeepStream nvinfer input tensor sometimes contains an incorrect image when the GIE config option maintain-aspect-ratio=1 is used.

It looks like the input tensor for TRT inference is formed non-atomically, in two operations:

1). set the buffer to 0 (make a black canvas)
2). copy the input image crop over it

Occasionally a spurious crop from another object in the same or a previous frame gets copied over before the crop of the correct object is copied.
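
The two steps above can be sketched in plain Python. This is only an illustration of letterboxed composition, not nvinfer code: the helper names are made up, nearest-neighbour scaling is used for brevity, and placing the crop at the top-left corner of the canvas is an assumption.

```python
def letterbox_size(src_w, src_h, dst_w, dst_h):
    """Largest size that fits into dst while keeping the source aspect ratio."""
    if dst_w * src_h <= dst_h * src_w:
        # Width is the limiting side.
        return dst_w, max(1, src_h * dst_w // src_w)
    # Height is the limiting side.
    return max(1, src_w * dst_h // src_h), dst_h


def compose_input(crop, dst_w, dst_h):
    """crop is a list of rows of pixel values; returns a dst_h x dst_w canvas."""
    src_h, src_w = len(crop), len(crop[0])
    new_w, new_h = letterbox_size(src_w, src_h, dst_w, dst_h)
    # Step 1: set the buffer to 0 (black canvas).
    canvas = [[0] * dst_w for _ in range(dst_h)]
    # Step 2: copy the scaled crop over it (assumed top-left anchored).
    for y in range(new_h):
        sy = y * src_h // new_h
        for x in range(new_w):
            sx = x * src_w // new_w
            canvas[y][x] = crop[sy][sx]
    return canvas
```

With a 4x2 crop and an 8x8 network input, the crop fills the top 8x4 region and the remaining rows stay black. Any non-zero pixel outside that region would be the kind of spurious data described here.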

In effect the input tensor looks like:
gstnvdsinfer_uid-02_layer-Input_batch-0000000010_batchsize-08.bin_8

The correct input tensor should look like:
gstnvdsinfer_uid-02_layer-Input_batch-0000000013_batchsize-08.bin_8

The inference pipeline is created by gst_parse_launch():

appsrc name=ds_appsrc caps=video/x-raw,format=(string)BGR,width=(int)1920,height=(int)1080,framerate=(fraction)5/1 !
queue !
videoconvert ! video/x-raw,format=GRAY8 ! 
nvvideoconvert ! video/x-raw(memory:NVMM),format=NV12,colorimetry=bt601 !
m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 !
nvinfer name=nvinfer_cd config-file-path=/opt/models/cd/pgie-1.txt ! 
nvinfer name=nvinfer_lpd config-file-path=/opt/models/lpd/sgie-2_lpd.txt raw-output-file-write=1 ! 
fakesink sync=false

The input GstBuffers pushed to appsrc are created by gst_buffer_new_allocate().

I’ve attached nvinfer configs:
pgie-1.txt (4.0 KB)
sgie-2_lpd.txt (3.6 KB)

$ jetson_release

NVIDIA Jetson TX2
- Jetpack UNKNOWN [L4T 32.5.1]
- NV Power Mode: MAXP_CORE_ARM - Type: 3
- jetson_stats.service: active
Libraries:
- CUDA: 10.2.89
- cuDNN: 8.0.0.180
- TensorRT: 7.1.3.0
- Visionworks: 1.6.0.501
- OpenCV: 4.1.1 compiled CUDA: NO
- VPI: ii libnvvpi1 1.0.15 arm64 NVIDIA Vision Programming Interface library
- Vulkan: 1.2.70
  DeepStream 5.1

fanzh · July 14, 2022, 9:07am

Which DeepStream sample are you testing? Could you provide simple code to reproduce this issue?

sergyashin · July 14, 2022, 9:46am

The Python script is based on deepstream_test_1.py from the master branch of NVIDIA-AI-IOT/deepstream_python_apps on GitHub:

deepstream_test_1_bug.py (10.3 KB)

The problem can also be reproduced from the command line:

gst-launch-1.0 -e -v multifilesrc location=frame_%05d.jpg \
start-index=0 stop-index=-1 caps=image/jpeg,framerate=\(fraction\)25/2 \
! nvjpegdec \
! nvvideoconvert \
! capsfilter video/x-raw\(memory:NVMM\),format=RGBA \
! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 \
! nvinfer config-file-path=pgie-1.txt \
! nvinfer config-file-path=sgie-2_lpd.txt raw-output-file-write=1 \
! nvdsosd ! nvvidconv ! nvjpegenc ! multifilesink location=out_dir/frame_%05d.jpg

Sample image:

Models are based on tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5

sergyashin · July 18, 2022, 2:33pm

Hi
Should I provide additional info that could speed up diagnostics?
Regards.

fanzh · July 18, 2022, 2:53pm

Sorry for the late response. Some questions:

1. What do your models do? Do you mean the tensor the SGIE receives is wrong? nvinfer is open source, so you can add logs to narrow down this issue.
2. About reproducing the problem from the command line: I don’t have the two models. Could you provide the complete, simple code needed to reproduce this issue?

sergyashin · July 18, 2022, 3:55pm

The models are RetinaNet models trained with TAO as a Car Detector and a License Plate Detector, following the article. I believe the problem isn’t with the models.
I’ve dumped input tensors of License Plate Detector with raw-output-file-write=1 and converted them to images. Some dumps contain spurious data from other parts of the input frame. Please look at the two images I attached in the first message: the first image is incorrectly composed of two crops. It looks like a multi-thread access/locking issue:
1). the buffer is cleared because of maintain-aspect-ratio=1
2). the crop of the detected car is made according to one of the NvDsObjectMeta, scaled and copied to the buffer.
Sometimes operation 2) appears to be repeated with some other NvDsObjectMeta for the same input tensor.
This causes spurious detections from the LPD.
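
A conversion of this kind can be sketched with the Python standard library alone. This is a minimal sketch, not the script actually used: it assumes the dump holds planar float32 data in CHW order with 3 channels, that the values were scaled into the 0..1 range, and that the caller knows the network input width and height. The function name and the `index` and `scale` parameters are hypothetical, and the channel order (RGB or BGR) depends on the model configuration.

```python
import array


def chw_float_to_ppm(raw, width, height, index=0, scale=255.0):
    """Convert one planar float32 CHW image from a raw dump to binary PPM bytes.

    raw    -- bytes of the dump file (assumed float32, native byte order)
    index  -- which image of the batch to extract (assumed layout: N, C, H, W)
    scale  -- factor mapping stored values back to 0..255 (an assumption)
    """
    values = array.array("f")
    values.frombytes(raw)
    plane = width * height
    base = index * 3 * plane
    if len(values) < base + 3 * plane:
        raise ValueError("dump is smaller than the assumed shape")
    pixels = bytearray()
    for i in range(plane):
        for c in range(3):
            # Interleave the three planes and clamp to the 8-bit range.
            v = values[base + c * plane + i] * scale
            pixels.append(max(0, min(255, int(round(v)))))
    return b"P6\n%d %d\n255\n" % (width, height) + bytes(pixels)
```

The returned bytes can be written to a `.ppm` file opened in `"wb"` mode and inspected in any image viewer, which makes stale regions in the padding area easy to spot.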

Could it be a bug in DeepStream 5.1 that has already been fixed?
Could it be an incompatibility between nvvideoconvert/nvstreammux/nvinfer/nvjpegdec?
Could it be a hardware issue with the VIC? Maybe with NvBufSurfTransform? This is L4T 32.5.1.

Thank you.

fanzh · July 20, 2022, 3:16pm

After testing NVIDIA-AI-IOT/deepstream_lpr_app (sample app code for LPR deployment on DeepStream, on GitHub) on a Jetson Xavier with DeepStream 6.1, I can’t reproduce that issue. The tensor is correctly filled with a black border when maintain-aspect-ratio=1. Please try the new version.
You can use DeepStream SDK FAQ - #9 by mchi to dump the input tensor.

sergyashin · July 20, 2022, 4:29pm

Hi
Unfortunately we are restricted by product system requirements to use Jetson TX2, L4T 32.5.1, DeepStream 5.1.
I know that DeepStream 6.1 uses NvBufSurfTransformAsync() to compose the input image for the secondary nvinfer, instead of the synchronous NvBufSurfTransform() used in DeepStream 5.1.
Could you ask your colleagues whether this was a known issue in DeepStream 5.1?

Thank you.

fanzh · July 21, 2022, 3:21am

did not find the same issue.

sergyashin · July 21, 2022, 9:10pm

Hi
Could you please clarify?
The function get_converted_buffer() in sources/gst-plugins/gst-nvinfer/gstnvinfer.cpp
calls cudaMemset2DAsync() to clear the buffer when maintain-aspect-ratio=1.
These calls do not seem to be synchronized.
Then NvBufSurfTransform() is called to scale/copy the image of the detected object into the input tensor.
My question is: when are these operations synchronized?

Thank you.

fanzh · July 25, 2022, 10:16am

cudaMemset2DAsync is used to accelerate processing; there is no need to sync here.

sergyashin · July 25, 2022, 3:56pm

Hi
Are you sure there is no need to sync?
From the CUDA documentation:
cudaMemset2DAsync() is asynchronous with respect to the host, so the call may return before the memset is complete. The operation can optionally be associated to a stream by passing a non-zero stream argument. If stream is non-zero, the operation may overlap with operations in other streams.
The following operation, NvBufSurfTransform(), is performed on the VIC by default.
I ask because a call to cudaStreamQuery() just before NvBufSurfTransform() sometimes returns cudaErrorNotReady, that is, the buffer has not actually been cleared yet.
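
The host-asynchronous behaviour described in the quote can be illustrated with a toy model in plain Python. No CUDA is involved: `ToyStream`, `other_engine_copy` and `compose` are made-up names, and the model only mimics the documented rule that an `*Async` call may return before the work has run. Treating the second operation as an engine that is not ordered by the stream is an assumption of the sketch.

```python
from collections import deque


class ToyStream:
    """Hypothetical stand-in for a CUDA stream: enqueue returns immediately."""

    def __init__(self):
        self._pending = deque()

    def memset_async(self, buf, value):
        # Like an *Async call: record the work, do not run it yet.
        self._pending.append((buf, value))

    def query(self):
        # Mimics cudaStreamQuery(): ready only when nothing is pending.
        return "ready" if not self._pending else "not_ready"

    def synchronize(self):
        # Mimics cudaStreamSynchronize(): run everything queued so far.
        while self._pending:
            buf, value = self._pending.popleft()
            for i in range(len(buf)):
                buf[i] = value


def other_engine_copy(buf, crop):
    """Stand-in for an engine that is not ordered by the stream (assumption)."""
    buf[:len(crop)] = crop


def compose(stale, crop, sync_first):
    buf = list(stale)            # reused buffer still holding an old crop
    stream = ToyStream()
    stream.memset_async(buf, 0)  # returns before the clear has happened
    if sync_first:
        stream.synchronize()     # wait for the clear before the copy
    status = stream.query()
    other_engine_copy(buf, crop)
    # Snapshot taken right after the copy; without the sync the queued
    # clear has still not run at this point.
    return status, buf
```

Calling `compose([9, 9, 9, 9], [5, 5], sync_first=False)` reports `"not_ready"` and the copy lands on a buffer that still holds the old values, while `sync_first=True` reports `"ready"` and the padding is zero. The toy says nothing about what the real hardware does afterwards; it only shows why a cudaErrorNotReady result means the clear cannot be assumed to be finished at that point.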

I also tested with NVIDIA-AI-IOT/deepstream_lpr_app (sample app code for LPR deployment on DeepStream, on GitHub) on Jetson TX2 with DeepStream 5.1 and got the same issue: sometimes the tensor of the secondary detector contains stale data.

fanzh · July 26, 2022, 5:37am

1. Yes. There is only one CUDA stream, nvinfer->convertStream, which is passed to NvBufSurfTransform, so to the user it is synchronous. With cudaMemset2DAsync the GPU starts processing without waiting until all data is received; compared with cudaMemset2D it is an “async” mode. Please compare DeepStream 5.1 and 6.1: there is no sync operation.
2. About "I’ve dumped input tensors of License Plate Detector with raw-output-file-write=1 and converted them to images.": I don’t know how you did that. We use section 3 of DeepStream SDK FAQ - #9 by mchi to dump the input tensor.

fanzh · August 8, 2022, 3:26am

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

system · August 22, 2022, 3:26am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
