Interpolation method in Gst-nvstreammux not working

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) - Nvidia T4 GPU
• DeepStream Version - 5.1 and 6.0
• JetPack Version (valid for Jetson only) –
• TensorRT Version. 8.0.1-1+cuda11.3
• NVIDIA GPU Driver Version (valid for GPU only) 460.32.03
• Issue Type( questions, new requirements, bugs) bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=608 height=608 enable_padding=1 interpolation-method=2 ! queue ! nvstreamdemux name=demux \
    uridecodebin uri=file:///mounted/data/videos/horizontal.mov ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! queue ! mux.sink_0 \
    uridecodebin uri=file:///mounted/data/videos/vertical.mov ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=NV12' ! queue ! mux.sink_1 \
    demux.src_0 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! qtmux ! filesink location=/mounted/data/images/deepstream-streammux-set/src_0/horizontal_6_0_cubic.mp4 \
    demux.src_1 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! qtmux ! filesink location=/mounted/data/images/deepstream-streammux-set/src_1/vertical_6_0_cubic.mp4

This pipeline downscales the input videos to 608x608 with padding. The interpolation-method property of the streammux element does not seem to have any effect: changing it does not yield a different result. This matters because my neural network was trained on bilinearly downscaled images, and the results are worse with the default nearest-neighbor interpolation.

I was able to get proper scaling with the Gst-nvvideoconvert element, using the pipeline below.

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=608 height=608 enable_padding=1 interpolation-method=1 ! queue ! nvstreamdemux name=demux \
    uridecodebin uri=file:///mounted/data/videos/horizontal.mov ! nvvideoconvert interpolation-method=1 ! 'video/x-raw(memory:NVMM),format=NV12,width=608,height=342' ! queue ! mux.sink_0 \
    uridecodebin uri=file:///mounted/data/videos/vertical.mov ! nvvideoconvert interpolation-method=1 ! 'video/x-raw(memory:NVMM),format=NV12,width=342,height=608' ! queue ! mux.sink_1 \
    demux.src_0 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! qtmux ! filesink location=/mounted/data/images/deepstream-streammux-set/src_0/horizontal_6_0_bilinear_nvvidconv.mp4 \
    demux.src_1 ! queue ! nvvideoconvert ! nvv4l2h264enc ! h264parse ! qtmux ! filesink location=/mounted/data/images/deepstream-streammux-set/src_1/vertical_6_0_bilinear_nvvidconv.mp4

The question is: is this the preferred way to do it? Is there another method?

Sorry for the late response. Is this still an issue that needs support? Thanks.

What is the original resolution of “horizontal.mov” and “vertical.mov”? Where and how do you do the inference?

Are you aware that H264 encoding is lossy and will reduce video quality?

Yes, the issue is still valid.

Both streams are 4K. Horizontal is 3840x2160 and vertical is 2160x3840. They use Motion JPEG encoding rather than H264. I have dumped images with the pipelines presented above and there are no encoding/decoding artifacts. This issue is about the interpolation method.
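For reference, the 608x342 and 342x608 caps used in the nvvideoconvert pipeline earlier in this thread are what you get when a 16:9 4K frame is fitted into a 608x608 box with its aspect ratio preserved. A minimal sketch of that arithmetic, assuming 3840x2160 and 2160x3840 sources (`fit_within` is a hypothetical helper name, not a DeepStream API):

```python
def fit_within(src_w, src_h, box_w, box_h):
    """Largest size with the source aspect ratio that fits inside the box."""
    # The tighter of the two axis ratios decides the scale factor.
    scale = min(box_w / src_w, box_h / src_h)
    return round(src_w * scale), round(src_h * scale)


print(fit_within(3840, 2160, 608, 608))  # horizontal clip -> (608, 342)
print(fit_within(2160, 3840, 608, 608))  # vertical clip   -> (342, 608)
```

The remaining 266 rows or columns of the 608x608 frame are what the streammux padding fills in.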

If you recreate those pipelines, you will see that this flag (in the streammux element) does not have any effect.

How do you know the flag does not have any effect? Have you compared the binaries bit by bit?
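One way to do such a bit-by-bit check is to dump frames from two runs that differ only in interpolation-method and compare the files byte for byte. A rough sketch using only the Python standard library; the function name and the directory names in the usage note are hypothetical, and it assumes both directories contain only regular files with matching names:

```python
import filecmp
import os


def differing_frames(dir_a, dir_b):
    """Return names of files present in both directories whose bytes differ."""
    common = sorted(set(os.listdir(dir_a)) & set(os.listdir(dir_b)))
    return [
        name
        for name in common
        # shallow=False forces a comparison of the file contents,
        # not just size and modification time.
        if not filecmp.cmp(
            os.path.join(dir_a, name), os.path.join(dir_b, name), shallow=False
        )
    ]
```

For example, `differing_frames("out_method_0", "out_method_1")` returning an empty list would mean the two runs produced identical files. Note that this only says whether the encoded files differ, not by how much the pixels differ.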

I have done that just now. I see some (very minor) differences between interpolation methods. However, the images that come from this DeepStream pipeline have less detail than images downscaled by OpenCV. There are some weird compression artifacts visible, even though I am comparing images that do not come from a video source. This is the pipeline I am using for DeepStream:

gst-launch-1.0 nvstreammux name=mux batch-size=2 width=608 height=608 enable_padding=1 interpolation-method=1 ! queue ! nvstreamdemux name=demux \
multifilesrc location=/mounted/data/images/deepstream-test-set-2-sources/horizontal/%05d.jpg ! jpegparse ! nvv4l2decoder ! nvvideoconvert ! queue ! mux.sink_0 \
multifilesrc location=/mounted/data/images/deepstream-test-set-2-sources/vertical/%05d.jpg ! jpegparse ! nvv4l2decoder ! nvvideoconvert ! queue ! mux.sink_1 \
demux.src_0 ! queue ! nvvideoconvert ! videoconvert ! "video/x-raw,format=I420" ! jpegenc ! multifilesink location=/mounted/data/images/deepstream-streammux-set/horizontal/%05d.jpg \
demux.src_1 ! queue ! nvvideoconvert ! videoconvert ! "video/x-raw,format=I420" ! jpegenc ! multifilesink location=/mounted/data/images/deepstream-streammux-set/vertical/%05d.jpg

And this is how images are downscaled for model training:

import cv2

dim = (width, height)  # cv2.resize expects the target size as (width, height)
resized = cv2.resize(img, dim, interpolation=cv2.INTER_LINEAR)

The whole goal of this issue is to find out why DeepStream produces predictions with lower accuracy and recall than the raw TRT engine. We were led to believe that this is due to the downscaling method. Might there be a different reason?

Can you recreate the pipeline and explain why the DeepStream images have less detail than the OpenCV ones, even though the algorithm used is the same (bilinear scaling)?
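On the "same algorithm" point, it may help to spell out what plain bilinear scaling does at a large downscale factor (here roughly 6x). The sketch below is a textbook bilinear resize with half-pixel centre alignment, written in pure Python on a 2D list. It is an illustration only: it is not the nvstreammux implementation, and real libraries may differ in rounding, coordinate conventions, or pre-filtering, any of which can change how much detail survives.

```python
def bilinear_resize(src, dst_w, dst_h):
    """Resize a 2D grid of numbers with textbook bilinear interpolation."""
    src_h, src_w = len(src), len(src[0])
    scale_x = src_w / dst_w
    scale_y = src_h / dst_h
    out = []
    for dy in range(dst_h):
        # Map the destination pixel centre back into source coordinates,
        # then clamp to the valid source range.
        sy = min(max((dy + 0.5) * scale_y - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for dx in range(dst_w):
            sx = min(max((dx + 0.5) * scale_x - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four nearest source pixels; no others contribute.
            top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
            bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# An 8-pixel row with one bright pixel, shrunk to a single pixel:
print(bilinear_resize([[0, 0, 0, 100, 0, 0, 0, 0]], 1, 1))  # [[50.0]]
```

Only two of the eight source pixels influence the result, whereas averaging the whole row would give 12.5. Two scalers that both call themselves "bilinear" can therefore produce visibly different output at this scale factor if one of them filters or subsamples differently.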

Can you check your driver version again? DeepStream 6.0 GA is based on NVIDIA driver 470.63.01 (see the Quickstart Guide in the DeepStream 6.0 Release documentation).

Hi, I have verified this behavior with the following driver:
NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4.
Do you know which interpolation-method I should specify to get the same results as this OpenCV call?
resized = cv2.resize(img, dim, interpolation=cv2.INTER_LINEAR)
If the resizing is not the reason for the degradation of the DeepStream pipeline (in terms of inference precision and recall), what could the reason be?