Possible Multimedia API regression when decoding interlaced sources

Hi,

I executed the steps; this is what my environment looks like:

# cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/userspace/set_freq
601600000
# cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/max_freq
601600000
# cat /sys/devices/13e10000.host1x/15340000.vic/devfreq/15340000.vic/governor
userspace
# cat /sys/devices/13e10000.host1x/15340000.vic/power/control
on

The broken-picture issue is still observed on Jetson Xavier NX with the r32.6.1 SDK. Setting the VIC to its maximum clock does not help.

Hi
We tried this on TX2 and did not see the issue; the out.ts looks fine. Are you using both TX2 and Xavier NX in your project, or is TX2 only for evaluation before you switch to Xavier NX?

Hi,

Just in case, I would like to reiterate that we are testing the video_decode sample with a patch applied that adds 40 ms of latency to emulate a 25 fps live stream. Did you use the same patched sample in your tests?
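
For readers reproducing this, the 40 ms pacing can be sketched as below. This is only an illustration, not part of the sample: `FramePacer` is a hypothetical helper name, and it sleeps until absolute deadlines rather than for a fixed `usleep(40000)` as the patched sample does, so that per-frame processing time does not lower the effective frame rate.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not part of the Multimedia API samples): paces a
// feed loop to a fixed frame rate using absolute deadlines.
class FramePacer {
public:
    using Clock = std::chrono::steady_clock;

    // The first frame slot ends one period after construction.
    explicit FramePacer(unsigned fps)
        : period_(period_for(fps)), next_(Clock::now() + period_) {}

    // Blocks until the current frame slot ends, then schedules the next one.
    void wait() {
        std::this_thread::sleep_until(next_);
        next_ += period_;
    }

    std::chrono::microseconds period() const { return period_; }

private:
    static std::chrono::microseconds period_for(unsigned fps) {
        assert(fps > 0);  // a zero rate would divide by zero below
        return std::chrono::microseconds(1000000 / fps);
    }

    std::chrono::microseconds period_;
    Clock::time_point next_;
};
```

Assuming the loop structure of the sample, `pacer.wait()` would be called once per input buffer, just before the buffer is queued on the decoder output plane, with `FramePacer pacer(25);` created before the loop.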

I have tried it on both Jetson TX2 and Xavier NX with the r32.6.1 release, and got the same artifacts and the same broken picture on both. Running the VIC at max clock did not help either, unfortunately.

Regarding the module that we intend to use: we would really like to switch to the newest Jetson NX, but we will also keep using TX2 or perhaps TX2 NX.
The interlaced decoding issue is currently a blocker, because we need to update the L4T SDK, and decoding of interlaced sources does not work properly.

Thanks

Hi,
Please apply this patch and check again:

diff --git a/multimedia_api/ll_samples/samples/00_video_decode/video_decode_main.cpp b/multimedia_api/ll_samples/samples/00_video_decode/video_decode_main.cpp
index 8bb14a9..f239996 100644
--- a/multimedia_api/ll_samples/samples/00_video_decode/video_decode_main.cpp
+++ b/multimedia_api/ll_samples/samples/00_video_decode/video_decode_main.cpp
@@ -992,6 +992,8 @@ dec_capture_loop_fcn(void *arg)
     NvVideoDecoder *dec = ctx->dec;
     struct v4l2_event ev;
     int ret;
+    NvBufferSession session;
+    session = NvBufferSessionCreate();
 
     cout << "Starting decoder capture loop thread" << endl;
     /* Need to wait for the first Resolution change event, so that
@@ -1133,6 +1135,8 @@ dec_capture_loop_fcn(void *arg)
                     break;
                 }
 #else
+                NvBufferSyncObj syncobj;
+
                 /* Clip & Stitch can be done by adjusting rectangle. */
                 NvBufferRect src_rect, dest_rect;
                 src_rect.top = 0;
@@ -1152,16 +1156,21 @@ dec_capture_loop_fcn(void *arg)
                 transform_params.transform_filter = NvBufferTransform_Filter_Nearest;
                 transform_params.src_rect = src_rect;
                 transform_params.dst_rect = dest_rect;
+                transform_params.session = session;
+
+                memset(&syncobj,0,sizeof(NvBufferSyncObj));
+                syncobj.use_outsyncobj = 1;
 
                 if(ctx->capture_plane_mem_type == V4L2_MEMORY_DMABUF)
                     dec_buffer->planes[0].fd = ctx->dmabuff_fd[v4l2_buf.index];
                 /* Perform Blocklinear to PitchLinear conversion. */
-                ret = NvBufferTransform(dec_buffer->planes[0].fd, ctx->dst_dma_fd, &transform_params);
+                ret = NvBufferTransformAsync(dec_buffer->planes[0].fd, ctx->dst_dma_fd, &transform_params, &syncobj);
                 if (ret == -1)
                 {
                     cerr << "Transform failed" << endl;
                     break;
                 }
+                NvBufferSyncObjWait(&syncobj.outsyncobj, NVBUFFER_SYNCPOINT_WAIT_INFINITE);
 
                 /* Write raw video frame to file. */
                 if (!ctx->stats && ctx->out_file)
@@ -1220,6 +1229,7 @@ dec_capture_loop_fcn(void *arg)
         }
     }
 #endif
+    NvBufferSessionDestroy(session);
     cout << "Exiting decoder capture loop thread" << endl;
     return NULL;
 }
@@ -1734,7 +1744,7 @@ static bool decoder_proc_blocking(context_t &ctx, bool eos, uint32_t current_fil
                 memcpy(&temp_buf,&v4l2_buf,sizeof(v4l2_buffer));
             }
         }
-
+        usleep(40000); /* Emulate a 25 fps live stream: 40 ms between input buffers. */
         /* enqueue a buffer for output plane. */
         ret = ctx.dec->output_plane.qBuffer(v4l2_buf, NULL);
         if (ret < 0)

The patch creates an NvBufferSession and calls NvBufferTransformAsync() instead of NvBufferTransform(). Please try this method.

Hello DaneLLL,

The fix that you offered in comment #28 replaces the function NvBufferTransform with the pair of functions NvBufferTransformAsync and NvBufferSyncObjWait. This solution works for us, but only to some extent: we no longer see artifacts in the video.
However, in some cases we use another function, NvBufferCompose, to fit the decoded picture without transformations into a square and then control the filling of the fields with composite_bgcontrol. We are also planning to use the blending capabilities of NvBufferCompose.

At the moment NvBufferCompose has the same issues with interlaced streams, but this function does not have an Async alternative similar to NvBufferTransformAsync.

Could you please fix NvBufferCompose for interlaced sources, or provide us with a workaround similar to the one for NvBufferTransform?

Finally, for ML and image processing with CUDA, we use interop from the decoder to CUDA via EGL. After this interop, if the source is interlaced, we observe artifacts in the resulting video (see attached screenshot).

It seems that all of those problems have the same root cause, and we were hoping that asking you to fix NvBufferTransform would also make the other two issues go away. Also, it was relatively easy to demonstrate the issue in NvBufferTransform using your own code examples.

Please help us to find a way to use NvBufferCompose and CUDA/EGL with interlaced sources.

Thanks,

Hi,
Please replace the following library with the one in the attachment and try again:

/usr/lib/aarch64-linux-gnu/tegra/libnvmmlite_video.so

With this library, NvBufferTransform() should also work fine. However, when calling these functions from multiple threads, please still create an NvBufferSession for better performance.
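
One way to keep a session per thread and release it on every exit path is a small scope guard, sketched below. This is only an illustration under stated assumptions: `ScopedHandle`, `make_scoped`, `run_worker` and the `fake_session_*` functions are hypothetical names, and the stand-ins replace the real `NvBufferSessionCreate()` / `NvBufferSessionDestroy()` calls from the patch above so that the sketch builds without the Jetson headers.

```cpp
#include <cassert>
#include <utility>

// Generic scope guard: owns a handle and calls destroy(handle) exactly once
// when the guard goes out of scope.
template <typename Handle, typename Destroy>
class ScopedHandle {
public:
    ScopedHandle(Handle h, Destroy d)
        : handle_(h), destroy_(std::move(d)), owns_(true) {}
    ~ScopedHandle() { if (owns_) destroy_(handle_); }

    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;
    ScopedHandle(ScopedHandle&& other) noexcept
        : handle_(other.handle_), destroy_(std::move(other.destroy_)),
          owns_(other.owns_) { other.owns_ = false; }

    Handle get() const { return handle_; }

private:
    Handle handle_;
    Destroy destroy_;
    bool owns_;
};

template <typename Handle, typename Destroy>
ScopedHandle<Handle, Destroy> make_scoped(Handle h, Destroy d) {
    return ScopedHandle<Handle, Destroy>(h, std::move(d));
}

// Stand-ins (assumption): on a Jetson these would be NvBufferSessionCreate()
// and NvBufferSessionDestroy(); here they only count live sessions.
using FakeSession = int;
static int g_live_sessions = 0;
static FakeSession fake_session_create() { ++g_live_sessions; return 1; }
static void fake_session_destroy(FakeSession) { --g_live_sessions; }

// Body of one worker thread: creates its own session, processes frames, and
// may leave early. fail_at < 0 means "never fail". Returns frames processed.
static int run_worker(int frames, int fail_at) {
    auto session = make_scoped(fake_session_create(), fake_session_destroy);
    int done = 0;
    for (int i = 0; i < frames; ++i) {
        if (i == fail_at) {
            return done;  // early exit: the guard still destroys the session
        }
        // On target, the per-frame transform would use session.get() here,
        // e.g. by assigning it to transform_params.session as in the patch.
        ++done;
    }
    return done;
}
```

Each thread that performs transforms would run its own `run_worker`-style function, so no session is shared between threads and none is leaked when a transform fails.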

r32_6_1_TEST_libnvmmlite_video.zip (92.4 KB)

Hi khizbulin,

Have you tried the suggestion above? Are there any results you can share? Thanks

Hi,

This solution works, thank you.