Slow Optical Flow using VPI on Xavier NX

Hi,

I’m developing on the Xavier NX platform with JetPack 4.6, and I’d like to run VPI’s dense optical flow Python sample on my own data.

Running the dense optical flow script with sample data (…/assets/pedestrians.mp4, …/assets/dashcam.mp4) works as expected. I see around 50-90 FPS.

However, running the same script on my custom data (with similar resolution and bitrate) drops the FPS to 0.1. After profiling, the locking operation (with mv.rlock_cpu() as data:) appears to be the culprit.
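For anyone reproducing the measurement, here is a minimal sketch of a per-stage timer that can be wrapped around each step of the sample loop. The StageTimer class and the stage names are hypothetical helpers, not part of VPI, and the sleeps are stand-ins for the real calls. One caveat, stated as an assumption to verify: if VPI submits work asynchronously, time measured around the lock may also include waiting for the optical flow computation itself, not only the copy back to the CPU.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        # Measure the wall-clock time spent inside the `with` body.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        # Slowest stage first, reported as mean milliseconds per call.
        lines = []
        for name, total in sorted(self.totals.items(), key=lambda kv: -kv[1]):
            mean_ms = 1000.0 * total / self.counts[name]
            lines.append(f"{name}: {mean_ms:.2f} ms/call over {self.counts[name]} calls")
        return lines


timer = StageTimer()
for _ in range(3):
    with timer.stage("submit_flow"):
        time.sleep(0.001)  # stand-in for submitting the optical flow step
    with timer.stage("rlock_cpu"):
        time.sleep(0.005)  # stand-in for `with mv.rlock_cpu() as data:`
print("\n".join(timer.report()))
```

In the real script, each stand-in sleep would be replaced by the corresponding block from the sample, so the report shows where the time per frame actually goes.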

I’d like to understand why this is happening.

Hi,

Could you share the video so we can check this issue in our environment?
mv.rlock_cpu() will trigger a memcpy to transfer the data back to the CPU.
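As a rough sanity check on the cost of that memcpy, the sketch below estimates how many bytes the motion vector image holds. It assumes one vector per 4x4 pixel block and two signed 16-bit components per vector, and it ignores any row padding; these figures are assumptions to confirm against the VPI documentation for your version. The resolutions are the ones reported later in this thread.

```python
import math


def mv_buffer_bytes(width, height, grid=4, bytes_per_vector=4):
    """Estimate the motion vector image size for a given input resolution.

    Assumptions (not confirmed in this thread): one vector per
    `grid` x `grid` pixel block, 2 x int16 per vector, no row padding.
    """
    mv_w = math.ceil(width / grid)
    mv_h = math.ceil(height / grid)
    return mv_w, mv_h, mv_w * mv_h * bytes_per_vector


for w, h in [(820, 616), (1640, 1232)]:
    mv_w, mv_h, size = mv_buffer_bytes(w, h)
    print(f"{w}x{h} -> {mv_w}x{mv_h} vectors, about {size / 1024:.1f} KiB per frame")
```

Under these assumptions the buffer is a few hundred KiB at most, so the copy alone seems unlikely to account for roughly 10 seconds per frame.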

Could you also try to run the sample with performance mode?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Thanks.

Hi,

Could you check whether you’re able to access this? There’s no optical flow in the video, but it has the same processing-time issue.

I gave performance mode a try, but no luck: I still get the same 0.1 FPS.

I performed more tests on different videos with the same script.

I noticed that the performance is affected for videos with non-standard aspect ratios.

Here are some open-source videos with standard aspect ratios:

And open-source videos with non-standard aspect ratios:

I also tried resizing the sample video I provided to a standard aspect ratio, and I get the performance I expect, ~70-90 FPS.

Could you confirm this on your end? Also, I’d like to know if this is by design, and if there’s any way to use videos with non-standard aspect ratios without resizing.

Hi,

Sorry for the late update.

We cannot access the Google Drive link because of a permission error.
We will give the public data a try and share more information with you.

Thanks.

Hi,

The public data is in .webm format, so it does not work with the sample by default.

But could you check whether this issue is related to the resolution being a multiple of 4?
It looks like the resolutions of the standard aspect ratio videos are multiples of 4,
while those of the non-standard videos are not.

Since NVENC splits input images into 4x4 pixel blocks, it might expect the input resolution to be a multiple of 4.

Thanks.

Hi,

My sample video (and, in the future, live video) is 820x616 or 1640x1232. Both are multiples of 4, and it still doesn’t seem to work.
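To make the two hypotheses discussed so far easy to check for any video, here is a small sketch that reports whether a resolution is a multiple of the block size and what its reduced aspect ratio is. The function name is a hypothetical helper, and 1920x1080 is included only as a familiar reference point, not as a resolution from this thread.

```python
from math import gcd


def describe_resolution(width, height, block=4):
    """Report block alignment and the reduced aspect ratio of a resolution."""
    divisor = gcd(width, height)
    return {
        # True when both dimensions are multiples of `block`.
        "aligned": width % block == 0 and height % block == 0,
        # Aspect ratio as the smallest integer pair, e.g. (16, 9).
        "aspect": (width // divisor, height // divisor),
        "ratio": round(width / height, 4),
    }


for w, h in [(820, 616), (1640, 1232), (1920, 1080)]:
    print((w, h), describe_resolution(w, h))
```

Both 820x616 and 1640x1232 are aligned to 4 but reduce to 205:154, which is close to 4:3 without being exactly 4:3, so the multiple-of-4 rule on its own does not explain the slowdown.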

As for the open-source videos, could you try this?

# standard videos
wget https://motchallenge.net/sequenceVideos/ADL-Rundle-3-raw.mp4
wget https://motchallenge.net/sequenceVideos/ETH-Linthescher-raw.mp4
wget https://motchallenge.net/sequenceVideos/PETS09-S2L2-raw.mp4

# non-standard videos
wget https://motchallenge.net/sequenceVideos/KITTI-17-raw.mp4
wget https://motchallenge.net/sequenceVideos/KITTI-19-raw.mp4
wget https://motchallenge.net/sequenceVideos/KITTI-16-raw.mp4

Thanks!

Hi,

Thanks, we will give it a try.
Which quality setting do you use?

Thanks.

I used ‘high’ quality. Didn’t get a chance to try others.

Update: I just tried it with low and medium, and I’m getting similar performance to ‘high’, i.e. slow processing.

Hi,

Sorry for the late update.

It looks like non-standard videos are not supported.
We tested a non-standard video and hit NVMEDIA_STATUS_ERROR.

$ python3 main.py nvenc KITTI-16-raw.mp4 high
Processing frame 1
[ERROR] 2024-02-27 08:35:21 Work item execution failed with VPI_ERROR_INTERNAL: (NVMEDIA_STATUS_ERROR)
Fatal assertion error
#0 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0xfaac8c) [0xffff79311c8c]
#1 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0xf82350) [0xffff792e9350]
#2 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0x60fba4) [0xffff78976ba4]
#3 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0x62b9f4) [0xffff789929f4]
#4 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0x629848) [0xffff78990848]
#5 /opt/nvidia/vpi2/lib/aarch64-linux-gnu/libnvvpi.so.2(+0x12bae0c) [0xffff79621e0c]
#6 /lib/aarch64-linux-gnu/libpthread.so.0(+0x7624) [0xffff8e1ed624]
#7 /lib/aarch64-linux-gnu/libc.so.6(+0xd149c) [0xffff8e2e849c]
Aborted (core dumped)

Thanks.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks

Hi,

To check whether this issue comes from NVENC or VPI, could you run the video through gst-launch-1.0 or 01_video_encode and see if there is any difference?

Moreover, the shared non-standard video is relatively small.
Could you try a video with a resolution > 256 to see if the same behavior occurs?

Thanks.