Performance about VPI ConvertImageFormat

fire9953 · July 1, 2024, 8:53am

Hi,

My L4T info:
R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 1 19:57:35 UTC 2023
VPI: NV_VPI_VERSION_STRING “2.3.9”

Recently, I experimented with using VPI for distortion correction, and it works. However, I found that most of the time is spent on color space conversion. After running some tests, I have some doubts about color conversion with VPI.

I read a PNG file (1920x1080) and test converting the BGRA Mat to VPI_IMAGE_FORMAT_NV12_ER.

I ran the conversion 10 times each with CUDA and with VIC, and it takes longer than I expected.
Please help me understand why it is so slow.

Run with VPI_BACKEND_VIC:
[Screenshot 2024-07-01 16.37.56]

Run with VPI_BACKEND_CUDA:
[Screenshot 2024-07-01 16.38.40]


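VPIStream stream = nullptr;
VPIImage vimg = nullptr;   // wrapper around the input cv::Mat
VPIImage tmpIn = nullptr;  // NV12_ER output of the conversion

// One-time setup, done before the loop (width and height are the input image size).
CHECK_STATUS(vpiStreamCreate(VPI_BACKEND_CUDA, &stream));
CHECK_STATUS(vpiImageCreate(width, height, VPI_IMAGE_FORMAT_NV12_ER, 0, &tmpIn));

// run this method, loop 10 times
void testConvertImageFormat(cv::Mat &cvImage) {
    if (vimg == nullptr)
    {
        // First call: create a VPIImage that wraps the cv::Mat.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImage, VPI_IMAGE_FORMAT_BGRA8, 0, &vimg));
    }
    else
    {
        // Later calls: reuse the wrapper and only point it at the new cv::Mat.
        CHECK_STATUS(vpiImageSetWrappedOpenCVMat(vimg, cvImage));
    }

    // Submit the conversion and block until it has finished.
    CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, vimg, tmpIn, NULL));
    CHECK_STATUS(vpiStreamSync(stream));
}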
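
When timing a loop like the one above, the first iterations may include one-time setup costs, so it can help to discard a few warm-up runs and report min/mean/max instead of a single number. The following is only a minimal sketch using standard C++; `BenchResult`, `benchmark`, and `printResult` are hypothetical helper names, not part of VPI or OpenCV.

```cpp
#include <chrono>
#include <cstdio>
#include <functional>

// Timing summary in milliseconds (hypothetical helper type).
struct BenchResult {
    double minMs  = 0.0;
    double meanMs = 0.0;
    double maxMs  = 0.0;
    int    runs   = 0;
};

// Calls `work` `warmup` times without timing, then `runs` times with timing.
// `work` must block until the operation has really finished; for VPI that
// means it should end with vpiStreamSync, otherwise only the submit call
// is measured.
BenchResult benchmark(const std::function<void()> &work, int warmup, int runs) {
    using Clock = std::chrono::steady_clock;

    for (int i = 0; i < warmup; ++i) {
        work();
    }

    BenchResult r;
    if (runs <= 0) {
        return r;  // nothing was timed
    }

    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto t0 = Clock::now();
        work();
        const auto t1 = Clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (i == 0 || ms < r.minMs) r.minMs = ms;
        if (i == 0 || ms > r.maxMs) r.maxMs = ms;
        total += ms;
    }
    r.meanMs = total / runs;
    r.runs   = runs;
    return r;
}

// Prints one summary line for a benchmark result.
void printResult(const char *label, const BenchResult &r) {
    std::printf("%s: min %.3f ms, mean %.3f ms, max %.3f ms over %d runs\n",
                label, r.minMs, r.meanMs, r.maxMs, r.runs);
}

// Possible use with the code from this post (assumes stream, vimg and tmpIn
// are already set up as shown above):
//
//   BenchResult r = benchmark([&] {
//       CHECK_STATUS(vpiSubmitConvertImageFormat(stream, VPI_BACKEND_CUDA, vimg, tmpIn, NULL));
//       CHECK_STATUS(vpiStreamSync(stream));
//   }, 5, 100);
//   printResult("CUDA BGRA8 -> NV12_ER", r);
```

The warm-up and run counts (5 and 100) are arbitrary; 10 timed runs as in the original test also work, but a larger count makes the mean less sensitive to outliers.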

AastaLLL · July 2, 2024, 6:16am

Hi,

Have you maximized the device performance before benchmarking?

Please note that the VIC clocks aren’t covered by nvpmodel and jetson_clocks.
So you will need to use the script in the link below to set a higher VIC clock.

https://docs.nvidia.com/vpi/algo_performance.html#maxout_clocks

Thanks.

fire9953 · July 3, 2024, 8:47am

I have tried maxing out the clocks and tested the image format conversion again
(1080p, BGRA to VPI_IMAGE_FORMAT_NV12_ER):

VPI_BACKEND_VIC takes about 2-3 ms per conversion;
VPI_BACKEND_CUDA takes about 1-2 ms per conversion.

I think this is still not as fast as I expected, since OpenCV usually finishes converting BGR to YUV within 1 ms on the same device.

1. Is there still room for improvement? Should I choose CUDA for color format conversion to get better performance?
2. Can the max mode be maintained continuously?

AastaLLL · July 18, 2024, 7:12am

Hi,

Please check our performance table below:
https://docs.nvidia.com/vpi/algo_imageconv.html#algo_imageconv_perf

RGBA8 to NV12_ER takes around 0.1 ms with CUDA and 0.88 ms with VIC.
Could you try creating the VPI images with only the backend you need enabled (via the flags argument) and benchmark it again?

https://docs.nvidia.com/vpi/group__VPI__Image.html#gab2ecbae4459652c3e2ec8572860d1852

You can keep the device in the max mode continuously.

Thanks.

system · August 14, 2024, 4:55am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
