Using EGLStream in a CUDA application through GStreamer

I am writing a camera src based on libArgus on a TX2 with tegra_multimedia_api 28.3.

I modified the gstVideoEncode example to use 4x nveglstreamsrc to pass the media into our GStreamer plugin.
This plugin hands the data over to a CUDA kernel which processes the input of these 4 streams.

The first question is: what data is produced by nveglstreamsrc?
Do I get a reference to an EGLStream, or does nveglstreamsrc already acquire EGLImages and pass these down the pipeline?

My first assumption was that I get EGLImages from the src, so I am trying to do the following before starting the kernel, according to samples/common/algorithm/cuda/NvCudaProc.cpp:

cuGraphicsEGLRegisterImage()
cuGraphicsResourceGetMappedEglFrame()
cuCtxSynchronize()

But I can't even get the code to compile and link correctly. My GStreamer plugin fails to load with:

(gst-plugin-scanner:17357): GStreamer-WARNING **: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libmyplugin.so': /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libmyplugin.so: undefined symbol: cuGraphicsEGLRegisterImage

I adapted our cmake to match argus/samples/utils/CMakeLists.txt

set( CMAKE_SKIP_BUILD_RPATH false )
set( CMAKE_BUILD_WITH_INSTALL_RPATH false )
set( CMAKE_INSTALL_RPATH ${CMAKE_INSTALL_FULL_LIBDIR}/aarch64-linux-gnu/gstreamer-1.0)

find_package( CUDA REQUIRED )
find_package( PkgConfig REQUIRED )

find_package( Argus REQUIRED )
find_package(OpenGLES)
find_package(EGL REQUIRED)
find_package(X11 REQUIRED)

pkg_search_module( GST REQUIRED gstreamer-video-1.0>=1.8 )

include_directories(
  ${GST_INCLUDE_DIRS}
  ${ARGUS_INCLUDE_DIR}
  ${EGL_INCLUDE_DIR}
  ${CUDA_INCLUDE_DIRS}
  ${CMAKE_CURRENT_SOURCE_DIR}
  )

cuda_add_library( ${PROJECT_NAME} SHARED ${SOURCES})

target_link_libraries( ${PROJECT_NAME}
  ${GST_LIBRARIES}
  ${EGL_LIBRARIES}
  ${CUDA_LIBRARIES}
  )

target_include_directories( ${PROJECT_NAME} PUBLIC
  myplugin-kernel
  ${RAPIDJSON_INCLUDE_DIRS}
  )

The only deviation seems to be cuda_add_library(… SHARED), which is required by GStreamer.
We are also manually setting the rpath to work with GStreamer.

I also tried:

set( CMAKE_INSTALL_RPATH ${CMAKE_INSTALL_FULL_LIBDIR}/aarch64-linux-gnu/gstreamer-1.0 /usr/lib/aarch64-linux-gnu/tegra /usr/lib/aarch64-linux-gnu/tegra-egl)

but it did not work.

I assume it is related to the rpath or some other linker option. Please help me figure out how to properly link libArgus, libEGL, cudaEGL, and so on with CMake for shared libraries.

Thank you, Stefan

I managed to get it working with this ugly workaround:

target_link_libraries( ${PROJECT_NAME}
  ${GST_LIBRARIES}
  ${EGL_LIBRARIES}
  ${CUDA_LIBRARIES}
  ${ARGUS_LIBRARIES}
  EGL
  argus
  cuda
  cudart
  )

target_link_directories( ${PROJECT_NAME}
  PUBLIC /usr/local/cuda/lib64
  PUBLIC /usr/lib/aarch64-linux-gnu
  PUBLIC /usr/lib/aarch64-linux-gnu/tegra
  )
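A slightly less ugly variant of the same workaround would be to resolve the libraries explicitly (only a sketch; it assumes the standard JetPack library locations, and relies on the fact that the cu* driver-API entry points such as cuGraphicsEGLRegisterImage live in libcuda, which FindCUDA's CUDA_LIBRARIES, being runtime-only, does not include):

```cmake
# cu* driver-API symbols (cuGraphicsEGLRegisterImage, ...) come from libcuda,
# not from the cudart that CUDA_LIBRARIES points to.
find_library( CUDA_DRIVER_LIBRARY cuda
  PATHS /usr/lib/aarch64-linux-gnu/tegra )
find_library( EGL_LIBRARY EGL
  PATHS /usr/lib/aarch64-linux-gnu/tegra-egl )

target_link_libraries( ${PROJECT_NAME}
  ${GST_LIBRARIES}
  ${ARGUS_LIBRARIES}
  ${CUDA_LIBRARIES}
  ${CUDA_DRIVER_LIBRARY}
  ${EGL_LIBRARY}
  )
```

With find_library the absolute paths end up in the link line, so no target_link_directories is needed.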

By using find_package, variables like EGL_LIBRARIES should actually be populated correctly and point to the real libraries. Their contents are:

-- GST_LIBRARIES=gstreamer-1.0;gobject-2.0;glib-2.0
-- EGL_LIBRARIES=/usr/lib/aarch64-linux-gnu/tegra-egl/libEGL.so
-- CUDA_LIBRARIES=/usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/aarch64-linux-gnu/librt.so
-- ARGUS_LIBRARIES=/usr/lib/aarch64-linux-gnu/tegra/libargus_socketclient.so

Why do find_package and target_link_libraries not work properly here?

Hi,
We have seen issues with imprecise timestamps when using nveglstreamsrc:
https://devtalk.nvidia.com/default/topic/1055835/jetson-tx2/timestamp-from-cueglframe/post/5352952/#5352952

We would like to recommend that you use the Argus interface directly. Please refer to:

tegra_multimedia_api/samples/09_camera_jpeg_capture
tegra_multimedia_api/samples/10_camera_recording

Thank you for pointing this out. In this case I have some additional questions.

Do I understand this correctly?

nveglstreamsrc only passes a reference to an EGLStream; I have to acquire frames myself in our CUDA-powered element via

cudaEGLStreamConsumerAcquireFrame()
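The consumer side I have in mind would look roughly like this (only a sketch using the CUDA runtime EGL interop API; connection setup, error handling, and the actual kernel launch are simplified, and acquireOneFrame is just an illustrative name):

```cpp
#include <cuda_runtime.h>
#include <cuda_egl_interop.h>

// Sketch: connect as the CUDA consumer of an EGLStream and acquire one frame.
static bool acquireOneFrame(EGLStreamKHR stream)
{
    cudaEglStreamConnection conn;
    if (cudaEGLStreamConsumerConnect(&conn, stream) != cudaSuccess)
        return false;

    cudaGraphicsResource_t resource = NULL;
    // Blocks up to 16 ms waiting for the producer to present a frame.
    if (cudaEGLStreamConsumerAcquireFrame(&conn, &resource, NULL, 16000)
            != cudaSuccess) {
        cudaEGLStreamConsumerDisconnect(&conn);
        return false;
    }

    cudaEglFrame frame;
    if (cudaGraphicsResourceGetMappedEglFrame(&frame, resource, 0, 0)
            == cudaSuccess
        && frame.frameType == cudaEglFrameTypePitch) {
        // frame.frame.pPitch[0].ptr is a device pointer to plane 0;
        // this is where the processing kernel would be launched.
    }

    cudaEGLStreamConsumerReleaseFrame(&conn, resource, NULL);
    cudaEGLStreamConsumerDisconnect(&conn);
    return true;
}
```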

The EGLStream is not buffered, so whatever I acquire is what the camera currently produces.
We are going to sync the camera captures in hardware.
If I manage the acquisition of frames myself in our CUDA-powered GStreamer element, can I make sure that all the frames were produced at the same time?

What happens if a camera did not produce a new frame since the last Acquire?

Does nveglstreamsrc pass a new buffer on every new frame?
What is actually the content of the buffers passed by nveglstreamsrc?
I did not find any reference documentation for this element. Please point me to the docs.

I need to use interoperability to enable Argus-to-CUDA zero-copy.
How do I do that when using nveglstreamsrc?

I hope these questions clarify the situation, give me a better picture of the design, and help us integrate it into our architecture.

Thank you

I figured out now that nveglstreamsrc actually has a consumer thread that acquires frames in a loop.

The question now is: what data do I get from the src inside the GstBuffer?

The size of the buffer is 808 bytes and the content is not empty, but what is it?
How can I use it, and what do I need to cast it to?
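In the meantime I am at least dumping what arrives in our chain function (a sketch; MY_PLUGIN and the srcpad field belong to our own element and are purely illustrative):

```c
#include <gst/gst.h>

/* Sketch: map an incoming buffer read-only to inspect what the upstream
 * nveglstreamsrc actually handed us, then forward it unchanged. */
static GstFlowReturn
my_plugin_chain (GstPad * pad, GstObject * parent, GstBuffer * buf)
{
  GstMapInfo map;

  if (gst_buffer_map (buf, &map, GST_MAP_READ)) {
    g_print ("incoming buffer: %" G_GSIZE_FORMAT " bytes\n", map.size);
    gst_buffer_unmap (buf, &map);
  }

  /* MY_PLUGIN(parent)->srcpad: our element's source pad (illustrative) */
  return gst_pad_push (MY_PLUGIN (parent)->srcpad, buf);
}
```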

I would be very grateful for further information or some documentation.
Thank you.

Hi,
We suggest you refer to 09_camera_jpeg_capture and 10_camera_recording. You can do the processing by calling:

// Create EGLImage from dmabuf fd
ctx->egl_image = NvEGLImageFromFd(ctx->egl_display, buffer->planes[0].fd);
if (ctx->egl_image == NULL)
{
    fprintf(stderr, "Error while mapping dmabuf fd (0x%X) to EGLImage\n",
             buffer->planes[0].fd);
    return false;
}

// Running algo process with EGLImage via GPU multi cores
HandleEGLImage(&ctx->egl_image);

// Destroy EGLImage
NvDestroyEGLImage(ctx->egl_display, ctx->egl_image);
ctx->egl_image = NULL;

Implementation of Handle_EGLImage() (HandleEGLImage() above is a thin wrapper that dereferences the pointer and calls it):

/**
  * Performs CUDA Operations on egl image.
  *
  * @param image : EGL image
  */
static void
Handle_EGLImage(EGLImageKHR image)
{
    CUresult status;
    CUeglFrame eglFrame;
    CUgraphicsResource pResource = NULL;

    cudaFree(0); // dummy runtime call to force creation of the CUDA context
    status = cuGraphicsEGLRegisterImage(&pResource, image,
                CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
    if (status != CUDA_SUCCESS)
    {
        printf("cuGraphicsEGLRegisterImage failed: %d, cuda process stop\n",
                        status);
        return;
    }

    status = cuGraphicsResourceGetMappedEglFrame(&eglFrame, pResource, 0, 0);
    if (status != CUDA_SUCCESS)
    {
        printf("cuGraphicsResourceGetMappedEglFrame failed\n");
    }

    status = cuCtxSynchronize();
    if (status != CUDA_SUCCESS)
    {
        printf("cuCtxSynchronize failed\n");
    }

    if (eglFrame.frameType == CU_EGL_FRAME_TYPE_PITCH)
    {
        // Draw rect labels in plane Y; you can replace this with any CUDA algorithm.
        addLabels((CUdeviceptr) eglFrame.frame.pPitch[0], eglFrame.pitch);
    }

    status = cuCtxSynchronize();
    if (status != CUDA_SUCCESS)
    {
        printf("cuCtxSynchronize failed after memcpy\n");
    }

    status = cuGraphicsUnregisterResource(pResource);
    if (status != CUDA_SUCCESS)
    {
        printf("cuGraphicsUnregisterResource failed: %d\n", status);
    }
}

Please also refer to
https://devtalk.nvidia.com/default/topic/1028387/jetson-tx1/closed-gst-encoding-pipeline-with-frame-processing-using-cuda-and-libargus/post/5256753/#5256753