DeepStream VPI 3.0 Wrapper Compatibility on dGPU

Please provide complete information as applicable to your setup.

• dGPU
• DeepStream 6.4

**Is it possible to use the VPI 3.0 vpiImageCreateWrapper on x86 dGPU devices to keep the image data from NvBufSurface in GPU memory?**

As an example, I am taking the gst-dsexample GStreamer plugin and modifying it so that it renders a PIP (picture-in-picture) in the bottom-right 20% of the screen showing an enhanced close-up of the cropped bounding-box region. I got this working with OpenCV, but it comes with a performance hit because I have to copy the image data from GPU memory to the CPU.

In the gst_pip_transform_ip function, I have made the following modifications so far:

```cpp
/**
 * Called when the element receives an input buffer from the upstream element.
 */
static GstFlowReturn gst_pip_transform_ip(GstBaseTransform *btrans, GstBuffer *inbuf) {
    GstPip *pip = GST_PIP(btrans);
    VPIStatus status;

    // Get batch metadata
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(inbuf);
    if (!batch_meta) {
        GST_ERROR_OBJECT(pip, "No batch metadata found");
        return GST_FLOW_ERROR;
    }

    // Find best object
    NvDsObjectMeta *obj_meta = find_central_object_with_highest_confidence(pip, batch_meta);
    if (!obj_meta)
        return GST_FLOW_OK;

    // Map the input buffer
    GstMapInfo in_map_info;
    if (!gst_buffer_map(inbuf, &in_map_info, GST_MAP_READ)) {
        GST_ERROR_OBJECT(pip, "Failed to map input buffer");
        return GST_FLOW_ERROR;
    }

    NvBufSurface *surface = (NvBufSurface *)in_map_info.data;

    // Calculate PIP dimensions (bottom-right 20% of the frame)
    int32_t pipWidth = surface->surfaceList[0].width * 0.2;
    int32_t pipHeight = surface->surfaceList[0].height * 0.2;
    int32_t pipPosX = surface->surfaceList[0].width - pipWidth;
    int32_t pipPosY = surface->surfaceList[0].height - pipHeight;

    // Create input image from CUDA memory
    // TODO: wrapping the image via an NvBuffer fd works on Tegra platforms ONLY!
    VPIImageData image_data = {};

    NvBufSurfaceMapParams nvbuf_surface_map_params;
    int result = NvBufSurfaceGetMapParams(surface, 0, &nvbuf_surface_map_params);
    assert(result == 0);
    assert(nvbuf_surface_map_params.fd != 0);

    image_data.bufferType = VPI_IMAGE_BUFFER_NVBUFFER;
    image_data.buffer.fd = nvbuf_surface_map_params.fd;

    status = vpiImageCreateWrapper(&image_data, nullptr, pip->backend, &pip->input_image);
    if (status != VPI_SUCCESS)
        goto error;

    // Synchronize the stream
    status = vpiStreamSync(pip->vpi_stream);
    if (status != VPI_SUCCESS)
        goto error;

    gst_buffer_unmap(inbuf, &in_map_info);
    return GST_FLOW_OK;

error:
    gst_buffer_unmap(inbuf, &in_map_info);
    return GST_FLOW_ERROR;
}
```

Stepping through with a debugger, the above works on the Jetson device but not on my x86 dGPU system. Looking at the documentation, it appears that I cannot use this approach on dGPU, per this:

The following buffer types are supported:
• VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR
• VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR
• VPI_IMAGE_BUFFER_EGLIMAGE (on Tegra platforms only)
• VPI_IMAGE_BUFFER_NVBUFFER (on Tegra platforms only)

It looks like I will need to determine whether the system is using a dGPU and then use either VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR or VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR. Does anyone have examples of how to do this properly?
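For detecting the platform at runtime, the best I have come up with so far is checking whether the CUDA device reports itself as integrated. This is only a sketch of my own idea, not something taken from the VPI or DeepStream samples, and the helper name is mine:

```cpp
// Sketch: distinguish an integrated (Tegra) GPU from a discrete one at runtime.
// The helper name and the fallback behaviour are my own choices.
#include <cuda_runtime.h>

static bool running_on_integrated_gpu(void)
{
    int dev = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&dev) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
        return false;             // assume dGPU if the query fails
    return prop.integrated != 0;  // non-zero on Tegra/iGPU systems
}
```

On Tegra I would keep the VPI_IMAGE_BUFFER_NVBUFFER path above; on dGPU I would fall back to one of the pitch-linear buffer types.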

Thanks

Theoretically it can work if you match the NvBufSurface and the VPIImageData correctly. There is no open-source sample for it.

You may try the following mapping:

| NvBufSurface.memType | VPIImageData.bufferType |
| --- | --- |
| NVBUF_MEM_CUDA_DEVICE | VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR |
| NVBUF_MEM_CUDA_UNIFIED | VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR |
| NVBUF_MEM_CUDA_PINNED | VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR |
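As a rough sketch of what that mapping could look like in code (not an official sample): this wraps batch index 0 of a single-plane RGBA surface without copying it out of GPU memory. The helper name wrap_surface_as_vpi_image() and the choice of VPI_IMAGE_FORMAT_RGBA8 are assumptions for illustration; the format must match how your surface was actually allocated.

```cpp
// Sketch only: wrap one NvBufSurface frame as a VPI image on x86 dGPU.
// Assumes a single-plane RGBA surface at batch index 0; the helper name is invented here.
#include <cstdint>
#include <cstring>
#include <nvbufsurface.h>
#include <vpi/Image.h>
#include <vpi/Status.h>

static VPIStatus wrap_surface_as_vpi_image(NvBufSurface *surface, VPIImage *out_img)
{
    NvBufSurfaceParams *p = &surface->surfaceList[0];

    VPIImageData data;
    memset(&data, 0, sizeof(data));

    // Pick the VPI buffer type from the DeepStream memory type (see table above).
    uint64_t backend = 0;
    switch (surface->memType) {
    case NVBUF_MEM_CUDA_DEVICE:
    case NVBUF_MEM_CUDA_UNIFIED:
        data.bufferType = VPI_IMAGE_BUFFER_CUDA_PITCH_LINEAR;
        backend = VPI_BACKEND_CUDA;
        break;
    case NVBUF_MEM_CUDA_PINNED:
        data.bufferType = VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR;
        backend = VPI_BACKEND_CPU;
        break;
    default:
        return VPI_ERROR_INVALID_ARGUMENT;  // other memTypes not handled in this sketch
    }

    // Describe the pitch-linear layout using the surface's own plane parameters.
    data.buffer.pitch.format    = VPI_IMAGE_FORMAT_RGBA8;  // must match p->colorFormat
    data.buffer.pitch.numPlanes = 1;

    VPIImagePlanePitchLinear *plane = &data.buffer.pitch.planes[0];
    plane->width      = p->width;
    plane->height     = p->height;
    plane->pitchBytes = p->planeParams.pitch[0];
    plane->data       = p->dataPtr;  // device pointer for CUDA_DEVICE/UNIFIED memory

    return vpiImageCreateWrapper(&data, nullptr, backend, out_img);
}
```

In your gst_pip_transform_ip() you would then feed the returned VPIImage to your VPI crop/rescale calls in place of the NVBUFFER path, and destroy the wrapper with vpiImageDestroy() once you are done with the frame.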