How to use nvcamerasrc with CUDA


I use OpenCV to get video frames from nvcamerasrc. The code is:

VideoCapture cap("nvcamerasrc sensor-id=0 name=camerasrc ! video/x-raw(memory:NVMM), format=UYVY, width=some_width, height=some_height, framerate=30/1 ! nvvidconv ! video/x-raw, format=BGRx, width=some_width, height=some_height ! videoconvert ! video/x-raw, format=BGR ! appsink name=appsink");
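Since some_width and some_height are placeholders, the pipeline description has to be assembled as a string before it is handed to VideoCapture. A minimal sketch of that step follows; the helper name make_nvcamerasrc_pipeline and its parameters are hypothetical and exist only for illustration, while the element names and caps are copied from the pipeline above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of OpenCV or GStreamer): fills the width,
// height, frame rate and sensor id into the pipeline text from the post.
std::string make_nvcamerasrc_pipeline(int width, int height,
                                      int fps = 30, int sensor_id = 0) {
    std::ostringstream p;
    // Camera source producing UYVY frames in NVMM (hardware) memory.
    p << "nvcamerasrc sensor-id=" << sensor_id << " name=camerasrc ! "
      << "video/x-raw(memory:NVMM), format=UYVY, width=" << width
      << ", height=" << height << ", framerate=" << fps << "/1 ! "
      // nvvidconv converts to BGRx in ordinary system memory.
      << "nvvidconv ! "
      << "video/x-raw, format=BGRx, width=" << width
      << ", height=" << height << " ! "
      // videoconvert drops the padding channel so OpenCV receives BGR.
      << "videoconvert ! video/x-raw, format=BGR ! "
      << "appsink name=appsink";
    return p.str();
}
```

The result would be passed on as cv::VideoCapture cap(make_nvcamerasrc_pipeline(1920, 1080)); which assumes an OpenCV build with GStreamer support. Frames captured this way still arrive in CPU memory, so this only tidies the capture call and does not remove the copies to and from CUDA memory.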

The nvcamerasrc filter stores the video in CPU memory. When I use CUDA to process a frame, I must first upload the frame to CUDA memory, then run the CUDA function, and finally download the frame from CUDA memory back to CPU memory.
Sending data to CUDA and getting it back costs a lot of time,
so I wonder whether the nvcamerasrc source filter can store the video in CUDA memory directly.

Hi BradleyY, you can make a plugin for the nvivafilter or gst-videocuda elements, which will deliver the GPU memory to your CUDA kernel with zero copy. Refer to the L4T Accelerated GStreamer User Guide for info about the plugins, and see this post to locate the source code example for building your own plugins:

Alternatively, here’s a C++ class that captures nvcamerasrc to CUDA memory using GStreamer appsink element:

Hi BradleyY,

You may find the following information about the GstCUDA framework interesting; I think it is exactly what you are looking for. A more detailed description follows, but in summary it is a framework for interfacing GStreamer with CUDA easily and efficiently, with zero memory copies. It also supports several inputs.

GstCUDA is a RidgeRun-developed GStreamer plug-in that enables easy integration of CUDA algorithms into GStreamer pipelines. It offers a framework that lets users develop custom GStreamer elements that execute any CUDA algorithm. The framework is a series of base classes that abstract the complexity of both CUDA and GStreamer. With GstCUDA, developers avoid writing elements from scratch and can focus on the algorithm logic, which shortens time to market.

GstCUDA offers a GStreamer plugin containing a set of elements suited to quick GStreamer/CUDA prototyping. These elements are filters with different combinations of input and output pads. Each one loads, at run time, an external custom CUDA library containing the algorithm to be executed on the GPU for every video frame that passes through the pipeline. Users develop their own CUDA processing library and pass it to the GstCUDA filter element that best fits the algorithm's requirements. The element then executes the library on the GPU, passing upstream frames from the GStreamer pipeline to the GPU and passing the modified frames downstream to the next element in the pipeline. The elements were created with the CUDA algorithm developer in mind: they support quick prototyping and abstract away all GStreamer concepts. They are fully adaptable to different project needs, which makes GstCUDA a powerful tool for CUDA/GStreamer project development.

One remarkable feature of GstCUDA is that it provides a zero-memory-copy interface between CUDA and GStreamer on Jetson TX1/TX2 platforms. This allows heavy algorithms and large amounts of data (up to 2x 4K 60fps streams) to be processed on CUDA without the performance penalty caused by copies or memory conversions. GstCUDA provides the necessary APIs to handle NVMM buffers directly, to achieve the best possible performance on Jetson TX1/TX2 platforms. It provides a series of base classes and utilities that abstract the complexity of handling the memory interface between GStreamer and CUDA, so the developer can focus on what actually gives value to the end product. GstCUDA ensures optimal performance for GStreamer/CUDA applications on Jetson platforms.

You can find detailed information about GstCUDA at the following link:

I hope this information is useful to you.

Best regards,