What type of memory does BufferStream use?

I have some questions about BufferStream.

  1. It seems that BufferStream needs the application to manage the buffer (allocate and release it). What type of memory can I use: CPU, GPU, or unified memory?

  2. Is there any memory copy involved if I use BufferStream?

  3. How is the image data transferred from the camera device to the user?

Could you please give more detail about BufferStream?

For example, the “eglimage” sample code creates a “NativeBuffer”.

  1. Is that buffer in CPU memory?
  2. In this case, will there be a memory copy in the stream?
  3. Is there a way to allocate a unified “NativeBuffer” that can be directly accessed by both the CPU and GPU, and use it in a BufferStream?
  4. When the image data is available, can the CPU and/or GPU directly access this buffer, or do I need to map it in some way?

Thanks a lot!

I would suggest checking the cudaBayerDemosaic sample for your case.

Thanks

  1. From the cudaBayerDemosaic sample, it seems the sample code creates GPU device memory buffers, uses them to hold the output of the CUDA kernel, and then sends them to the EGL display. From the output of the EGLStream, it seems the stream output can be mapped and used directly by a CUDA kernel. Am I correct?

  2. But for the NativeBuffer, it seems the default buffer type is NVBUF_MEM_SURFACE_ARRAY, which is not CUDA device memory, CUDA pinned memory, or CUDA unified memory. Can both the CPU and GPU directly access this type of memory (maybe through some kind of mapping)?

  3. Also, is it possible to allocate some unified memory myself, convert it into an “NvBufSurface”, and use it in a BufferStream?

Any help with the above questions is really appreciated.

Hi,

NvBuffer is a CPU buffer, but the GPU can access it via EGL mapping.
https://docs.nvidia.com/jetson/l4t-multimedia/classNvBuffer.html
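
Roughly, the EGL mapping path looks like the sketch below. This is only an outline: it assumes you already have an EGLImageKHR for the buffer (for example from NvEGLImageFromFd on older releases, or from NvBufSurfaceMapEglImage on r35.x), that a CUDA context is current, and all error checking is omitted.

```cpp
// Sketch only: make a hardware buffer visible to CUDA through an EGLImage.
// The buffer itself is not copied; CUDA just maps the same memory.
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <cuda.h>
#include <cudaEGL.h>

CUeglFrame mapEglImageToCuda(EGLImageKHR eglImage, CUgraphicsResource *resource)
{
    // Register the EGLImage with CUDA (requires a current CUDA context).
    cuGraphicsEGLRegisterImage(resource, eglImage,
                               CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);

    // Retrieve the mapped frame. Depending on the buffer layout,
    // eglFrame.frame.pPitch[0] (pitch-linear) or eglFrame.frame.pArray[0]
    // (block-linear) is what a CUDA kernel can work on.
    CUeglFrame eglFrame;
    cuGraphicsResourceGetMappedEglFrame(&eglFrame, *resource, 0, 0);
    return eglFrame;
}

// After launching your kernel on the mapped frame:
//   cuCtxSynchronize();
//   cuGraphicsUnregisterResource(*resource);
//   then unmap/destroy the EGLImage with the matching call for your release.
```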

In our MMAPI 18_v4l2_camera_cuda_rgb sample, the application uses V4L2 user-space buffers:

https://docs.nvidia.com/jetson/archives/r35.4.1/ApiReference/l4t_mm_18_v4l2_camera_cuda_rgb.html

The application allocates V4L2 user-space buffers (V4L2_MEMORY_USERPTR), so the driver fills the user-space memory directly. If you allocate that memory as CUDA-mappable memory (with cudaHostAlloc), the CUDA device can access the captured buffer without a memory copy.

So you should be able to preallocate a buffer (unified if you want) and feed it into the V4L2 pipeline.
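
A rough sketch of that flow is below. The device node, buffer count, and frame size are placeholders, the format setup (VIDIOC_S_FMT) and all error checking are omitted, and the kernel call is hypothetical.

```cpp
// Sketch only: queue CUDA-mappable host memory to V4L2 as USERPTR buffers,
// so the capture driver fills memory the GPU can also see (no extra copy).
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>
#include <cuda_runtime.h>

int main()
{
    int fd = open("/dev/video0", O_RDWR);    // placeholder device node

    struct v4l2_requestbuffers req = {};
    req.count  = 4;
    req.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_USERPTR;        // user-space buffers
    ioctl(fd, VIDIOC_REQBUFS, &req);

    size_t bufSize = 1920 * 1080 * 2;        // placeholder; take it from VIDIOC_G_FMT
    void  *hostPtr[4];
    for (unsigned i = 0; i < req.count; ++i) {
        // Pinned, CUDA-mappable allocation: the memory the driver fills
        // is the same memory a CUDA kernel can read.
        cudaHostAlloc(&hostPtr[i], bufSize, cudaHostAllocMapped);

        struct v4l2_buffer qbuf = {};
        qbuf.index     = i;
        qbuf.type      = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        qbuf.memory    = V4L2_MEMORY_USERPTR;
        qbuf.m.userptr = (unsigned long)hostPtr[i];
        qbuf.length    = bufSize;
        ioctl(fd, VIDIOC_QBUF, &qbuf);
    }

    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    ioctl(fd, VIDIOC_STREAMON, &type);

    // Capture loop (one iteration shown): dequeue a filled buffer, get its
    // device-side pointer, and hand it to a CUDA kernel.
    struct v4l2_buffer dqbuf = {};
    dqbuf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    dqbuf.memory = V4L2_MEMORY_USERPTR;
    ioctl(fd, VIDIOC_DQBUF, &dqbuf);

    void *devPtr = nullptr;
    cudaHostGetDevicePointer(&devPtr, (void *)dqbuf.m.userptr, 0);
    // myKernel<<<grid, block>>>(devPtr, ...);   // hypothetical kernel
    // then VIDIOC_QBUF the buffer again to keep capturing.

    return 0;
}
```

Whether you use pinned (cudaHostAlloc) or managed (cudaMallocManaged) memory, the idea is the same: the pointer you queue to V4L2 is one the GPU can also address, so no extra copy is needed.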

Thanks.
