Overview of memory - driver, kernel, DMA, userspace, CUDA, zero-copy

user108834 · January 25, 2022, 9:36pm

Is there a conceptual description of Jetson Nano memory management, including driver/kernel, DMA memory, user space, CUDA memory access, zero-copy etc.?

I would like a good overview of where memory is stored, who allocates and deallocates it, who has access under what conditions, the constraints on zero-copy access, and so on. I haven't found one by searching yet.

Our specific C/C++ application would read frames from a USB camera via V4L2 (YUYV), likely do CUDA processing on each frame, forward it to the H.264 encoder, and send the resulting bitstream via our custom network code. So we'd like to understand the most efficient way to use Jetson Nano memory for the handoffs.
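As a rough illustration of what gets handed off at each stage, here is a minimal C++ sketch of the YUYV (4:2:2) layout: two pixels share four bytes in the order Y0 U Y1 V. The helper names (`yuyv_min_stride`, `yuyv_frame_bytes`, `yuyv_extract_luma`) are hypothetical and not part of any NVIDIA or V4L2 API, and the code runs on the CPU only; it says nothing about how the Jetson samples actually move buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimum bytes per row of packed YUYV: every 2 pixels share 4 bytes (Y0 U Y1 V).
// Assumption: width is even, as the 4:2:2 packing requires.
inline std::size_t yuyv_min_stride(std::size_t width) {
    assert(width % 2 == 0);
    return width * 2;
}

// Total bytes for one frame, given the row stride reported by the driver
// (bytesperline in V4L2 terms), which may be larger than the minimum.
inline std::size_t yuyv_frame_bytes(std::size_t stride, std::size_t height) {
    return stride * height;
}

// Copy only the luma (Y) samples of a YUYV frame into a tightly packed gray image.
inline std::vector<std::uint8_t> yuyv_extract_luma(const std::uint8_t* frame,
                                                   std::size_t width,
                                                   std::size_t height,
                                                   std::size_t stride) {
    assert(stride >= yuyv_min_stride(width));
    std::vector<std::uint8_t> luma(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = frame + y * stride;
        for (std::size_t x = 0; x < width; ++x) {
            luma[y * width + x] = row[2 * x];  // Y samples sit at even byte offsets
        }
    }
    return luma;
}
```

For example, a 640x480 YUYV frame with the minimum stride occupies 640 * 2 * 480 = 614,400 bytes, which is the amount that would be copied at every stage that cannot share the buffer.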

Thanks in advance for your help!

DaneLLL · January 26, 2022, 4:19am

Hi,
For this use case we suggest running the 12_camera_v4l2_cuda sample and checking memory usage with top or sudo tegrastats. The sample demonstrates capturing frame data directly into an NvBuffer (DMA buffer).

The sample shows a camera preview. For video encoding, please apply this patch and give it a try:
TX2 Camera convert/encode using Multimedia API issue - #17 by DaneLLL

user108834 · January 26, 2022, 6:25pm

Thank you for your reply!

I agree regarding 12_camera_v4l2_cuda, and have already been looking at that sample.

Regarding "DMA buffer" and so on, is there any conceptual overview of the different types of memory, how memory gets pinned and by whom, which kinds of memory are accessible from which levels (kernel, userspace, GPU), and the memory lifecycle (who allocates and deallocates)?

I've seen forum posts where zero-copy memory (which we'd expect to be more efficient because there's no copying) results in lower performance because zero-copy disables CPU and GPU caching. So, for designing our solution (which will include code similar to 12_camera_v4l2_cuda), we need a thorough understanding of the NVIDIA memory options available and how to use them properly.

Is there any chance there's an overview writeup in an introduction to NVIDIA architecture somewhere?

I did find this document, which offers some distant clues, though I'm not sure whether the architecture it describes applies to the Nano: https://docs.nvidia.com/cuda/gpudirect-rdma/index.html#basics-of-uva-cuda-memory-management

DaneLLL · January 27, 2022, 2:35am

Hi,
That document is for desktop GPUs and may not apply to Jetson Nano. The general use case is a USB camera, which uses the USB Video Class (UVC) driver to capture frames through V4L2. 12_camera_v4l2_cuda is the optimal solution on Jetson platforms. Please give it a try.

user108834 · January 27, 2022, 2:43am

Thank you very much!

user108834 · January 29, 2022, 1:10am

As a partial answer to my original question about an overview of V4L2 memory architecture and concepts, the link below from the Linux kernel documentation may be a useful starting point. It covers generic V4L2 without NVIDIA refinements.

https://docs.kernel.org/userspace-api/media/v4l/io.html
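To make the linked page a little more concrete: generic V4L2 streaming I/O distinguishes buffers by who provides the backing memory, and the choice is passed to the driver in the `memory` field of the `VIDIOC_REQBUFS` request. The sketch below only fills in that request structure from `<linux/videodev2.h>`; it does not open a device or issue any ioctl. The helper names `v4l2_memory_owner` and `make_capture_request` are hypothetical, and nothing here reflects NVIDIA-specific behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <linux/videodev2.h>

// Who provides the backing memory for each generic V4L2 streaming I/O method.
inline const char* v4l2_memory_owner(std::uint32_t memory) {
    switch (memory) {
        case V4L2_MEMORY_MMAP:
            return "driver allocates; application maps the buffers with mmap()";
        case V4L2_MEMORY_USERPTR:
            return "application allocates; driver uses the user pointer";
        case V4L2_MEMORY_DMABUF:
            return "exporter allocates; application passes a DMABUF file descriptor";
        default:
            return "not a streaming capture method covered here";
    }
}

// Build the argument that would be passed to ioctl(fd, VIDIOC_REQBUFS, &req)
// for a single-planar capture queue. No ioctl is issued in this sketch.
inline v4l2_requestbuffers make_capture_request(std::uint32_t count,
                                                std::uint32_t memory) {
    assert(count > 0);
    v4l2_requestbuffers req{};  // zero-initialise reserved fields
    req.count = count;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = memory;
    return req;
}
```

Calling `make_capture_request(4, V4L2_MEMORY_DMABUF)` describes a request for four capture buffers that are imported from another allocator rather than allocated by the camera driver, which is the generic mechanism for sharing a buffer between devices without copying.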

