You could try these examples to see one way of using unified memory:
I don’t know for sure, but my understanding so far is that memory meant to be shared has to be allocated with a special allocator, so the allocation should be done beforehand.
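For what it’s worth, here is a minimal sketch of that idea, assuming plain CUDA managed memory via cudaMallocManaged (the buffer size and names are just placeholders):

#include <cstdio>
#include "cuda_runtime.h"

int main() {
    float* buf = nullptr;
    const size_t bytes = 1024 * sizeof(float);

    /* Managed allocation: the same pointer is valid on host and device */
    cudaError_t err = cudaMallocManaged(&buf, bytes);
    if (err != cudaSuccess) {
        std::printf("cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    buf[0] = 42.0f;   /* plain CPU write, no cudaMemcpy needed */
    cudaFree(buf);
    return 0;
}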
I don’t know of any way to test whether a buffer has been allocated in unified memory. There may be ways I’m not aware of, but I think the probability of running into an unexpected unified memory buffer in your code is very low, so is it even worth checking?
I’d also suggest the following structure for your example:
//
// SimpleTestGPU.…
Be aware that the CUDA machinery can take a long time to set up the first time, up to a few seconds.
Put the call in a loop and you’ll probably find that the subsequent convolutions are much faster.
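For example, a sketch of that kind of loop (assuming the cudaarithm module; the sizes and the cv::cuda::add call are arbitrary, any GPU operation would show the same effect):

#include <cstdio>
#include "opencv2/core.hpp"
#include "opencv2/cudaarithm.hpp"

int main() {
    for (int i = 0; i < 5; ++i) {
        cv::TickMeter tm;
        tm.start();
        /* the first iteration also pays the one-time CUDA setup cost */
        cv::cuda::GpuMat src(1024, 1024, CV_32F, cv::Scalar(1.0f));
        cv::cuda::GpuMat dst;
        cv::cuda::add(src, cv::Scalar(1.0f), dst);
        cv::cuda::Stream::Null().waitForCompletion();
        tm.stop();
        std::printf("iteration %d: %.3f ms\n", i, tm.getTimeMilli());
    }
    return 0;
}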
[EDIT: Just checked now with this code:
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include "cuda_runtime.h"
#include "opencv2/core.hpp"
#include "opencv2/cudaarithm.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
int main() {
    /* Convolution kernel in unified memory */
    const int kern_h…
In short, first allocate unified memory. You can then use the same address for both CPU and GPU processing, for example reading data into it from the CPU and then transforming it on the GPU.
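A minimal sketch of that flow, assuming cudaMallocManaged for the allocation and the cudaarithm module for the GPU step (all names and sizes are illustrative):

#include <cstdio>
#include "cuda_runtime.h"
#include "opencv2/core.hpp"
#include "opencv2/cudaarithm.hpp"

int main() {
    const int rows = 4, cols = 4;
    float *src = nullptr, *dst = nullptr;

    /* 1) Allocate unified (managed) memory before any CPU/GPU use */
    cudaMallocManaged(&src, rows * cols * sizeof(float));
    cudaMallocManaged(&dst, rows * cols * sizeof(float));

    /* 2) The same pointers can back a cv::Mat (CPU view)... */
    cv::Mat cpuSrc(rows, cols, CV_32F, src);
    cv::Mat cpuDst(rows, cols, CV_32F, dst);
    cpuSrc.setTo(cv::Scalar(1.0f));                     /* read from CPU into it */

    /* 3) ...and a cv::cuda::GpuMat (GPU view), with no explicit copy */
    cv::cuda::GpuMat gpuSrc(rows, cols, CV_32F, src);
    cv::cuda::GpuMat gpuDst(rows, cols, CV_32F, dst);
    cv::cuda::add(gpuSrc, cv::Scalar(10.0), gpuDst);    /* transform it on the GPU */

    /* 4) Synchronize before touching the result from the CPU again */
    cudaDeviceSynchronize();
    std::printf("dst(0,0) = %f\n", cpuDst.at<float>(0, 0));

    cudaFree(src);
    cudaFree(dst);
    return 0;
}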