Hi all,
I’m working on an Orin Nano with JetPack 5.1.1 [L4T 35.3.1].
In my code, I have a cudaMemcpy2DAsync call with the flag cudaMemcpyDeviceToDevice set. However, when I profile the execution with $ nsys profile ..., this memcpy is reported as a DeviceToHost copy (device to pinned memory).
I couldn’t create a minimal reproducible example, but does anyone have an idea of what might be happening?
Could it be that Nsight is misinterpreting the cudaMemcpy?
Or is NVCC perhaps doing something unexpected with Jetson’s unified memory?
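One check I can think of is asking the runtime how it classifies each pointer with cudaPointerGetAttributes right before the copy, in case the destination is really pinned host memory. This is only a rough sketch and not my actual code: the helper names, buffer names, and sizes are placeholders, and the pinned buffer here just stands in for a suspect destination.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: map the runtime's memory type to a readable name.
static const char* memoryTypeName(cudaMemoryType t) {
    switch (t) {
        case cudaMemoryTypeUnregistered: return "unregistered host";
        case cudaMemoryTypeHost:         return "pinned host";
        case cudaMemoryTypeDevice:       return "device";
        case cudaMemoryTypeManaged:      return "managed";
        default:                         return "unknown";
    }
}

// Hypothetical helper: print how the CUDA runtime classifies a pointer.
static void describePointer(const char* label, const void* ptr) {
    cudaPointerAttributes attr{};
    cudaError_t err = cudaPointerGetAttributes(&attr, ptr);
    if (err != cudaSuccess) {
        std::printf("%s: query failed: %s\n", label, cudaGetErrorString(err));
        return;
    }
    std::printf("%s: %s memory (device %d)\n",
                label, memoryTypeName(attr.type), attr.device);
}

int main() {
    // Placeholder sizes, not taken from the real application.
    const size_t widthBytes = 256, height = 64;

    void* src = nullptr;
    void* dst = nullptr;
    size_t srcPitch = 0;

    // Pitched device allocation, as typically used with cudaMemcpy2DAsync.
    if (cudaMallocPitch(&src, &srcPitch, widthBytes, height) != cudaSuccess) return 1;
    // Pinned host allocation standing in for a buffer that might not be device memory.
    if (cudaMallocHost(&dst, widthBytes * height) != cudaSuccess) return 1;

    describePointer("src", src);
    describePointer("dst", dst);

    cudaFree(src);
    cudaFreeHost(dst);
    return 0;
}
```

In the real code I would call describePointer on the actual source and destination pointers just before the cudaMemcpy2DAsync, to compare what the runtime reports with what nsys shows.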
Thank you in advance,