On a Jetson kit, when we allocate pinned memory to speed up H2D/D2H transfers, we observe that host-side access to that memory becomes slower.
That is, if I allocate memory with malloc and run a simple computation such as vector addition on the CPU, it takes less time than running the same computation on buffers allocated as pinned memory.
The slowdown is as large as 2x. To gain speed on the GPU side we end up increasing host-side time. This holds even for small allocations of a few KB, which should rule out the OS running short of pages as the cause.
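For reference, here is a minimal sketch of the kind of comparison I mean. It is not my exact test code: the buffer size (1024 floats, about 4 KB per buffer), the repetition count, and the helper names time_vector_add and fill are chosen only for illustration, and I am assuming cudaMallocHost is the pinned allocator being used.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: time `reps` passes of c[i] = a[i] + b[i] on the CPU.
// Returns elapsed wall-clock time in milliseconds.
static double time_vector_add(const float* a, const float* b, float* c,
                              size_t n, int reps) {
    assert(n > 0);
    volatile float sink = 0.0f;  // read a result each pass so the loop is not dropped
    auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        for (size_t i = 0; i < n; ++i) {
            c[i] = a[i] + b[i];
        }
        sink = sink + c[n - 1];
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Hypothetical helper: fill both inputs with simple known values.
static void fill(float* a, float* b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = 2.0f * static_cast<float>(i);
    }
}

int main() {
    const size_t n = 1024;       // assumed size: a few KB per buffer
    const int reps = 100000;     // assumed repetition count
    const size_t bytes = n * sizeof(float);

    // Pageable host memory from malloc.
    float* ha = static_cast<float*>(malloc(bytes));
    float* hb = static_cast<float*>(malloc(bytes));
    float* hc = static_cast<float*>(malloc(bytes));
    if (!ha || !hb || !hc) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }

    // Pinned (page-locked) host memory from the CUDA runtime.
    float *pa = nullptr, *pb = nullptr, *pc = nullptr;
    if (cudaMallocHost(reinterpret_cast<void**>(&pa), bytes) != cudaSuccess ||
        cudaMallocHost(reinterpret_cast<void**>(&pb), bytes) != cudaSuccess ||
        cudaMallocHost(reinterpret_cast<void**>(&pc), bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMallocHost failed\n");
        return 1;
    }

    fill(ha, hb, n);
    fill(pa, pb, n);

    double pageable_ms = time_vector_add(ha, hb, hc, n, reps);
    double pinned_ms = time_vector_add(pa, pb, pc, n, reps);

    // Both runs must produce the same result.
    assert(hc[n - 1] == pc[n - 1]);

    printf("malloc buffers: %.3f ms\n", pageable_ms);
    printf("pinned buffers: %.3f ms\n", pinned_ms);

    cudaFreeHost(pa);
    cudaFreeHost(pb);
    cudaFreeHost(pc);
    free(ha);
    free(hb);
    free(hc);
    return 0;
}
```

Building with something like nvcc -O2 and running it on the board prints the two CPU-side timings next to each other, so anyone can check whether they see the same gap.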
Has anyone observed this behavior?