Hi,
Does anyone know how to speed up a memcpy from an ordinary buffer into pinned memory?
Here is what my code does:
- A frame arrives as a uchar buffer.
- memcpy from that buffer into a pinned buffer (allocated with cudaMallocHost) <- slow (200 ms per 25 MB frame buffer)
- cudaMemcpy from the pinned buffer to the device <- fast
- CUDA operations <- fast
…
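
The steps above can be sketched roughly as follows. This is a minimal sketch for timing the copies, not the original code: the names run_frame_pipeline, timed_memcpy_ms and USE_CUDA are made up for illustration, the frame is a dummy buffer, and the CUDA operations step is left out. It also times a copy into an ordinary heap buffer as a baseline, which should show whether only the pinned destination is slow. When built with nvcc as a .cu file (C++11 enabled) it uses cudaMallocHost and cudaMemcpy; with a plain C++ compiler it falls back to malloc so the timing harness still runs.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

#ifdef __CUDACC__            // defined when the file is compiled by nvcc
#include <cuda_runtime.h>
#define USE_CUDA 1           // hypothetical macro, used only in this sketch
#endif

// Do one untimed warm-up copy (so first-touch page faults are not measured),
// then time a second memcpy and return the elapsed milliseconds.
static double timed_memcpy_ms(void* dst, const void* src, size_t bytes) {
    std::memcpy(dst, src, bytes);
    auto t0 = std::chrono::steady_clock::now();
    std::memcpy(dst, src, bytes);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Runs the copy steps for one frame of frame_bytes bytes (must be > 0).
// Returns true if every step succeeded and the pinned copy matches the frame.
bool run_frame_pipeline(size_t frame_bytes) {
    std::vector<unsigned char> frame(frame_bytes, 0x5A);  // stand-in for the incoming frame
    std::vector<unsigned char> heap(frame_bytes);         // ordinary pageable buffer (baseline)

    // Baseline: frame -> ordinary heap buffer.
    std::printf("heap   memcpy: %.3f ms\n",
                timed_memcpy_ms(heap.data(), frame.data(), frame_bytes));

    // Allocate the "pinned" buffer (plain malloc when built without nvcc).
    unsigned char* pinned = nullptr;
#ifdef USE_CUDA
    if (cudaMallocHost((void**)&pinned, frame_bytes) != cudaSuccess) return false;
#else
    pinned = static_cast<unsigned char*>(std::malloc(frame_bytes));
    if (!pinned) return false;
#endif

    // The step reported as slow: frame -> pinned buffer.
    std::printf("pinned memcpy: %.3f ms\n",
                timed_memcpy_ms(pinned, frame.data(), frame_bytes));
    bool ok = std::memcmp(pinned, frame.data(), frame_bytes) == 0;

#ifdef USE_CUDA
    // The step reported as fast: pinned buffer -> device.
    unsigned char* dev = nullptr;
    if (cudaMalloc((void**)&dev, frame_bytes) == cudaSuccess) {
        ok = ok && cudaMemcpy(dev, pinned, frame_bytes,
                              cudaMemcpyHostToDevice) == cudaSuccess;
        cudaFree(dev);
    } else {
        ok = false;
    }
    cudaFreeHost(pinned);
#else
    std::free(pinned);
#endif
    return ok;
}
```

Calling run_frame_pipeline(25 * 1024 * 1024) approximates the 25 MB frame size mentioned above.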
Ubuntu 14, Tegra3, CUDA 6.5
Best regards, Viktor.