Memory questions about TX2

I found that zero-copy and unified memory on TX2 behave very differently than on a regular desktop computer. Where can I learn the details of zero-copy and unified memory on TX2? Thanks!

Hi,

You can start with this tutorial:
https://docs.nvidia.com/cuda/cuda-for-tegra-appnote/index.html#memory-management

Thanks.

Thanks, I have seen it, but I still have two questions.
1. I tested the TX2 bandwidth. With pageable memory, the DtoH and HtoD bandwidths differ, but with pinned memory they are the same. Why?
2. How should I choose between zero-copy and unified memory? I tried three kinds of memory in vectorAdd, and the results show that unified memory is fastest, but when processing multiple frames of data, zero-copy is fastest. What is the recommended way to use these memory types?

Hi,

First, have you maximized the CPU/GPU clocks?

sudo ./jetson_clocks.sh

Try to choose a memory type based on your use case:

1. If you will read/write the memory frequently, it's recommended to use unified memory.
Our GPU driver will take responsibility for the synchronization, which is more efficient.

2. If you just read the image once per frame, you can try zero-copy memory.
This saves you the overhead of enabling unified memory.
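
To make the two options above concrete, here is a minimal sketch of how each allocation is typically set up with the CUDA runtime API. The kernel name scale, the buffer size, and the scale factors are made up for illustration and do not come from the original vectorAdd test. Treat it as a starting point for your own measurements, not as a reference implementation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to touch the buffers from the GPU.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;                 // assumed buffer size
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Allow pinned host allocations to be mapped into the GPU address space.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Option 1: unified (managed) memory. One pointer is valid on CPU and GPU.
    float *managed = nullptr;
    cudaMallocManaged((void **)&managed, bytes);
    for (int i = 0; i < n; ++i) managed[i] = 1.0f;
    scale<<<blocks, threads>>>(managed, n, 2.0f);
    // Synchronize before the CPU touches managed memory again.
    cudaDeviceSynchronize();

    // Option 2: zero-copy. Pinned host memory mapped for direct GPU access.
    float *hostPtr = nullptr;
    float *devPtr = nullptr;
    cudaHostAlloc((void **)&hostPtr, bytes, cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&devPtr, hostPtr, 0);
    for (int i = 0; i < n; ++i) hostPtr[i] = 1.0f;
    scale<<<blocks, threads>>>(devPtr, n, 3.0f);
    cudaDeviceSynchronize();

    printf("managed[0] = %f, zero-copy[0] = %f\n", managed[0], hostPtr[0]);

    cudaFree(managed);
    cudaFreeHost(hostPtr);
    return 0;
}
```

Neither path needs an explicit cudaMemcpy, so the difference you measure comes from how each buffer is cached and synchronized. Timing both variants against your real per-frame access pattern is the most reliable way to choose.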

Here is another experiment for your reference:
https://devblogs.nvidia.com/maximizing-unified-memory-performance-cuda/

Thanks.

Thank you very much!