Memory System Questions

huzi · October 31, 2019, 7:33am

Hello,

I had a few questions about the Xavier SoC.

1. Is it a unified shared memory system? That is, can data be passed from the CPU to the GPU without any copies at all?
2. Are the CPU and GPU coherent with each other? That is, are cache flushes required when CPU data is accessed by the GPU and vice versa?
3. Do the CPU and GPU share a last-level cache?

Thank you!

AastaLLL · November 4, 2019, 9:07am

Hi,

1.
The physical memory is shared by the CPU and GPU, so data can be shared between them.
Please note that the CPU and GPU each have their own memory addresses, so a special type of allocation is required for this feature.
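
For illustration, here is a minimal sketch of one allocation type that lets a kernel work on CPU-visible memory without a `cudaMemcpy`: pinned, mapped host memory. The reply does not say which allocation type it means, so this choice is an assumption; the names `scale` and `run_zero_copy` are made up for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Doubles every element in place. `data` is a device-side alias of pinned
// host memory, so no explicit copy is involved.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or wrong result.
int run_zero_copy(int n) {
    float *host_ptr = nullptr;
    float *dev_ptr = nullptr;

    // Allocate pinned host memory that is also mapped into the GPU's address space.
    if (cudaHostAlloc((void **)&host_ptr, n * sizeof(float),
                      cudaHostAllocMapped) != cudaSuccess) {
        return 1;
    }
    // Get the address the GPU should use for the same physical buffer.
    if (cudaHostGetDevicePointer((void **)&dev_ptr, host_ptr, 0) != cudaSuccess) {
        cudaFreeHost(host_ptr);
        return 1;
    }

    for (int i = 0; i < n; ++i) host_ptr[i] = (float)i;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(dev_ptr, n);

    // Wait for the kernel to finish before the CPU reads the results.
    int status = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : 1;

    for (int i = 0; i < n && status == 0; ++i) {
        if (host_ptr[i] != 2.0f * (float)i) status = 1;
    }

    cudaFreeHost(host_ptr);
    return status;
}

int main() {
    int status = run_zero_copy(1024);
    printf("zero-copy test %s\n", status == 0 ? "passed" : "failed");
    return status;
}
```

The CPU writes through `host_ptr` and the kernel reads and writes through `dev_ptr`; both refer to the same buffer. How the caches are handled for such a buffer depends on the device, so the sketch only relies on synchronizing before the CPU reads.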

2.
Xavier has new hardware for I/O coherency.
It allows the GPU to snoop the CPU cache, but the CPU cannot snoop the device’s cache.

3.
Our L3 cache is configured to be shared across CPU clusters.

Thanks.

huzi · November 6, 2019, 6:23pm

Thank you for the info! I had a few followup questions.

1. So there would be no memcopies, but there would be an address translation and potentially page faults?

2. If I understand this correctly, the CPU caches are not flushed upon kernel launch, and when the GPU accesses that cached data, a coherence mechanism gets it from the CPU’s cache. However, upon kernel finish, the GPU’s caches are flushed. Is this correct?
3. Tying in with the above, the GPU cannot access the L3, but if there is data in the L3, the new I/O coherence mechanism will get it for the GPU from the L3?

AastaLLL · November 19, 2019, 7:23am

Hi,

1. CPU memory can be shared with the GPU after EGL mapping. However, cache handling varies by device and GPU generation.
Please check this page for the support matrix of cache:
https://docs.nvidia.com/cuda/cuda-for-tegra-appnote/index.html#memory-management
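
As a rough illustration of another allocation type, here is a minimal sketch using unified (managed) memory, where a single pointer is valid on both the CPU and the GPU. It is not the EGL mapping mentioned above, and the names `add_one` and `run_managed` are made up for the example. The sketch assumes the conservative rule that the CPU does not touch the buffer while a kernel may still be using it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Adds 1 to every element of a managed buffer.
__global__ void add_one(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: returns 0 on success, 1 on a CUDA error or wrong result.
int run_managed(int n) {
    int *data = nullptr;

    // One allocation, one pointer, usable from both CPU and GPU code.
    if (cudaMallocManaged((void **)&data, n * sizeof(int)) != cudaSuccess) {
        return 1;
    }

    // CPU initializes the buffer before any kernel is launched.
    for (int i = 0; i < n; ++i) data[i] = i;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add_one<<<blocks, threads>>>(data, n);

    // Synchronize before the CPU touches the buffer again.
    int status = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : 1;

    for (int i = 0; i < n && status == 0; ++i) {
        if (data[i] != i + 1) status = 1;
    }

    cudaFree(data);
    return status;
}

int main() {
    int status = run_managed(1024);
    printf("managed-memory test %s\n", status == 0 ? "passed" : "failed");
    return status;
}
```

No explicit copy appears in the code; any cache maintenance or migration is left to the driver, and the page linked above describes what that means on each device.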

2. The CPU and GPU each have their own caches. Caches are flushed on demand.

3. No. Currently, the L3 cache is accessible only to the CPU clusters.

Thanks.