Unified Memory / Page Fault Profiling on Turing

Hello everyone,

Is there a way to profile unified memory migrations/copies on a Turing GPU?

I used nvprof for a while on Pascal GPUs, but it won’t work on a Turing one. Moreover, I can’t find any metrics for this in Nsight Compute.

Best,

Nsight Systems has a unified memory trace:

https://docs.nvidia.com/nsight-systems/index.html#nsight_systems/2018.3.0-x86/11-cuda-trace.htm%3FTocPath%3D_____11
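
If you want something small to check the trace against, here is a minimal sketch of a managed-memory program that should produce migrations in both directions. It assumes a Linux system where the GPU does on-demand page migration; the kernel name `increment` and the buffer size are just placeholders I picked for illustration. Recent command-line versions of Nsight Systems appear to expose this trace through options such as `--cuda-um-cpu-page-faults` and `--cuda-um-gpu-page-faults`, but check `nsys profile --help` for your version, since I am not sure which releases have them.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: increments every element.
// The first GPU access to each managed page is expected to fault
// and migrate that page from host to device.
__global__ void increment(int *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const size_t n = 1 << 24;  // 16M ints (64 MiB), arbitrary size
    int *data = nullptr;

    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    // First touch on the CPU, so the pages start out resident on the host.
    for (size_t i = 0; i < n; ++i) data[i] = 0;

    // GPU touches every page: host-to-device migrations expected here.
    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    increment<<<blocks, threads>>>(data, n);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(data);
        return 1;
    }

    // CPU reads the data back: CPU page faults and device-to-host
    // migrations expected here.
    long long sum = 0;
    for (size_t i = 0; i < n; ++i) sum += data[i];
    printf("sum = %lld (expected %zu)\n", sum, n);

    cudaFree(data);
    return sum == (long long)n ? 0 : 1;
}
```

Build it with `nvcc`, run it under Nsight Systems with the unified memory trace enabled, and the timeline should show the transfers around the kernel launch and the final CPU loop.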