Hi folks,
I’m trying to use nsys to track page faults, and I’d like to find more detailed information, such as why each page fault was triggered (e.g., by prefetch or memadvise, …). But nsys only gives me general information, such as total faults, time, address, etc. Is there anything I can do to get the detailed info? Thanks!!
Best,
@jasoncohen to respond more fully
I assume you mean GPU page faults?
Are you using unified memory?
@santiscowgl, sorry, the trace system we have through our kernel-mode driver for unified memory events doesn’t contain any “reason” information. That said, I am not sure a prefetch would ever trigger a page-fault event.

Keep in mind that faults are different from transfer events. A prefetch will definitely cause a transfer if the memory is not currently owned by the GPU you’re prefetching it to, but I wouldn’t expect any faults there. Faults should only occur when the CPU or GPU executes a load or store instruction on an address in a unified memory region that is currently owned by a different device, i.e. the GPU owns it and the CPU tries to read from it, or vice versa. The fault is then handled by the unified memory driver, which initiates a transfer of the memory to the device where the fault occurred, and once the transfer finishes, the faulting instruction is retried.

So while unified memory faults will basically always cause transfers, the inverse is not true: transfers aren’t only caused by faults.
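If you want to see the fault/transfer distinction in a trace yourself, here is a minimal sketch rather than anything official. The kernel name `increment`, the `CHECK` macro, and the buffer size are made up for illustration. It assumes a Pascal-or-newer GPU on Linux (where the GPU can actually fault on managed memory) and a CUDA 12.x or earlier toolkit, where `cudaMemPrefetchAsync` takes an `int` destination device; newer toolkits changed that signature, so adjust accordingly. Phase 1 touches managed memory without prefetching, so I would expect faults plus transfers; phase 2 prefetches first, so I would expect transfers with few or no faults.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message on any CUDA runtime error.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s failed: %s\n", #call,              \
                         cudaGetErrorString(err_));                     \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

// Hypothetical kernel: each thread loads and stores one element.
__global__ void increment(int *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Read every element on the CPU so the host actually touches each page.
static long long host_sum(const int *data, size_t n) {
    long long sum = 0;
    for (size_t i = 0; i < n; ++i) sum += data[i];
    return sum;
}

int main() {
    const size_t n = (size_t)1 << 24;      // 64 MiB of ints (arbitrary size)
    const size_t bytes = n * sizeof(int);
    const int device = 0;
    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    CHECK(cudaSetDevice(device));

    // Assumption: without concurrent managed access (pre-Pascal, Windows),
    // the GPU does not demand-fault, so the comparison below is not meaningful.
    int concurrent = 0;
    CHECK(cudaDeviceGetAttribute(&concurrent,
                                 cudaDevAttrConcurrentManagedAccess, device));
    if (!concurrent)
        std::printf("note: no concurrent managed access on this device\n");

    int *data = nullptr;
    CHECK(cudaMallocManaged(&data, bytes));
    for (size_t i = 0; i < n; ++i) data[i] = 0;   // first touch on the CPU

    // Phase 1: no prefetch. The kernel accesses CPU-resident pages, so GPU
    // page faults and host-to-device transfers are expected here.
    increment<<<blocks, threads>>>(data, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    // The CPU now reads GPU-resident pages: CPU faults and device-to-host
    // transfers are expected.
    std::printf("phase 1 sum: %lld\n", host_sum(data, n));

    // Phase 2: prefetch before each access. Transfers should still appear,
    // but they are requested explicitly instead of being triggered by faults.
    // Assumption: pre-CUDA-13 signature (ptr, bytes, int dstDevice, stream).
    CHECK(cudaMemPrefetchAsync(data, bytes, device, 0));
    increment<<<blocks, threads>>>(data, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemPrefetchAsync(data, bytes, cudaCpuDeviceId, 0));
    CHECK(cudaDeviceSynchronize());
    std::printf("phase 2 sum: %lld\n", host_sum(data, n));

    CHECK(cudaFree(data));
    return 0;
}
```

To compare the two phases, something like `nsys profile --cuda-um-cpu-page-faults=true --cuda-um-gpu-page-faults=true ./a.out` should work; those switches exist in the nsys versions I'm aware of, but check `nsys profile --help` for yours. The trace still won't tell you the "reason" for any individual event; the phases just let you line up faults and transfers by time.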
Hope that helps a bit!