I have been trying to profile my application (which uses managed memory) with the following command:
ncu --replay-mode kernel --target-processes all --kernel-name kernelName -f -o ncu_profile_output ./executable
I get the following error:
==WARNING== An error was reported by the driver
==WARNING== Backing up device memory in system memory. Kernel replay might be slow. Consider using "--replay-mode application" to avoid memory save-and-restore.
==ERROR== UnknownError
==ERROR== Failed to profile "kernelNDGridIndexGlobalManage…" in process 3896305
==PROF== Trying to shutdown target application
==ERROR== The application returned an error code (9).
==ERROR== An error occurred while trying to profile.
==WARNING== No kernels were profiled.
The error does not occur when I profile a version of the application that does not use managed memory. I suspect the cause is that the metrics I am collecting require multiple kernel replay passes, so the profiler has to save and restore the managed allocations between passes.
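For reference, the alternative that the warning itself points to would look roughly like the sketch below. It changes only the replay mode to application, which reruns the whole program once per pass instead of saving and restoring device memory. The kernel name, output name, and executable are the same placeholders as in the command above. Whether this avoids the error with managed memory is not confirmed here, and application replay assumes the program launches the same kernels in the same order on every run.

```shell
# Sketch only: same profiling run, but with application replay
# as suggested by the ncu warning. Not verified to fix the error.
REPLAY_MODE=application

# kernelName, ncu_profile_output and ./executable are placeholders
# taken from the original command.
NCU_ARGS="--replay-mode $REPLAY_MODE --target-processes all --kernel-name kernelName -f -o ncu_profile_output"

# Print the full command so it can be checked before running.
echo "ncu $NCU_ARGS ./executable"

# Run it only if ncu and the target binary are actually present.
# NCU_ARGS is left unquoted on purpose so it splits into separate arguments.
if command -v ncu >/dev/null 2>&1 && [ -x ./executable ]; then
    ncu $NCU_ARGS ./executable
fi
```

Because each pass is a full run of the program, this can take noticeably longer than kernel replay for long-running applications.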
Is there a workaround for this?