I want to profile a program that has two phases. Assume it enters the main phase after 10 seconds. If I run nvprof for the whole run, the results will include data from the first phase. The document at https://docs.nvidia.com/gameworks/content/developertools/desktop/attach_cuda_to_process.htm seems to describe a method specific to Windows and Visual Studio.
Is there any way to attach nvprof to a running process?