I kept getting the message:
==WARNING== Launching the workload is taking more time than expected. If this continues to hang, terminate the profile and re-try by profiling the range of all related launches using '--replay-mode range'. See Kernel Profiling Guide :: Nsight Compute Documentation for more details.
It turns out I was simply impatient. Given the message, I thought the profiling had hung, but after 2.5 hours it reported that it had reached 50%, and after 4 hours it completed the report.
You really should provide some feedback that the application is still running well before the 2.5-hour mark!