NCU and Nsys hang indefinitely

Hello!

I am trying to profile a fairly large kernel on an A100. The kernel completely saturates GPU resources: it uses 240 registers per thread and is launched with 216 blocks, each with 128 threads and 48 KB of shared memory.
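For anyone wondering where the 216 comes from, here is a rough sketch of the occupancy arithmetic behind that launch configuration. It is only an estimate: the per-SM limits are the published A100 (compute capability 8.0) figures, the function name is made up for illustration, and it ignores register allocation granularity and the small amount of shared memory the driver reserves per block. Under those assumptions the register file is the binding limit, so two blocks fit per SM and the grid is exactly one full wave across the 108 SMs.

```python
# Back-of-the-envelope occupancy estimate for the launch described above.
# The limits below are the published per-SM figures for A100 (CC 8.0);
# allocation granularity and driver-reserved shared memory are ignored.
A100_SMS = 108              # streaming multiprocessors on an A100
REGS_PER_SM = 65536         # 32-bit registers per SM
SMEM_PER_SM_KB = 164        # maximum shared memory per SM, in KB
MAX_THREADS_PER_SM = 2048   # resident threads per SM
MAX_BLOCKS_PER_SM = 32      # resident blocks per SM


def resident_blocks_per_sm(regs_per_thread, threads_per_block, smem_per_block_kb):
    """Hypothetical helper: blocks that can be resident on one SM at once."""
    regs_per_block = regs_per_thread * threads_per_block
    return min(
        REGS_PER_SM // regs_per_block,           # register file limit
        SMEM_PER_SM_KB // smem_per_block_kb,     # shared memory limit
        MAX_THREADS_PER_SM // threads_per_block, # thread count limit
        MAX_BLOCKS_PER_SM,                       # hardware block limit
    )


blocks_per_sm = resident_blocks_per_sm(240, 128, 48)
co_resident_blocks = blocks_per_sm * A100_SMS
print(blocks_per_sm, co_resident_blocks)  # 2 216
```

The CUDA runtime's `cudaOccupancyMaxActiveBlocksPerMultiprocessor` gives the authoritative number for a compiled kernel; the sketch above is only meant to show why the register count dominates here.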

Running nsys profile on the kernel works with 1 or 2 launches but hangs indefinitely with 4 or more. NCU freezes completely even for a single launch of the kernel. How do I get both working?

NCU reports:

0%...
==WARNING== Launching the workload is taking more time than expected. If this continues to hang, terminate the profile and re-try by profiling the range of all related launches using '--replay-mode range'. See https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#replay for more details.

nsys -v

NVIDIA Nsight Systems version 2024.6.2.225-246235244400v0

ncu -v

NVIDIA (R) Nsight Compute Command Line Profiler
Copyright (c) 2018-2025 NVIDIA Corporation
Version 2025.1.1.0 (build 35528883) (public-release)

Apologies, my code was at fault, not the tools. All is resolved now.
