LaunchFailed when using Nsight Compute 2023.2

Hi, I’m using Nsight Compute to profile kernels on an Ubuntu system. I have both the 2022.2 and 2023.2 versions installed, but 2023.2 fails with errors like this:

==ERROR== LaunchFailed
==PROF== Trying to shutdown target application
==ERROR== The application returned an error code (9).
==ERROR== An error occurred while trying to profile.

Version 2022.2 works fine with the same command.

While searching for solutions, I learned that I can enable more detailed logging, so I created a file, nvlog.log, and set NVLOG_CONFIG_FILE to point to it. The file contains the following config:

UseStdout
ForceFlush
Format $sev:${level:-3}|$proc|$name|$sfunc>> $text

+ 0i 0w 100ef 0IW 100EF global

Then I ran 2023.2 again and got the following logs:

==PROF== Connected to process 22276 (/path/to/exec/cpp_run)
err:50 | cpp_run| cuda_tools_kernels| >> Failed to load memcmp module (error = 200)
==PROF== Profiling “RndInit” - 0: err:50 | cpp_run| cuda_tools_kernels| >> Failed to get modules for context
err:50 | cpp_run|cuda_tools_kernels_syscall| >> Failed to find tools module
err:50 | cpp_run| cuda_tools_kernels| >> Failed to get modules for context
err:50 | cpp_run|cuda_tools_kernels_memcmp| >> Failed to find tools module
err:50 | cpp_run| cuda_context_state| >> Failed to compare memory (999)
err:50 | cpp_run| cuda_context_state| >> Failure while looping over the copy units
err:50 | cpp_run| cuda_replay| >> Failed to optimize backing store
err:50 | cpp_run| cuda_replay| >> Failed to optimize backing store
err:50 | cpp_run| profiler| >> Client pre iteration failed (InternalError)
err:50 | cpp_run| profiler_experiment| >> Failed launching for LOP Counters::0
err:50 | cpp_run| profiler_experiment| >> Skipping OnEnd due to previous traversal error
0%…50%…err:50 | ncu| profiler_client| >> Invalid gpu duration: nan
err:50 | cpp_run| profiler| >> Profile end failed (LaunchFailed)
err:20 | cpp_run| cuda| >> ProfileSeries returned an error: LaunchFailed
err:50 | cpp_run| cuda| >> executeInternal returned an error: LaunchFailed
err:50 | cpp_run| profiler| >> Sending profiler error message: LaunchFailed
.100% - 1 pass
err:20 | ncu| api_debugger| >> Received profiler error message
err:20 | ncu| CmdlineProfiler| >> Error: 0: LaunchFailed

The command is like this:

sudo NVLOG_CONFIG_FILE=/path/to/nvlog.config LD_LIBRARY_PATH=/usr/local/cuda/lib64:/someotherlibs/ ./ncu --export /path/to/outputfile --force-overwrite --target-processes all --replay-mode kernel --kernel-name-base function --launch-skip-before-match 0 --section-folder /tmp/var/sections --sampling-interval auto --sampling-max-passes 5 --sampling-buffer-size 33554432 --profile-from-start 1 --cache-control all --clock-control base --apply-rules yes --import-source no --check-exit-code yes /path/to/cpp_run

p.s.
The 2023.2 ncu does work with --replay-mode application, but profiling takes longer and I can’t use interactive mode. I would like to know whether this can be solved or fixed in some way.

Thanks!

Thanks for sharing this detailed log information. It looks like we’re failing to load an internal module. To understand that better, can you share which driver version you are running? You can find it with the “nvidia-smi” CLI command.

Hi,

The driver version is 495.29.05

Best,

It’s likely that this older driver is incompatible with the newest version of Nsight Compute. Are you able to update to a newer driver? The compatibility table for the tools packaged in CUDA Toolkit 12.x (which includes Nsight Compute 2023.2) is here: CUDA Compatibility :: NVIDIA Data Center GPU Driver Documentation
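
As a rough way to check this on the machine itself, the sketch below reads the installed driver version with nvidia-smi and compares it against a minimum. The helper name version_ge and the MIN_DRIVER variable are made up for illustration, and the default of 525.60.13 is my assumption of the minimum Linux driver for CUDA 12.x; please confirm the exact number in the compatibility table before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# ASSUMPTION: minimum driver for CUDA 12.x on Linux; verify in the compatibility table.
min_driver="${MIN_DRIVER:-525.60.13}"

# Only query the driver if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Print only the driver version; take the first GPU's line.
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
    if version_ge "$driver" "$min_driver"; then
        echo "driver $driver is at least $min_driver"
    else
        echo "driver $driver is older than $min_driver; consider updating"
    fi
fi
```

With the driver reported in this thread (495.29.05), the comparison against that assumed minimum would take the "older than" branch.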

I see you submitted a similar issue here: Nsight compute 2023.2: consistent Launch Fails for one of the kernels. Is it the same? If so, let’s continue the discussion there.