Hi, I’m using Nsight Compute to profile kernels on an Ubuntu system. I have both the 2022.2 and 2023.2 versions installed, but 2023.2 fails with errors like this:
==ERROR== LaunchFailed
==PROF== Trying to shutdown target application
==ERROR== The application returned an error code (9).
==ERROR== An error occurred while trying to profile.
Version 2022.2 works fine with the same command.
After searching for solutions, I learned about NVLOG_CONFIG_FILE, so I created a file named nvlog.log and pointed NVLOG_CONFIG_FILE at it. The file contains the following config:
UseStdout
ForceFlush
Format $sev:${level:-3}|$proc|$name|$sfunc>> $text
0i 0w 100ef 0IW 100EF global
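For anyone reproducing this, here is a minimal sketch of how a config file like the one above can be written from a shell. The file location is only an assumption for illustration; the four config lines are exactly the ones listed above. The heredoc delimiter is quoted so that the shell does not try to expand `$sev`, `${level:-3}` and the other placeholders in the Format line.

```shell
# Hypothetical location for the logging config; any writable path should do.
NVLOG_CONFIG_FILE="${TMPDIR:-/tmp}/nvlog.config"

# The quoted 'EOF' delimiter keeps the $placeholders in the Format line literal.
cat > "$NVLOG_CONFIG_FILE" <<'EOF'
UseStdout
ForceFlush
Format $sev:${level:-3}|$proc|$name|$sfunc>> $text
0i 0w 100ef 0IW 100EF global
EOF

# Make the variable visible to child processes started from this shell.
export NVLOG_CONFIG_FILE
```

Note that when the profiler is started through sudo, the variable generally has to be passed on the sudo command line (as in the command further down), since sudo does not forward the caller's environment by default.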
Then I ran 2023.2 again and got the following logs:
==PROF== Connected to process 22276 (/path/to/exec/cpp_run)
err:50 | cpp_run| cuda_tools_kernels| >> Failed to load memcmp module (error = 200)
==PROF== Profiling "RndInit" - 0: err:50 | cpp_run| cuda_tools_kernels| >> Failed to get modules for context
err:50 | cpp_run|cuda_tools_kernels_syscall| >> Failed to find tools module
err:50 | cpp_run| cuda_tools_kernels| >> Failed to get modules for context
err:50 | cpp_run|cuda_tools_kernels_memcmp| >> Failed to find tools module
err:50 | cpp_run| cuda_context_state| >> Failed to compare memory (999)
err:50 | cpp_run| cuda_context_state| >> Failure while looping over the copy units
err:50 | cpp_run| cuda_replay| >> Failed to optimize backing store
err:50 | cpp_run| cuda_replay| >> Failed to optimize backing store
err:50 | cpp_run| profiler| >> Client pre iteration failed (InternalError)
err:50 | cpp_run| profiler_experiment| >> Failed launching for LOP Counters::0
err:50 | cpp_run| profiler_experiment| >> Skipping OnEnd due to previous traversal error
0%…50%…err:50 | ncu| profiler_client| >> Invalid gpu duration: nan
err:50 | cpp_run| profiler| >> Profile end failed (LaunchFailed)
err:20 | cpp_run| cuda| >> ProfileSeries returned an error: LaunchFailed
err:50 | cpp_run| cuda| >> executeInternal returned an error: LaunchFailed
err:50 | cpp_run| profiler| >> Sending profiler error message: LaunchFailed
.100% - 1 pass
err:20 | ncu| api_debugger| >> Received profiler error message
err:20 | ncu| CmdlineProfiler| >> Error: 0: LaunchFailed
The command I run is:
sudo NVLOG_CONFIG_FILE=/path/to/nvlog.config LD_LIBRARY_PATH=/usr/local/cuda/lib64:/someotherlibs/ ./ncu --export /path/to/outputfile --force-overwrite --target-processes all --replay-mode kernel --kernel-name-base function --launch-skip-before-match 0 --section-folder /tmp/var/sections --sampling-interval auto --sampling-max-passes 5 --sampling-buffer-size 33554432 --profile-from-start 1 --cache-control all --clock-control base --apply-rules yes --import-source no --check-exit-code yes /path/to/cpp_run
P.S.
ncu 2023.2 does work if I switch to --replay-mode application, but profiling then takes longer and I can’t use interactive mode. Is this a known issue with kernel replay in 2023.2, and is there a fix or workaround?
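To make the difference concrete, here is a trimmed dry-run sketch of the two invocations. It only builds and prints the command rather than running it, keeps a few of the flags from the full command above, and uses the same placeholder paths; `REPLAY_MODE` and `NCU_CMD` are just illustrative variable names.

```shell
# "kernel" is the mode that fails for me on 2023.2; "application" works but is slower.
REPLAY_MODE=application

# Reduced version of the full command; the paths are placeholders from the post.
NCU_CMD="./ncu --export /path/to/outputfile --force-overwrite --target-processes all --replay-mode $REPLAY_MODE /path/to/cpp_run"

# Dry run: print the command instead of executing it.
echo "$NCU_CMD"
```

Only the value passed to --replay-mode changes between the failing and the working run.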
Thanks!