Hello! I am currently learning CUDA, and this is my first time using Nsight Compute.
I am trying to generate a report with Nsight Compute, so I opened it as admin, but profiling fails with the error shown in the output.
Please help me.
Output:
Preparing to launch the Profile activity on localhost...
Launched process: ncu.exe (pid: 25320)
C:/Program Files/NVIDIA Corporation/Nsight Compute 2025.3.0/target/windows-desktop-win7-x64/ncu.exe --config-file off --export "C:/Users/yash/OneDrive/Documents/NVIDIA Nsight Compute/gettings_started.ncp-rep" --force-overwrite C:/cuda/getting-started/cuda-getting-started/build/bin/Debug/cis5650_getting_started.exe
Launch succeeded.
Profiling...
==PROF== Connected to process 12840 (C:\cuda\getting-started\cuda-getting-started\build\bin\Debug\cis5650_getting_started.exe)
==PROF== Profiling "createVersionVisualization" - 0: 0%==ERROR== UnknownError
--> ==ERROR== Failed to profile "createVersionVisualization" in process 12840 <--
==PROF== Trying to shutdown target application
Process terminated.
Yeah, but I can't. When I try, I get: Sorry, the file you are trying to upload is not authorized (authorized extensions: woff2, woff, pdf, doc, docx, txt, gz, zip, log, gif, jpeg, png, jpg, mov, mp4, webm, m4v, 3gp, ogv, avi, mpeg).
I renamed the file to executable.exe. Then I could run the sample successfully on my Turing GPU, and “ncu executable.exe” also ran successfully.
I have 2 GPUs in my machine, and the other one is not Turing, so I set CUDA_VISIBLE_DEVICES to my Turing GPU. I see there are also 2 graphics cards in your environment. That may be the problem.
When I close the application, I get the profile result successfully.
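For anyone who wants to try the same device restriction when launching the profiler from a script, here is a minimal Python sketch. The helper name `env_for_gpu` and the device index `0` are hypothetical. The index of the Turing card on a given machine has to be looked up first (for example with `nvidia-smi -L`), and CUDA's default enumeration order may not match the order nvidia-smi prints.

```python
import os


def env_for_gpu(device_index, base_env=None):
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_VISIBLE_DEVICES limits which devices the CUDA runtime enumerates.
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


# Hypothetical usage: index 0 is an assumption, not taken from the thread.
# import subprocess
# subprocess.run(["ncu", "executable.exe"], env=env_for_gpu(0), check=True)
```

Setting the variable only for the child process keeps the change local to the profiling run instead of altering the system-wide environment.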
Hi @viraj, I have tried commenting out the body of createVersionVisualization(), but the error still persists. Also, the exe runs fine on its own, and the Nsight VS integration and Nsight Systems had no issues.
Hi, I was playing around with the ncu arguments, and --replay-mode application seems to work (although I have no idea whether it’s working correctly or not).
When building the project, did you run into issues with cudaThreadSynchronize()? And did you replace it with cudaDeviceSynchronize()?
Are you facing any other issues with Nsight Systems?
Are you able to access System Info from VS Nsight Integration like Shehzan Mohommad did in the debugging lab recording?
From my research, there are two possibilities:
Somehow there is an issue with our installation, or the CUDA toolkit has issues with our GPUs (mine is a GTX 1650).
More likely, the problem is the interop between CUDA and the display (OpenGL). Windows enforces strict watchdog timers, and profiling adds overhead, so any long-running kernel can trigger TDR while the profiler is trying to save the state. One fix would be to disable TDR. After going through MSDN, one viable option seemed to be setting TdrLevel to 0 in regedit. I did that, but the issue still persisted.
The other workaround, which I found on the developer forums, is to use the flag --replay-mode application.
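A minimal sketch of how that workaround could be scripted, assuming `ncu` is on PATH. The helper name `ncu_command` and the example paths are hypothetical; `--export` and `--force-overwrite` are the flags the GUI passed in the original launch line. In application replay mode, ncu re-runs the whole program for each metric pass rather than replaying a single kernel, so it avoids the context save/restore step that kernel replay needs. It does rely on the program launching the same kernels on every run.

```python
def ncu_command(target_exe, report_path, replay_mode="application"):
    """Build an ncu command line that uses the given replay mode."""
    return [
        "ncu",
        "--replay-mode", replay_mode,
        "--export", report_path,
        "--force-overwrite",
        target_exe,
    ]


# Hypothetical paths, for illustration only.
cmd = ncu_command("cis5650_getting_started.exe", "getting_started_report")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the directory that contains the executable.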
NOTE: I am still not sure whether this is working correctly, so can you please ask the TAs? I am not a student at the University of Pennsylvania and am learning on my own, so if you find a definite solution, please let me know.
Edit:
I don't get why createVersionVisualization() is failing. I commented out the entire body of the function and the error still persisted. Since the function then does nothing, it shouldn't have any overhead and shouldn't trigger TDR. I also tried filling just one pixel instead of the entire buffer, but the issue persists.
Please create a file with the content below and set the NVLOG_CONFIG_FILE environment variable to point to this file. Then run with “ncu”; the log will be recorded in /tmp/nvlog.log.
$ /tmp/nvlog.log
UseStdout
ForceFlush
Format |$time|$sev:${level:-3}|$proc|$tid|$name>> $text
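As a sketch of those steps in script form, the following Python writes the config into a directory and returns an environment with NVLOG_CONFIG_FILE pointing at it. The helper name `write_nvlog_config` and the file name `nvlog.config` are hypothetical, and the config lines are copied from the post rather than from any documentation. The point it illustrates is that NVLOG_CONFIG_FILE names the config file, while the `$` line inside the config is, per the post, where the log itself is written.

```python
import os
import tempfile

# Config lines as quoted in the post above.
NVLOG_LINES = [
    "UseStdout",
    "ForceFlush",
    "Format |$time|$sev:${level:-3}|$proc|$tid|$name>> $text",
]


def write_nvlog_config(directory=None, log_path=None):
    """Write an nvlog config file and return (config_path, env)."""
    directory = directory or tempfile.gettempdir()
    config_path = os.path.join(directory, "nvlog.config")
    lines = list(NVLOG_LINES)
    if log_path is not None:
        # "$ <path>" is the log-file line from the post, e.g. "$ /tmp/nvlog.log".
        lines.insert(0, "$ " + log_path)
    with open(config_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    # Copy the current environment and point the logger at the config file.
    env = dict(os.environ)
    env["NVLOG_CONFIG_FILE"] = config_path
    return config_path, env
```

The returned `env` can be passed to `subprocess.run(["ncu", ...], env=env)`. On Windows, `log_path` would need to be a Windows path rather than `/tmp/nvlog.log`.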
Hi @veraj. I am on Windows. I created an nvlog.log file at C:\Users\yash\AppData\Local\Temp\nvlog.log, pasted the contents below into it, and then set the NVLOG_CONFIG_FILE environment variable to its location:
UseStdout
ForceFlush
Format |$time|$sev:${level:-3}|$proc|$tid|$name>> $text
-0i 0w 100ef 0IW 100EF global
-100i 100w 100ef 0IW 100EF regop_tgt_dta
But it didn't work. The program still crashed and no log was recorded. Maybe the above solution only works in Linux bash.
Hi veraj. Now when I run ncu, I get the following diagnostics:
==PROF== Connected to process 8864 (C:\university-of-penn\gpu-programming-and-architecture\getting-started\cuda-getting-started\build\bin\Debug\cis5650_getting_started.exe)
==PROF== Profiling "createVersionVisualization" - 0: 0%
|20:42:59:433|err:50| cis5650_getting_started.exe|24604| cuda_context_state>> Async context error while copying!
|20:42:59:434|err:50| cis5650_getting_started.exe|24604| cuda_context_state>> Failed to transfer context state!
|20:42:59:434|err:50| cis5650_getting_started.exe|24604| cuda>> Failed to save context state!
|20:42:59:434|err:20| cis5650_getting_started.exe|24604| cuda_replay>> Failed to create replay state
|20:42:59:434|err:50| cis5650_getting_started.exe|24604| cuda>> Failed to save context state
|20:42:59:434|err:50| cis5650_getting_started.exe|24604| cuda>> Failed to save context state
|20:42:59:434|err:50| cis5650_getting_started.exe|24604| cuda>> executeInternal returned an error: UnknownError
|20:42:59:434|err:50| cis5650_getting_started.exe|25988| profiler_target>> Sending profiler error message: UnknownError
|20:42:59:434|err:20| ncu.exe|17472| api_debugger>> Received profiler error message
|20:42:59:434|err:20| ncu.exe|17472| CmdlineProfiler>> Error: 0: UnknownError
==ERROR== UnknownError
==ERROR== Failed to profile "createVersionVisualization" in process 8864
==PROF== Trying to shutdown target application
==ERROR== An error occurred while trying to profile.
I think this might be useful:
C:\university-of-penn\gpu-programming-and-architecture\getting-started\cuda-getting-started\build\bin\Debug>ncu --version
NVIDIA (R) Nsight Compute Command Line Profiler
Copyright (c) 2018-2025 NVIDIA Corporation
Version 2025.3.0.0 (build 36273991) (public-release)
C:\university-of-penn\gpu-programming-and-architecture\getting-started\cuda-getting-started\build\bin\Debug>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jul_16_20:06:48_Pacific_Daylight_Time_2025
Cuda compilation tools, release 13.0, V13.0.48
Build cuda_13.0.r13.0/compiler.36260728_0