I am following this tutorial to get started with NCU from scratch.
However, I get
==ERROR== Launching the target application failed
after running the command
ncu -f -o tutorial --set full ./tutorial
I also tried other tests, such as model_inference.py and simple_add.cu.
Finally, I tried
/usr/local/cuda/nsight-compute-2022.3.0/nv-nsight-cu-cli -f -o tutorial --set full ./tutorial
All of these returned the same error.
Environment Setting:
ncu --version
NVIDIA (R) Nsight Compute Command Line Profiler
Copyright (c) 2018-2022 NVIDIA Corporation
Version 2022.3.0.0 (build 31729285) (public-release)
nvidia-smi
NVIDIA-SMI 470.161.03 Driver Version: 470.161.03 CUDA Version: 11.4
GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-744fbd6b-59e0-0daa-223e-19ac51400ee0)
– MIG disabled
Operating System: AlmaLinux 8.7 (Stone Smilodon)
CPE OS Name: cpe:/o:almalinux:almalinux:8::baseos
Kernel: Linux 4.18.0-425.3.1.el8.x86_64
Architecture: x86-64
Best
Tianyu
It’s difficult to say what the issue is with this information. Could you share the output of running just “./tutorial” and then right after run “ncu -f -o tutorial --set full ./tutorial” and share the full output of that?
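One way to collect both outputs in a single file is sketched below. The helper name run_and_log and the log file name launch_debug.log are hypothetical choices for illustration; only the two commands themselves come from this thread.

```shell
# Hypothetical helper: run a command and append its output and exit status to a log.
run_and_log() {
    log_file="$1"; shift
    printf '$ %s\n' "$*" >> "$log_file"          # record the command line
    status=0
    "$@" >> "$log_file" 2>&1 || status=$?        # capture stdout and stderr
    printf '[exit status: %s]\n' "$status" >> "$log_file"
    return "$status"
}

# Run the plain binary first, then the profiler, only if the binary exists.
if [ -x ./tutorial ]; then
    run_and_log launch_debug.log ./tutorial || true
    run_and_log launch_debug.log ncu -f -o tutorial --set full ./tutorial || true
fi
```

The resulting launch_debug.log can then be pasted into a reply in full, including the exit status of each run.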
Hi jmarusarz,

./tutorial just performs a matrix multiplication and finishes very quickly, so I am posting model_inference.py here instead.
I also tried
ncu -f -o inference --set full python model_inference.py
which returned
==ERROR== Launching the target application failed.
In the /extras/samples directory of Nsight Compute you can find an uncoalescedGlobalAccesses sample that includes a simple source file and instructions to build and profile it. Can you try that to see whether it’s a specific issue with your application or with profiling in general?
I compiled the sample successfully and ran the following commands without error. Each printed CUDA kernel information followed by Done.
./uncoalescedGlobalAccesses
./uncoalescedGlobalAccesses 1
./uncoalescedGlobalAccesses 0 512
However, the following two commands
ncu --set full --import-source on -f -o addConstDouble3.ncu-rep ./uncoalescedGlobalAccesses
ncu --set full --import-source on -f -o addConstDouble.ncu-rep ./uncoalescedGlobalAccesses 1
still returned ==ERROR== Launching the target application failed.
Are there any informational lines before that or is it:
$>ncu --set full --import-source on -f -o addConstDouble3.ncu-rep ./uncoalescedGlobalAccesses
==ERROR== Launching the target application failed.
$>
Nothing is shown before ==ERROR== Launching the target application failed.
It is just:
ncu --set full --import-source on -f -o addConstDouble3.ncu-rep ./uncoalescedGlobalAccesses
==ERROR== Launching the target application failed.
I would recommend the following steps for root-causing the issue:
- Re-install Nsight Compute, as it’s possible that a broken installation could be causing this error.
- Update to the latest version of Nsight Compute. The tool can be installed outside of the CUDA toolkit and is backwards-compatible with older drivers and toolkits. Doing so is recommended regardless of any errors, as it will include other fixes and features.
- If none of the above help, create a file named nvlog.config with the following content:
UseStdout
ForceFlush
Format $sev:${level:-3}|$proc|$name|$sfunc>> $text
+ 0i 0w 100ef 0IW 100EF global
Then point to it with the NVLOG_CONFIG_FILE=nvlog.config environment variable. Run again with this variable set and post the extended log output here.
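The logging step can be sketched in shell roughly as follows. The config content is copied verbatim from this post. The absolute path via $PWD, the output file name ncu_nvlog.txt, and the choice of the uncoalescedGlobalAccesses command as the target are assumptions for illustration.

```shell
# Write the logging configuration. The quoted 'EOF' keeps the shell from
# expanding $sev, ${level:-3}, and the other placeholders.
cat > nvlog.config <<'EOF'
UseStdout
ForceFlush
Format $sev:${level:-3}|$proc|$name|$sfunc>> $text
+ 0i 0w 100ef 0IW 100EF global
EOF

# Re-run the failing command with the variable set and keep a copy of the output.
if command -v ncu >/dev/null 2>&1; then
    NVLOG_CONFIG_FILE="$PWD/nvlog.config" \
        ncu --set full --import-source on -f -o addConstDouble3.ncu-rep \
        ./uncoalescedGlobalAccesses 2>&1 | tee ncu_nvlog.txt
else
    echo "ncu not found on PATH" >&2
fi
```

Setting the variable inline on the command keeps it scoped to that single run, so later commands in the same shell are not affected.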
Please also note that AlmaLinux is not officially supported. You may want to switch to a supported Linux distro.