LaunchFailed on Windows, Nsight Compute 2023.3

Hi, I am new to Nsight Compute. When I use Nsight Compute 2023.3 on Windows, I get a LaunchFailed error.
Can anyone help me?
Thanks!

Hi,
Thanks for using NVIDIA Nsight Compute, and sorry for the issue you encountered.
Can you provide more information?

  1. Does the sample work correctly without Nsight Compute?
  2. Which command line do you run for profiling?

Hi,

  1. Yes, the sample works correctly without Nsight Compute.
  2. I use the Nsight Compute GUI for profiling: “New project” → specify the working directory and Application Executable → choose “Interactive Profile” → “Launch” → click “Run to Next Kernel” → click “Profile Kernel”. Then I get the error.
    The GUI is like below.

Does this issue happen with all CUDA samples or just this specific sample? If it happens with all of them, please tell us your:

  1. Windows version
  2. Driver version

If other samples do not have this issue, then it is probably sample-specific. Can you provide the source code or binary so we can investigate further?

My Windows version is Windows 10, and the driver version is 545.84.
I tried another simple sample, and it worked correctly.
I think it’s sample-specific. But the source code is embedded in a very complicated program, and running the binary requires a very complicated environment.
My guess is that some memory access is out of bounds. Can an out-of-bounds memory access cause the LaunchFailed error?
Thanks~

Another possible cause is shared memory. If I don’t use shared memory, profiling works. But if I use shared memory, profiling fails with the LaunchFailed error.
Can the use of shared memory cause the LaunchFailed error?
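
One way to test the out-of-bounds guess outside the profiler would be to run the application under Compute Sanitizer's memcheck tool, which reports out-of-bounds accesses to global and shared memory. A minimal sketch, assuming compute-sanitizer from the CUDA Toolkit is on PATH; the helper names and my_app.exe are hypothetical placeholders:

```python
import shutil
import subprocess


def build_memcheck_command(executable, app_args=()):
    """Return an argv list that runs the app under Compute Sanitizer memcheck."""
    return ["compute-sanitizer", "--tool", "memcheck", executable, *app_args]


def run_memcheck(executable, app_args=()):
    """Run the app under memcheck.

    Returns (exit_code, combined_output), or None when compute-sanitizer
    cannot be found on PATH (assumption: it ships with the CUDA Toolkit).
    """
    if shutil.which("compute-sanitizer") is None:
        return None
    proc = subprocess.run(
        build_memcheck_command(executable, app_args),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


if __name__ == "__main__":
    # my_app.exe is a placeholder for the real application executable.
    print(" ".join(build_memcheck_command("my_app.exe")))
```

If memcheck reports invalid reads or writes in the kernel that uses shared memory, fixing those first would help rule out the application as the cause before looking at the profiler.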

Hi, @390032167
This looks like an issue our developers recently fixed. Can you get the latest Nsight Compute, 2023.3.1, and give it a try?
Please collect the “basic” set or a selection of sections that doesn’t include “SourceCounters”.
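
If you profile from the command line rather than the GUI, the same selection might look like the sketch below. It only builds the ncu argument list. The helper name build_ncu_command, the report name, and the section identifiers are illustrative assumptions, so please check the identifiers against the output of ncu --list-sections on your install.

```python
# Section identifiers are assumptions; confirm them with `ncu --list-sections`.
SECTIONS_WITHOUT_SOURCE_COUNTERS = [
    "SpeedOfLight",
    "LaunchStats",
    "Occupancy",
    "MemoryWorkloadAnalysis",
    "ComputeWorkloadAnalysis",
]


def build_ncu_command(executable, sections=None, report="report"):
    """Return an argv list for the Nsight Compute CLI (ncu).

    With sections=None the "basic" set is collected. Otherwise each listed
    section is requested explicitly, and "SourceCounters" is always skipped.
    """
    cmd = ["ncu", "-o", report]
    if sections is None:
        cmd += ["--set", "basic"]
    else:
        for name in sections:
            if name == "SourceCounters":
                continue  # the section this thread suggests leaving out
            cmd += ["--section", name]
    cmd.append(executable)
    return cmd


if __name__ == "__main__":
    # my_app.exe is a placeholder for the real application executable.
    print(" ".join(build_ncu_command("my_app.exe")))
    print(" ".join(build_ncu_command("my_app.exe", SECTIONS_WITHOUT_SOURCE_COUNTERS)))
```

For example, build_ncu_command("my_app.exe") yields the arguments ncu -o report --set basic my_app.exe.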

It works when I collect the “basic” set. Thanks~
What is the issue? Will the “SourceCounters” issue be fixed in upcoming Nsight Compute versions?

Hi, @390032167

Can you try 2023.3.1 to see whether your original issue is gone?

No, the original issue still exists. When I choose the full metric set, the computer restarts.

Sorry for the issue you ran into.

Did you see “LaunchFailed” again on 2023.3.1?
If not, I suppose this may be a different issue.
Do you mean that NCU causes the computer to restart, and only with your sample?

Yes, I see “LaunchFailed” again on 2023.3.1 when I choose the full metric set.
Sometimes NCU causes the computer to restart. I think it may be specific to my sample.

Thanks for trying again. Yes, it looks specific to your sample.
If possible, please provide a repro and we can help check.

For now, I think you can continue to use the tool by disabling “SourceCounters” collection.

Hi, Nsight Compute reported this: “NVIDIA Nsight Compute: An error was reported by the driver. Profiling failed because a driver resource was unavailable. Ensure that no other tool (like DCGM) is concurrently collecting profiling data. See Kernel Profiling Guide :: Nsight Compute Documentation for more details.”

Is this a driver problem?

Hi, @390032167

  • Profiling failed because a driver resource was unavailable. The error indicates that a required CUDA driver resource was unavailable during profiling. Most commonly, this means that NVIDIA Nsight Compute could not reserve the driver’s performance monitor, which is necessary for collecting most metrics. This can happen if another application has a concurrent reservation on this resource. Such applications can be e.g. DCGM, a client of CUPTI’s Profiling API, Nsight Graphics, or another instance of NVIDIA Nsight Compute without access to the same file system (see serialization for how this is prevented within the same file system). If you expect the problem to be caused by DCGM, consider using dcgmi profile --pause to stop its monitoring while profiling with NVIDIA Nsight Compute.
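
If DCGM is installed and might be the tool holding the reservation, the pause mentioned above can be wrapped around the profiling run so that monitoring is always resumed afterwards. A sketch, assuming dcgmi is on PATH; the name dcgm_profiling_paused and the injectable run parameter are illustrative and not part of any NVIDIA API:

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def dcgm_profiling_paused(run=subprocess.run):
    """Pause DCGM profiling for the duration of the block, then resume it.

    `run` defaults to subprocess.run and can be replaced by a stub for a
    dry run on a machine without DCGM.
    """
    run(["dcgmi", "profile", "--pause"], check=True)
    try:
        yield
    finally:
        # Resume even if the profiling run inside the block fails.
        run(["dcgmi", "profile", "--resume"], check=True)


# Example use (my_app.exe is a placeholder executable):
#
#     with dcgm_profiling_paused():
#         subprocess.run(["ncu", "--set", "basic", "my_app.exe"], check=True)
```

If DCGM is not installed on the machine, this step does not apply and the reservation conflict would have to come from one of the other listed tools.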

Yes, I know this. But I think my case is not one of the cases listed. Are there any other methods? Or can I see some more information about the error?

I mean, is there a text log, instead of the little information shown by Nsight Compute?

Sorry, I am afraid we can’t know what exactly happened unless we have a repro.
