"Timeout reached when waiting for CUDA event to initiate flush." - No CUDA Traces from Profiler

I am trying to check whether my kernel and memcpy run concurrently on two GPUs. I am using Nsight Systems 2019.3.3 to profile the .exe of my CUDA project, which is built with Visual Studio 2015. I have all options within the project enabled, so it should output CUDA traces. When I started profiling, some runs had CUDA traces and some didn’t, for no reason that was obvious to me. Now I can’t get it to output them at all.

In the report it shows me these errors related to CUDA:

Source: Injection | Description: Timeout reached when waiting for CUDA event to initiate flush. Some CUDA profiling data might be missing.
Source: Analysis  | Description: Not all CUDA events might have been collected.
Source: Analysis  | Description: Zero CUDA events were collected. Does the application use CUDA?

My application does use CUDA; it makes the whole palette of CUDA calls:

cudaStreamCreate(), cudaMalloc(), cudaMemcpyAsync(), a kernel launch, cudaStreamDestroy(), and cudaFree()

I don’t know what I did differently back when it worked, but it no longer works.
Do I need an include within my CUDA project for CUDA tracing to work? That seems unlikely to me, since it has worked without one.

In case further information is needed, like source code from the project, I will be able to provide it.

General Information:

  • Windows 10 (64-bit)
  • 2x RTX 2080Ti
  • NVIDIA Nsight Systems 2019.3.3

EDIT: these are the analysis options of a run that worked:

Analysis options
Sampling frequency	8,000 Hz
Collect thread activity	On
Collect backtraces	On
Collect NVTX trace	On
Collect CUDA trace	On
Collect DX12 trace	On
Collect Vulkan trace	Off
Trace fork before exec	Off

When I try these settings now, it doesn’t output CUDA traces, and I get the same errors as above.

Hi slotboom.n,

A few comments about this post:

  1. Which CUDA release are you using? Nsight Systems CUDA trace on Windows supports CUDA 10.0 or later.
  2. DX12 trace is not required for tracing CUDA. You can safely uncheck it in the project settings.
  3. The new release of Nsight Systems 2019.4.2 is available on the NVIDIA public website https://developer.nvidia.com/nsight-systems. You could give it a try.
  4. Does your app finish executing within a very short time? If so, the CUDA activity may not be flushed to disk before the app exits. Try extending the app lifetime, or change the Nsight Systems project settings under Collect CUDA trace to Flush data periodically every 0.10 seconds.
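
For the first question, one quick way to find out which CUDA release the application is actually built against is to have it print the versions it sees. This is only a minimal sketch, not part of the original project: it uses the runtime API calls cudaRuntimeGetVersion() and cudaDriverGetVersion(), which encode a version as 1000 * major + 10 * minor, and compares against the CUDA 10.0 minimum mentioned above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int runtimeVersion = 0;
    int driverVersion = 0;

    // Version of the CUDA runtime library the executable is using.
    cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Latest CUDA version supported by the installed driver.
    err = cudaDriverGetVersion(&driverVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Versions are encoded as 1000 * major + 10 * minor.
    printf("CUDA runtime version: %d.%d\n",
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    printf("Driver supports CUDA up to: %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10);

    // 10000 corresponds to CUDA 10.0, the minimum named in this thread.
    if (runtimeVersion < 10000) {
        printf("Runtime is older than CUDA 10.0; CUDA trace on Windows may not work.\n");
    }
    return 0;
}
```

Compile it with nvcc from the same toolkit the Visual Studio project uses, so that the reported runtime version matches the one linked into the profiled .exe.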

If you would like to share your source code so that I can take a deeper look, you can send me a private message through the forum website or email it to devtools-support@nvidia.com.

Doron

Hello,

I had the same three errors, and calling cudaDeviceReset() resolved the last two, namely:

  • Not all CUDA events might have been collected.
  • Zero CUDA events were collected. Does the application use CUDA?

I still don’t understand what “Timeout reached when waiting for CUDA event to initiate flush.” means though.
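
For anyone who wants to try the cudaDeviceReset() workaround, here is a minimal sketch of where the call would go. It follows the call sequence from the original question (stream creation, cudaMalloc(), cudaMemcpyAsync(), a kernel launch, cleanup), but the kernel `scale`, the buffer size, and the `CHECK` macro are made up for illustration and are not taken from the original project. For simplicity it visits the GPUs one after another rather than concurrently. That the reset helps the profiler flush its data is what was observed in this thread, not something the sketch itself proves.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t e_ = (call);                                           \
        if (e_ != cudaSuccess) {                                           \
            fprintf(stderr, "%s failed: %s\n", #call,                      \
                    cudaGetErrorString(e_));                               \
            exit(EXIT_FAILURE);                                            \
        }                                                                  \
    } while (0)

// Hypothetical kernel standing in for the real workload.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));

    // Pinned host memory, so cudaMemcpyAsync can run asynchronously.
    float *host = nullptr;
    CHECK(cudaMallocHost(&host, bytes));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Same call sequence as in the question, once per GPU.
    for (int dev = 0; dev < deviceCount; ++dev) {
        CHECK(cudaSetDevice(dev));

        cudaStream_t stream;
        float *device = nullptr;
        CHECK(cudaStreamCreate(&stream));
        CHECK(cudaMalloc(&device, bytes));
        CHECK(cudaMemcpyAsync(device, host, bytes, cudaMemcpyHostToDevice, stream));
        scale<<<(n + 255) / 256, 256, 0, stream>>>(device, n, 2.0f);
        CHECK(cudaGetLastError());

        // Make sure all GPU work has finished before tearing anything down.
        CHECK(cudaStreamSynchronize(stream));
        CHECK(cudaFree(device));
        CHECK(cudaStreamDestroy(stream));
    }

    CHECK(cudaFreeHost(host));

    // The workaround: reset every device that was used, right before exit.
    for (int dev = 0; dev < deviceCount; ++dev) {
        CHECK(cudaSetDevice(dev));
        CHECK(cudaDeviceReset());
    }
    return 0;
}
```

The reset is placed after all streams are synchronized and all memory is freed, because cudaDeviceReset() destroys every allocation and all state on the current device for the process.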