I am trying to check whether my kernel and memcpy run concurrently on two GPUs. To do this, I am using Nsight Systems 2019.3.3 to profile the .exe of my CUDA project, which is written with Visual Studio 2015. I have all options within the project enabled, so it should output CUDA traces. When I started profiling, some runs had CUDA traces and some didn't, for no reason that was obvious to me. Now I can't get it to output them at all.
In the report it shows me these errors related to CUDA:
Source: Injection | Description: Timeout reached when waiting for CUDA event to initiate flush. Some CUDA profiling data might be missing.
Source: Analysis | Not all CUDA events might have been collected.
Source: Analysis | Zero CUDA events were collected. Does the application use CUDA?
My application does use CUDA. It makes the whole range of CUDA calls:
streamCreate(), cudaMalloc(), cudaMemcpyAsync(), a kernel launch, streamDestroy(), and cudaFree().
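For reference, here is a minimal sketch of that call sequence with one stream per GPU. It is not the actual project code. The kernel `scale`, the function name `run_on_gpus`, the buffer size, the pinned host buffers from `cudaMallocHost()`, and the final `cudaStreamSynchronize()` and `cudaDeviceReset()` calls are all assumptions that are not part of the list above. Ending with a synchronize and a device reset is the commonly documented way to let profiling tools flush buffered data before the process exits. Whether it has any effect on the errors shown here is untested.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element by a constant factor.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Prints the CUDA error (if any) and reports whether the call succeeded.
static bool ok(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Runs copy -> kernel -> copy on up to two GPUs, one stream per GPU.
// Returns the number of GPUs used, or -1 on any error.
// Sketch only: resources are not released on the error paths.
int run_on_gpus(int n) {
    const int kMaxGpus = 2;
    int count = 0;
    if (!ok(cudaGetDeviceCount(&count), "cudaGetDeviceCount")) return -1;
    if (count < 1) return -1;
    if (count > kMaxGpus) count = kMaxGpus;

    cudaStream_t stream[kMaxGpus];
    float* dev[kMaxGpus];
    float* host[kMaxGpus];
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Enqueue all work first so the devices have a chance to overlap.
    for (int d = 0; d < count; ++d) {
        if (!ok(cudaSetDevice(d), "cudaSetDevice")) return -1;
        if (!ok(cudaStreamCreate(&stream[d]), "cudaStreamCreate")) return -1;
        if (!ok(cudaMalloc(reinterpret_cast<void**>(&dev[d]), bytes),
                "cudaMalloc")) return -1;
        // Pinned host memory, so that cudaMemcpyAsync can run asynchronously.
        if (!ok(cudaMallocHost(reinterpret_cast<void**>(&host[d]), bytes),
                "cudaMallocHost")) return -1;
        for (int i = 0; i < n; ++i) host[d][i] = 1.0f;

        if (!ok(cudaMemcpyAsync(dev[d], host[d], bytes,
                                cudaMemcpyHostToDevice, stream[d]),
                "cudaMemcpyAsync H2D")) return -1;
        scale<<<(n + 255) / 256, 256, 0, stream[d]>>>(dev[d], n, 2.0f);
        if (!ok(cudaGetLastError(), "kernel launch")) return -1;
        if (!ok(cudaMemcpyAsync(host[d], dev[d], bytes,
                                cudaMemcpyDeviceToHost, stream[d]),
                "cudaMemcpyAsync D2H")) return -1;
    }

    // Wait for each GPU, check the result, then release its resources.
    for (int d = 0; d < count; ++d) {
        if (!ok(cudaSetDevice(d), "cudaSetDevice")) return -1;
        if (!ok(cudaStreamSynchronize(stream[d]), "cudaStreamSynchronize"))
            return -1;
        if (host[d][0] != 2.0f || host[d][n - 1] != 2.0f) return -1;
        if (!ok(cudaStreamDestroy(stream[d]), "cudaStreamDestroy")) return -1;
        if (!ok(cudaFree(dev[d]), "cudaFree")) return -1;
        if (!ok(cudaFreeHost(host[d]), "cudaFreeHost")) return -1;
        // Assumption: resetting the device before exit gives the profiler
        // a chance to flush its buffered trace data.
        if (!ok(cudaDeviceReset(), "cudaDeviceReset")) return -1;
    }
    return count;
}
```

Calling `run_on_gpus(1 << 20)` from `main()` and checking for a non-negative return value is enough to exercise every call in the list. The error checks also make sure that a failing CUDA call is not silently mistaken for missing trace data.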
I don't know what I did differently back when it worked, but it no longer works now.
Do I need an include in my CUDA project for CUDA tracing to work? That seems unlikely to me, since it has worked without one.
In case further information is needed, like source code from the project, I will be able to provide it.
General Information:
- Windows 10 (64-bit)
- 2x RTX 2080 Ti
- NVIDIA Nsight Systems 2019.3.3
EDIT: these are the analysis options of a run that worked:
Analysis options
Sampling frequency: 8,000 Hz
Collect thread activity: On
Collect backtraces: On
Collect NVTX trace: On
Collect CUDA trace: On
Collect DX12 trace: On
Collect Vulkan trace: Off
Trace fork before exec: Off
When I try these settings now, it doesn't output CUDA traces, and I get the same errors as above.