Hi… I'm trying to use Nsight Systems to profile both CPU and GPU activity. I previously had this working with nvvp, including NVTX annotations, and I'm trying to “upgrade” to the newer tool set. I found that Nsight Systems 2020.1 wasn't able to get any info about the GPU; a warning in Nsight Systems said to update the driver. I updated to the latest driver via GeForce Experience (nvidia-smi shows 445.87 and CUDA 11.0) and updated Nsight Systems to 2020.2. With this configuration, Nsight Systems can't connect to the application at all, and it also reports that it is not compatible with CUDA 11.0. What is the right combination of driver + Nsight Systems that should work on Windows 10?
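For context, the NVTX ranges come from the Python side. This is roughly the shape of it, a minimal sketch using the nvtx package; the function and range names here are placeholders, not my actual code:

```python
# Minimal sketch of how the NVTX ranges are emitted from Python (via the nvtx
# package). The names here are placeholders, not my real application.
import numpy as np
import nvtx

@nvtx.annotate("build_inputs", color="blue")
def build_inputs(n):
    return np.random.rand(n).astype(np.float32)

with nvtx.annotate("whole_run", color="green"):
    data = build_inputs(1_000_000)
    with nvtx.annotate("square", color="orange"):
        result = data * data
```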
Additional data: I reverted my driver version in Windows 10 Device Manager; nvidia-smi now shows driver 442.23 and CUDA version 10.2. With Nsight Systems 2020.2 I can get CPU details, but no GPU details. The error in Nsight Systems is “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”
I found the advanced search that lists historical GeForce Game Ready driver releases, but the descriptions don't show the CUDA version that corresponds to each driver. SMH. I couldn't find a table anywhere that maps driver releases to CUDA versions.
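For what it's worth, one way to see which CUDA driver API version an installed driver actually reports (the same number nvidia-smi prints) is to ask the driver directly. A quick sketch, assuming Windows with nvcuda.dll available on the system path:

```python
# Query the CUDA driver API version directly via cuDriverGetVersion.
# Sketch only; assumes Windows with nvcuda.dll available.
import ctypes

nvcuda = ctypes.WinDLL("nvcuda.dll")
version = ctypes.c_int(0)
status = nvcuda.cuDriverGetVersion(ctypes.byref(version))
assert status == 0, f"cuDriverGetVersion failed with CUresult {status}"
# Encoded as 1000*major + 10*minor, e.g. 10020 for CUDA 10.2, 11000 for CUDA 11.0
print(f"CUDA driver API version: {version.value // 1000}.{(version.value % 1000) // 10}")
```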
I attempted to reinstall CUDA 10.2, which looked like it would reinstall the driver and the NVIDIA CUDA tools at specific, matched versions that should therefore all be compatible. I didn't catch the exact driver version it was going to install, but I noted it was earlier than 442.23. After the CUDA installation, nvidia-smi still shows 442.23, so I presume the installer chose not to downgrade to the earlier driver.
CUDA 10.2 installed Nsight Systems 2019.5.2. With this version and driver 442.23, I can get GPU memory usage, but no info on GPU kernels. I do see NVTX annotations, however. The Nsight Systems error is “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”
I'm guessing that 441.41 must be the closest compatible driver.
(Side quest: I had a bit of a struggle getting a driver to install at all. It turned out I needed the DCH driver type rather than Standard.)
I was able to install driver 441.66 alongside CUDA 10.2. I'm seeing the same result with Nsight Systems 2019.5.2: it can see the CPU side of things, but no GPU results. Same error: “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”
And as another twist, Nsight Compute seems to run without any issues. I’m able to break on kernels and I get GPU utilization and analysis results in the report pane. So it looks like only Nsight Systems is borked.
It looks like the log is missing some information we need for the investigation. Could you try another way: copy “nvlog.config” into the working directory of the application you are profiling, collect another log file, and share it with us?
Here’s an abbreviated version of my PATH showing how I’m pointing to the CUPTI lib. (I had to add that to PATH manually because some other code/tool couldn’t find CUPTI, though I can’t recall at the moment which code or tool needed it. Perhaps this is part of what’s going on.)
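In case it helps, this is the kind of check I used to see which PATH entries mention CUPTI (just a quick sketch, not part of my application):

```python
# Quick sketch: print the PATH entries that mention CUPTI,
# i.e. the ones I added manually.
import os

for entry in os.environ.get("PATH", "").split(os.pathsep):
    if "cupti" in entry.lower():
        print(entry)
```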
The environment variable should not cause this issue, because we do not rely on it to find the CUPTI library; we carry our own copies under the Nsight Systems directory. However, you could try removing the additional CUPTI paths you added, just in case. If that does not fix the issue, could you collect another log following the same steps, but using Nsight Systems 2020.2 (i.e., our current latest version)?
Thanks for providing the log. We’ve been investigating it. Meanwhile, could you try profiling a simple NVIDIA sample app to verify whether this issue is related to your target application? You can follow the steps in CUDA Samples :: CUDA Toolkit Documentation to find and build the samples; I suggest trying “0_Simple/vectorAdd”. If possible, please attach the log for the sample app as well.
Something to be aware of: my application is Python using the numba library’s CUDA support, which builds CUDA kernels on the fly using LLVM and NVVM IR (I believe). Perhaps this is part of the issue. It’s curious that nvvp works fine while Nsight Systems does not, though; since nvvp works, it seems this should be possible.
Maybe your team needs to play with some simple numba cuda samples to see what happens on your end?
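For reference, a minimal repro along the lines of what I’m running (a hypothetical stand-in for my real app, which is more complex):

```python
# Minimal numba CUDA repro (hypothetical stand-in for my real application).
# numba JIT-compiles the kernel via NVVM the first time it is launched.
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)
    if i < out.size:
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)

d_a = cuda.to_device(a)                  # explicit H2D copies show up in the timeline
d_b = cuda.to_device(b)
d_out = cuda.device_array_like(d_a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](d_a, d_b, d_out)   # kernel launch
cuda.synchronize()
result = d_out.copy_to_host()
```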
Thanks for sharing the information. I am now able to reproduce this issue on my side using a python script with numba to generate CUDA kernels. We are looking into it.