Latest Nsight Systems and Nvidia Driver aren't compatible?

Hi… trying to use Nsight Systems to graph CPU and GPU activity. I previously got this working with nvvp, including NVTX annotations, and am trying to “upgrade” to the newer tool set. Nsight Systems 2020.1 wasn’t able to get any info about the GPU; a warning in Nsight Systems said to update the driver. I updated via GeForce Experience to the latest driver (nvidia-smi shows 445.87 and CUDA 11.0) and updated Nsight Systems to 2020.2. With this configuration, Nsight Systems can’t connect to the application at all, and it also reports it is not compatible with CUDA 11.0. What is the right combination of driver + Nsight Systems that should work on Windows 10?

Thanks,

  • Josh.

Additional data: I reverted my driver version in the Windows 10 Device Manager. nvidia-smi now shows driver 442.23 and CUDA version 10.2. With Nsight Systems 2020.2 I can get CPU details, but no GPU details. The error in Nsight Systems is “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”

Found an advanced search that shows historical GeForce Game Ready driver releases, but the descriptions don’t show the corresponding CUDA version of each driver. SMH. I couldn’t find a table anywhere that maps driver releases to CUDA versions.

Attempted to reinstall CUDA 10.2, which looked like it would reinstall the driver and NVIDIA CUDA tools at specific, mutually compatible versions. I didn’t catch the exact driver version it was going to install, but noted that it was earlier than 442.23. After the CUDA installation, nvidia-smi still shows 442.23, so I presume the CUDA installer chose not to downgrade the driver.

CUDA 10.2 installed Nsight Systems 2019.5.2. With this version and driver 442.23, I can get GPU memory usage, but no info on GPU kernels. I do see NVTX annotations, however. The Nsight Systems error is “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”

Reran the CUDA 10.2 install. It wanted to install driver 441.22.

Looked through the historical driver installs. There’s no 441.22 that’s available to install. ::cry::

Guessing that 441.41 must be the closest compatible driver.

(Side quest: Had a bit of a struggle getting a driver to install. Ended up that I needed DCH driver type rather than standard.)

Was able to install driver 441.66 with CUDA 10.2. Seeing the same result with Nsight Systems 2019.5.2: it can see the CPU side of things but no GPU results. “Incompatible CUDA driver version. Please try updating the CUDA driver or use more recent profiler version.”

I’m out of ideas.

And just to confirm, nvvp works fine with this same setup (441.66, CUDA 10.2, Visual Profiler v 10.2)

And as another twist, Nsight Compute seems to run without any issues. I’m able to break on kernels and I get GPU utilization and analysis results in the report pane. So it looks like only Nsight Systems is borked.

Hi jrjbertram, could you share your report files together with your CUDA driver version (output of nvidia-smi) and Nsight Systems version?

Uploaded the report (renamed to have a .log extension so it would attach to the post.)

Report 19.qdrep.log (2.1 MB)

Sun May 03 20:37:16 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 441.66       Driver Version: 441.66       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   51C    P8     3W /  N/A |    182MiB /  8192MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1312    C+G   ...019.4\Monitor\Common\Nsight.Monitor.exe N/A      |
+-----------------------------------------------------------------------------+

Nsight systems version in this case is 2019.5.2.

Still looking for any combination that allows Nsight Systems to work; none of the combinations I have tried have worked on Windows 10.

Hi jrjbertram,

Thanks for your response. Looks like there is some issue with loading the CUPTI library. Could you collect logs for us to investigate further?

You can turn on logging with the following steps:

  1. Copy “nvlog.config.template” from “C:\Program Files\NVIDIA Corporation\Nsight Systems\host-windows-x64” to a location of your choice.

  2. Rename “nvlog.config.template” to “nvlog.config”.

  3. Set the environment variable “NVLOG_CONFIG_FILE” to the path of the “nvlog.config” file you just created.

  4. Run a normal collection; you should then find a “nsight-sys.log” file in the same directory as the “nvlog.config” file.

Please share the “nsight-sys.log” file you collected. Thanks!
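The steps above can be sketched as a small Python helper (paths here are illustrative; on the poster’s machine the template lives under C:\Program Files\NVIDIA Corporation\Nsight Systems\host-windows-x64, and the function name is made up for this sketch):

```python
import os
import shutil

def enable_nsys_logging(template_path, dest_dir):
    """Copy nvlog.config.template into dest_dir as nvlog.config and point
    NVLOG_CONFIG_FILE at it, so the next collection writes nsight-sys.log
    next to the config file.

    Note: os.environ only affects this process and its children; to make
    the variable visible to the Nsight Systems host application, set it
    system-wide instead (e.g. via System Properties on Windows)."""
    config_path = os.path.join(dest_dir, "nvlog.config")
    shutil.copy(template_path, config_path)
    os.environ["NVLOG_CONFIG_FILE"] = config_path
    return config_path
```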


liuyis (Employee), May 4

nsight-sys.log (1.31 MB)

Hi jrjbertram,

Looks like the log is missing some information we need for the investigation. Could you try another way: copy “nvlog.config” to the working directory of the application you are profiling, collect another log file, and share it with us.

Thanks!

Do I need to clear the environment variable as well?

Yes

Seeing a CUPTI error in the log now.

Here’s an abbreviated version of my PATH showing how I’m pointing to the CUPTI lib. (I had to add that to PATH manually because some other code or tool couldn’t find CUPTI, though I can’t recall at the moment which one. Perhaps this is part of what’s going on.)

C:\Users\josh>echo %PATH%
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\CUPTI\lib64;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;

nsight-sys.log (74.6 KB)

Hi jrjbertram,

The environment variable should not cause this issue, because we do not rely on it to find the CUPTI library; we carry our own copies under Nsight Systems’ directory. However, you could try removing the additional CUPTI paths you added, just in case. If that does not fix the issue, could you collect another log, following the same steps, using Nsight Systems 2020.2 (i.e. our current latest version)?

Thanks.

Removing CUPTI lib dir from path made no difference.

Log from Nsight Systems 2020.2.1.

nsight-sys.log (43.6 KB)

Hi Josh,

Thanks for providing the log. We’ve been investigating it. Meanwhile, could you try profiling a simple NVIDIA sample app to verify whether this issue is related to your target application? You can follow the steps in https://docs.nvidia.com/cuda/cuda-samples/index.html#building-samples to find and build the samples. I suggest trying “0_Simple/vectorAdd”. If possible, please attach the log for the sample app as well.

Thanks

64-bit debug build of sample app worked with Nsight Systems 2019.5.2, log attached.

Also confirmed that the release build of the sample app worked as well (log not attached).

nsight-sys.log (27.5 KB)

64-bit debug build of sample app also worked with Nsight Systems 2020.2.1, log attached.

nsight-sys.log (58.5 KB)

Something to be aware of: I’m invoking Python using the numba library’s CUDA support, which builds CUDA kernels on the fly using LLVM and NVVM IR (I believe). Perhaps this is part of the issue. It’s curious that nvvp works fine while Nsight Systems does not, however. Since nvvp works, it seems it should be possible.

Maybe your team needs to play with some simple numba CUDA samples to see what happens on your end?

  • Josh.

Hi Josh,

Thanks for sharing the information. I am now able to reproduce this issue on my side using a python script with numba to generate CUDA kernels. We are looking into it.

Best,
Liuyi