Nsight Systems HPC Linux installation


I am having an issue getting Nsight Systems to work as expected.

I am working on an 8x V100 server within a Linux HPC cluster. I have a CUDA 10.1 module installed and loaded.

I then tried to install Nsight Systems:


but now I get:
Error: Nsight Systems 2019.3.7 hasn’t been installed with CUDA Toolkit 10.1

How do I fix this, preferably so I have a module?

If you install the standalone Nsight Systems RPM downloaded from the developer zone, it is installed to /opt/nvidia/nsight-systems, so just add /opt/nvidia/nsight-systems/<version>/bin to your PATH. The one you are currently running is the copy bundled with the CUDA toolkit in /usr/local/cuda. (I can explain how to resolve that one if you need.)
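As a rough sketch of the PATH change, something like the following should work in a bash-style shell. The version string 2020.3.1 is only an example value; substitute whichever directory actually exists under /opt/nvidia/nsight-systems on your system. The variable names NSYS_VERSION and NSYS_BIN are made up for illustration.

```shell
# Example version only (an assumption); use the directory that really
# exists under /opt/nvidia/nsight-systems on your machine.
NSYS_VERSION="2020.3.1"
NSYS_BIN="/opt/nvidia/nsight-systems/${NSYS_VERSION}/bin"

# Prepend, so this nsys is found before the copy bundled with the CUDA toolkit.
export PATH="${NSYS_BIN}:${PATH}"

# Report which nsys the shell resolves now, if any.
if command -v nsys >/dev/null 2>&1; then
    echo "nsys resolves to: $(command -v nsys)"
    nsys --version
else
    echo "nsys not found; check that ${NSYS_BIN} exists"
fi
```

Prepending rather than appending matters here: if the CUDA toolkit's bin directory is already on PATH, an appended entry would still lose to the bundled nsys.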

Hi, thanks for your help. I now have two modules: a CUDA 10.1 module and an nsight-systems 2020.3.1 module.

As you can see below, I loaded both modules, rebuilt my test app, and then ran the nsys profiler, but I still do not get the expected output.

What am I doing wrong? Any ideas?



[rregan@gn001 Managing-Memory]$ module load nsight-systems/2020.3.1
[rregan@gn001 Managing-Memory]$ module list
Currently Loaded Modulefiles:

  1) cuda/10.1   2) nsight-systems/2020.3.1

[rregan@gn001 Managing-Memory]$ nvcc -o singlethread-vector-add 01-vector-add.cu
[rregan@gn001 Managing-Memory]$ ll
total 2124
-rw-r--r-- 1 rregan dphlss 1953 Jul 30 10:44 01-vector-add.cu
-rw-r--r-- 1 rregan dphlss 98292 Jul 30 11:15 report1.qdrep
-rw-r--r-- 1 rregan dphlss 151552 Jul 30 11:15 report1.sqlite
-rw-r--r-- 1 rregan dphlss 98228 Jul 30 11:17 report2.qdrep
-rw-r--r-- 1 rregan dphlss 147456 Jul 30 11:17 report2.sqlite
-rw-r--r-- 1 rregan dphlss 97803 Jul 30 16:54 report3.qdrep
-rw-r--r-- 1 rregan dphlss 131072 Jul 30 16:54 report3.sqlite
-rw-r--r-- 1 rregan dphlss 98187 Jul 30 17:05 report4.qdrep
-rw-r--r-- 1 rregan dphlss 151552 Jul 30 17:05 report4.sqlite
-rw-r--r-- 1 rregan dphlss 87454 Jul 30 17:50 report5.qdrep
-rw-r--r-- 1 rregan dphlss 122880 Jul 30 17:50 report5.sqlite
-rwxr-xr-x 1 rregan dphlss 639248 Jul 31 10:51 singlethread-vector-add

[rregan@gn001 Managing-Memory]$ nsys profile --stats=true ./singlethread-vector-add
Collecting data…

The target application terminated with signal 11 (SIGSEGV)
Processing events…
Capturing symbol files…
Saving temporary “/tmp/nsys-report-0c63-94a7-cf5c-8c10.qdstrm” file to disk…
Creating final output files…

Processing [==============================================================100%]
Saved report file to “/tmp/nsys-report-0c63-94a7-cf5c-8c10.qdrep”
Exporting 1018 events: [==================================================100%]

Exported successfully to

Generating CUDA API Statistics…
CUDA API Statistics (nanoseconds)

CUDA trace data was not collected.

Generating Operating System Runtime API Statistics…
Operating System Runtime API Statistics (nanoseconds)

Generating NVTX Push-Pop Range Statistics…
NVTX Push-Pop Range Statistics (nanoseconds)

Report file moved to “/cosma/home/rregan/Projects/GPU/DLI/Fundamentals-of-Accerated-Computing/C/Managing-Memory/report6.qdrep”
Report file moved to “/cosma/home/rregan/Projects/GPU/DLI/Fundamentals-of-Accerated-Computing/C/Managing-Memory/report6.sqlite”
[rregan@gn001 Managing-Memory]$

It looks like your code is segfaulting on the host. Does it run correctly without the profiler?
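One quick way to check is to run the binary on its own and look at the exit status. The sketch below assumes a bash-style shell, where a process killed by signal N is reported as status 128 + N, so 139 would correspond to signal 11 (SIGSEGV). The helper name describe_status is made up for illustration.

```shell
# Translate a shell exit status into a readable message.
# Shells report "killed by signal N" as status 128 + N.
describe_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with error code $status"
    fi
}

# Run the application without nsys, if the binary is in this directory.
if [ -x ./singlethread-vector-add ]; then
    rc=0
    ./singlethread-vector-add || rc=$?
    describe_status "$rc"
fi
```

If this also reports signal 11, the crash is in the application itself rather than something the profiler introduced.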