Nsys segfault during profile: pthread_create when connected over ssh with X11

uday1 · August 26, 2021, 12:20pm

This is on Ubuntu 20.04 LTS with NVIDIA Nsight Systems version 2021.3.1.54-ee9c30a (the result is the same whether it is installed via the installer or the .deb package). The GPU is an RTX 3090. The profile step segfaults:

$ nsys profile ./binary
Warning: LBR backtrace method is not supported on this platform. DWARF backtrace method will be used.
Segmentation fault (core dumped)

A gdb backtrace shows:

Thread 11 "nsys" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffee7fc700 (LWP 2472241)]
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00000000007ee9b1 in ?? ()
#2 0x00000000007eed86 in ?? ()
#3 0x00000000007ef39b in ?? ()
#4 0x00000000007e6a6a in ?? ()
#5 0x000000000072539b in ?? ()
#6 0x0000000000727563 in ?? ()
#7 0x00000000006ae7b6 in ?? ()
#8 0x00000000006aec7f in ?? ()
#9 0x00000000006aed55 in ?? ()
#10 0x00000000006aee0b in ?? ()
#11 0x00000000006af92a in ?? ()
#12 0x00000000006bbfa9 in ?? ()
#13 0x00000000006bf13b in ?? ()
#14 0x00000000006b18c5 in ?? ()
#15 0x00000000006c9809 in ?? ()
#16 0x00000000006ca045 in ?? ()
#17 0x00000000006ca094 in ?? ()
#18 0x00000000015602c0 in ?? ()
#19 0x00007ffff7f96609 in start_thread (arg=) at pthread_create.c:477
#20 0x00007ffff7d3b293 in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:95

Unexpectedly, the segfault goes away when X11 forwarding is turned off for the ssh session (I was connected over ssh). I found this workaround in another forum post here from about two years ago: Nsight system profile segmentation fault on startup

It appeared odd that a command-line utility would end up being sensitive to X11 forwarding.
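For anyone hitting the same crash, here is a rough sketch of how to check whether the current shell is inside an ssh session with X11 forwarding. It assumes the usual sshd behavior of setting SSH_CONNECTION for remote sessions and DISPLAY when forwarding is enabled; the function name x11_forwarded is made up for illustration and is not part of nsys or ssh.

```shell
#!/usr/bin/env bash
# Sketch only: a heuristic check, not an nsys feature.
# Assumption: sshd sets SSH_CONNECTION for remote logins and sets DISPLAY
# only when X11 forwarding is active for the session.

# Returns 0 if this looks like an ssh session with an X11 display attached.
x11_forwarded() {
    [ -n "${SSH_CONNECTION:-}" ] && [ -n "${DISPLAY:-}" ]
}

if x11_forwarded; then
    echo "X11 forwarding appears to be active (DISPLAY=${DISPLAY})"
else
    echo "No forwarded X11 display detected"
fi
```

If forwarding is reported as active, reconnecting with ssh -x (which disables X11 forwarding for that connection) corresponds to the workaround described above.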

hwilper · August 27, 2021, 1:43am

@rknight could you take a look at this one?

rknight · August 27, 2021, 6:06pm

Hi uday1,

What specific data are you trying to capture? For example, are you trying to trace CUDA, NVTX, the OS Runtime, or OpenGL? Are you interested in CPU IP/backtrace sampling? All of these data collectors are enabled by default when you use the nsys command line interface (CLI).

I suspect this might be an OpenGL issue. I suggest limiting the collection to one or two data collectors and trying again. For example, try the following command line:

nsys profile --trace=cuda --sample=cpu ./binary

Then, add each data collector you need one at a time until the issue happens again. Please let us know what you find out.
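The one-at-a-time approach can be scripted roughly as follows. This is a sketch under assumptions: the collector order (cuda, nvtx, osrt, opengl) is just one possible choice, the helper names trace_steps and bisect_collectors are hypothetical, and ./binary is the placeholder from this thread. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch only: grow the --trace list one collector at a time to see which
# addition brings the crash back. Helper names are hypothetical.

# Print cumulative trace lists, one per line: "cuda", "cuda,nvtx", ...
trace_steps() {
    local list=""
    local c
    for c in "$@"; do
        list="${list:+$list,}$c"   # append with a comma unless list is empty
        printf '%s\n' "$list"
    done
}

# Print (and optionally run) one nsys command per cumulative trace list.
# DRY_RUN=1 (the default) prints only; DRY_RUN=0 runs nsys as well.
bisect_collectors() {
    local target="$1"
    local traces
    for traces in $(trace_steps cuda nvtx osrt opengl); do
        echo "nsys profile --trace=$traces --sample=cpu $target"
        if [ "${DRY_RUN:-1}" = "0" ]; then
            # Assumption: a crash in nsys shows up as a non-zero exit status.
            nsys profile --trace="$traces" --sample=cpu "$target" || {
                echo "first failure with --trace=$traces" >&2
                return 1
            }
        fi
    done
}

bisect_collectors ./binary
```

The last trace list printed before a failure is the largest set that still worked, so the collector added in the failing step is the first one to suspect.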

uday1 · October 4, 2021, 7:20am

Sorry about the delay in getting back here. I’m profiling a CUDA binary and am only interested in GPU performance statistics, for example, the kernel execution times. This is how I normally run:

$ nsys profile …
$ nsys stats -q --report gpukernsum report13.qdrep

My binary makes no OpenGL calls; it only calls the CUDA runtime to launch and run a CUDA kernel. When profiling works, this is what the output looks like:

$ nsys profile …
Warning: LBR backtrace method is not supported on this platform. DWARF backtrace method will be used.
Collecting data...
Processing events...
Saving temporary "/tmp/nsys-report-6554-ca1c-479d-cb38.qdstrm" file to disk...

Creating final output files...
Processing [===============================================================100%]
Saved report file to "/tmp/nsys-report-6554-ca1c-479d-cb38.qdrep"
Report file moved to "/home/uday/project-dir/report2.qdrep"

$ nsys stats -q --report gpukernsum report2.qdrep

Time(%)  Total Time (ns)  Instances  Average (ns)  Minimum (ns)  Maximum (ns)  StdDev (ns)  Name
  100.0           50,559          1      50,559.0        50,559        50,559          0.0  main_kernel

I tried it with --trace=cuda --sample=cpu and this avoids the crash. Thanks!
