Nv-nsight-cu-cli segfault on nsight-compute-2020.1.2

I’m encountering the same error, return code (11), described in “Nv-nsight-cu-cli segfault”.

I collected some output from dmesg and gdb; here is part of it:

Normal run:
jason@tyan:~$ nv-nsight-cu-cli ./NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul
[Matrix Multiply Using CUDA] - Starting…
==PROF== Connected to process 8189 (/home/jason/NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul)
==ERROR== The application returned an error code (11).
==WARNING== No kernels were profiled.
==WARNING== Profiling kernels launched by child processes requires the --target-processes all option.

dmesg:
[264668.146501] matrixMul[8258]: segfault at 7ffe0fb45fe4 ip 00007f5fbce6b2b6 sp 00007ffe0fb25710 error 6 in libcuda-injection.so[7f5fbc37d000+12dc000]
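
For anyone reading the dmesg line above, here is a rough sketch of how its fields can be decoded. It assumes the usual x86 Linux layout of the message (fault address, ip, sp, a hexadecimal page-fault error code, then the mapping base and size in brackets) and the standard x86 error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode). The helper name decode_segfault is made up for illustration and is not part of any NVIDIA tool.

```python
import re

# Segfault line copied from the dmesg output above (timestamp dropped).
LINE = ("matrixMul[8258]: segfault at 7ffe0fb45fe4 ip 00007f5fbce6b2b6 "
        "sp 00007ffe0fb25710 error 6 in "
        "libcuda-injection.so[7f5fbc37d000+12dc000]")

# Assumed message layout; all numeric fields are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a dmesg segfault line into useful fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        "mapping_size": int(m["size"], 16),
        # x86 page-fault error-code bits (assumption stated in the text).
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LINE)
print(f"{info['library']} + {info['offset']:#x}")
print("write access:", info["write"], "| user mode:", info["user_mode"],
      "| page present:", info["page_present"])
```

Under those assumptions, error 6 reads as a user-mode write to a page that was not mapped, and the faulting instruction sits at offset 0xaee2b6 inside the libcuda-injection.so mapping. That offset may be useful to quote when reporting the crash, since the library has no symbols.
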

with gdb:
jason@tyan:~$ nv-nsight-cu-cli ./NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright © 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type “show copying”
and “show warranty” for details.
This GDB was configured as “x86_64-linux-gnu”.
Type “show configuration” for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type “help”.
Type “apropos word” to search for commands related to “word”…
Reading symbols from /usr/local/cuda-11.0/bin/…/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/ncu…(no debugging symbols found)…done.
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/ncu ./NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
[New Thread 0x7ffff6bff700 (LWP 8432)]
[New Thread 0x7ffff63fe700 (LWP 8433)]
[New process 8434]
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
process 8434 is executing new program: /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/TreeLauncherSubreaper
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
[New process 8438]
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
process 8438 is executing new program: /home/jason/NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
[Matrix Multiply Using CUDA] - Starting…
[New Thread 0x7ffff1e58700 (LWP 8440)]
[New Thread 0x7ffff1657700 (LWP 8441)]
[New Thread 0x7ffff0c52700 (LWP 8442)]
==PROF== Connected to process 8438 (/home/jason/NVIDIA_CUDA-11.0_Samples/0_Simple/matrixMul/matrixMul)
[New Thread 0x7fffebfff700 (LWP 8444)]

Thread 3.1 “matrixMul” received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fdea40 (LWP 8438)]
0x00007ffff58442b6 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
(gdb) bt
#0 0x00007ffff58442b6 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#1 0x00007ffff583b54e in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#2 0x00007ffff5844932 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#3 0x00007ffff584501c in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#4 0x00007ffff5824e6a in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#5 0x00007ffff5712b6d in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#6 0x00007ffff56c474b in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#7 0x00007ffff5099f43 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#8 0x00007ffff507ef69 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#9 0x00007ffff4edda9a in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#10 0x00007ffff2243626 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#11 0x00007ffff20cd6b5 in cuInit () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#12 0x00007ffff4f8aa37 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#13 0x00007ffff4f8ac98 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#14 0x00007ffff504cb52 in ?? () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#15 0x00007ffff4f59004 in cuInit () from /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
#16 0x0000555555570569 in cudart::__loadDriverInternalUtil() ()
#17 0x00007ffff45c8827 in __pthread_once_slow (once_control=0x5555557e2a90 <cudart::globalState::loadDriver()::loadDriverControl>, init_routine=0x5555555704d0 <cudart::__loadDriverInternalUtil()>) at pthread_once.c:116
#18 0x00005555555ad329 in cudart::cuosOnce(int*, void (*)()) ()
#19 0x000055555556d743 in cudart::globalState::initializeDriver() ()
#20 0x000055555558b563 in cudaGetDeviceCount ()
#21 0x000055555555cd6c in gpuGetMaxGflopsDeviceId() ()
#22 0x000055555555d04e in findCudaDevice(int, char const**) ()
#23 0x000055555555bcbd in main ()
(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007ffff7dd5f10  0x00007ffff7df4b50  Yes         /lib64/ld-linux-x86-64.so.2
0x00007ffff4eaed00  0x00007ffff5a295ec  Yes (*)     /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libcuda-injection.so
0x00007ffff49f5ce0  0x00007ffff4b04bdc  Yes (*)     /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libTreeLauncherTargetUpdatePreloadInjection.so
0x00007ffff47da200  0x00007ffff47dd70c  Yes         /lib/x86_64-linux-gnu/librt.so.1
0x00007ffff45bebb0  0x00007ffff45cd101  Yes         /lib/x86_64-linux-gnu/libpthread.so.0
0x00007ffff43b5e50  0x00007ffff43b6bee  Yes         /lib/x86_64-linux-gnu/libdl.so.2
0x00007ffff40b8490  0x00007ffff41679de  Yes (*)     /usr/lib/x86_64-linux-gnu/libstdc++.so.6
0x00007ffff3e16ac0  0x00007ffff3e2736d  Yes (*)     /lib/x86_64-linux-gnu/libgcc_s.so.1
0x00007ffff3a442d0  0x00007ffff3bbceac  Yes         /lib/x86_64-linux-gnu/libc.so.6
0x00007ffff3820e70  0x00007ffff382193a  Yes         /lib/x86_64-linux-gnu/libutil.so.1
0x00007ffff348da80  0x00007ffff354c1d5  Yes         /lib/x86_64-linux-gnu/libm.so.6
0x00007ffff1f1d8f0  0x00007ffff22c3fd4  Yes (*)     /usr/lib/x86_64-linux-gnu/libcuda.so.1
0x00007ffff0c53710  0x00007ffff0c54baf  Yes         /usr/lib/x86_64-linux-gnu/gconv/UTF-32.so
0x00007fffe9944b00  0x00007fffeaf680ac  Yes (*)     /usr/local/cuda-11.0/nsight-compute-2020.1.2/target/linux-desktop-glibc_2_11_3-x64/./libnvperfapi64.so
(*): Shared library is missing debugging information.