Profiling my current app with nsys crashes it on the first memory access (tested with cudaMemset and cudaMemcpy/cuMemcpyHtoD). I’m invoking it simply as nsys nvprof my_app.
Here’s the stack trace I get when the app is compiled in debug mode:
#13 Object "MY_LIB.so", at 0x71217c8a875e, in initFunction(Arg1&, Arg2&, CUstream_st*)
#12 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218dee9c89, in
#11 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218dd3ab81, in
#10 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218e0bc99f, in
#9 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218dd40545, in
#8 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218e0e1885, in
#7 Object "/lib/x86_64-linux-gnu/libcuda.so.1", at 0x71218e0e1663, in
#6 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x7121276fb83b, in
#5 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x7121276f1ace, in
#4 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x712127737fc8, in
#3 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x7121277093b4, in
#2 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x712127708fbf, in
#1 Object "/opt/nvidia/nsight-systems/2022.1.3/target-linux-x64/libcupti.so.11.7", at 0x71212771d0e3, in
#0 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x712196297ef4, in pthread_mutex_lock
System info:
Ubuntu 22.04
Driver Version: 550.90.07
CUDA Version: 12.4
NVIDIA Nsight Systems version 2022.1.3.3-1c7b5f7