+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1921 G /usr/lib/xorg/Xorg 1006MiB |
| 0 N/A N/A 2146 G /usr/bin/gnome-shell 93MiB |
| 0 N/A N/A 15502 G …9901597,15340099953604003785,131072 286MiB |
| 0 N/A N/A 16520 C+G …tems/2023.2.3/target-linux-x64/nsys 4MiB |
+---------------------------------------------------------------------------------------+
qtfan@legion:~$ nsys status -e
Timestamp counter supported: Yes
CPU Profiling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 2
Linux Distribution = Ubuntu
Linux Kernel Version = 6.5.0-21-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Not Available
CPU Profiling Environment (process-tree): OK
CPU Profiling Environment (system-wide): Fail
Assuming that you are running from the command line, can you try adding "--trace=cuda,nvtx" to your command? This bypasses the OS runtime trace, which sometimes has issues on complicated systems.
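For concreteness, the suggested invocation might look like the sketch below. The `deviceQuery` binary name is only an illustration taken from later in this thread; substitute your own application.

```shell
# Sketch: restrict tracing to CUDA and NVTX, bypassing the OS runtime trace.
# "deviceQuery" stands in for whatever application you are profiling.
APP=./deviceQuery
CMD="nsys profile --trace=cuda,nvtx $APP"
echo "$CMD"   # inspect the command; run it with: eval "$CMD"
```

Printing the command first makes it easy to confirm the flags before launching a profile that may hang.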
Hi @hwilper,
I used the --trace=cuda option, e.g.: nsys profile --trace=cuda deviceQuery
In this deviceQuery example, I confirmed that it gets stuck at the first CUDA API call, cudaGetDeviceCount().
It is very strange; I was actually able to run Nsight Systems a few weeks ago, but now it hangs. I have no idea what changed on my system, and there is no clue about what the problem is.
I checked the nvidia-smi command; it shows the Xorg and gnome-shell processes before the nsys command. I also tried rebooting the system and updating it to the latest repo, but that did not help.
If it still does not work, could you try collecting logs with the following steps:
Save the following content to nvlog.config:
+ 75iwef global
- quadd_verbose_
$ /tmp/nsight-sys.log
ForceFlush
Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]:$text
Add NVLOG_CONFIG_FILE=<path to 'nvlog.config'> to your Nsys CLI command line, for example NVLOG_CONFIG_FILE=/tmp/nvlog.config nsys profile --trace=cuda ...
Run the command as usual, and if the logging works as expected, there should be a log file at /tmp/nsight-sys.log. Share the file with us and we will try to figure out why it hangs.
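The steps above can be scripted as follows. The paths are the ones suggested in this thread; the final nsys line is shown as a comment because the exact application to profile varies.

```shell
# Write the logging config suggested above to /tmp/nvlog.config.
cat > /tmp/nvlog.config <<'EOF'
+ 75iwef global
- quadd_verbose_
$ /tmp/nsight-sys.log
ForceFlush
Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]:$text
EOF
# Then run the profile with the config picked up via the environment, e.g.:
#   NVLOG_CONFIG_FILE=/tmp/nvlog.config nsys profile --trace=cuda ./deviceQuery
# Afterwards the log should appear at /tmp/nsight-sys.log.
```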
Also, could you try the option --trace=none? This is not a solution; it just helps us confirm whether the issue is related to CUDA trace.
@liuyis I tried 2024.1, and it still does not work. I uploaded logs following your steps. (The log was collected after I terminated the hanging process with Ctrl+C.)
@liuyis The system is on a local network; I am not sure how to access it from outside.
I will try reinstalling everything from scratch. If you have any other way to debug this, please let me know.
One thing we can try is for you to attach GDB to the application when it hangs and capture the backtrace. However, we cannot share debugging versions of the libraries with you, so I'm not sure a meaningful backtrace can be captured on your side.
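A minimal way to capture that backtrace, assuming GDB is installed and the hung process can be found with pgrep (both assumptions; adjust the process name to your setup):

```shell
# Sketch: build a GDB batch command that dumps backtraces of all threads
# of the hung process. pgrep -n picks the newest matching process.
PID="$(pgrep -n nsys || true)"
GDB_CMD="gdb -p $PID -batch -ex 'thread apply all bt'"
echo "$GDB_CMD"
# Attaching needs ptrace permission (run via sudo, or set
# kernel.yama.ptrace_scope=0); redirect the output to a file to share it:
#   eval "$GDB_CMD" > backtrace.txt
```

Even without debug symbols, the raw addresses and library names in the backtrace can hint at where the hang occurs.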