I installed nsys in a Docker container.
System: Ubuntu 20.04, x86
Kernel: 5.10.134-16.1.al8.x86_64
nsys status -e
Timestamp counter supported: Yes
CPU Profiling Environment Check
Root privilege: enabled
Linux Kernel Paranoid Level = 2
Linux Distribution = Ubuntu
Linux Kernel Version = 5.10.134-16.1.al8.x86_64: OK
Linux perf_event_open syscall available: Fail
Sampling trigger event available: Fail
Intel(c) Last Branch Record support: Not Available
CPU Profiling Environment (process-tree): Fail
CPU Profiling Environment (system-wide): Fail
See the product documentation at https://docs.nvidia.com/nsight-systems for more information,
including information on how to set the Linux Kernel Paranoid Level.
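To make the failing checks easier to spot, here is a minimal sketch that scans the text printed by `nsys status -e` for lines ending in `Fail`. The function name `failed_checks` is my own (hypothetical), and the sample text is copied from the output above; it only parses text and does not call nsys.

```python
from typing import List

def failed_checks(status_output: str) -> List[str]:
    """Return the names of checks whose value is exactly 'Fail'."""
    failed = []
    for line in status_output.splitlines():
        # Split on the LAST colon so values such as 'x86_64: OK' parse correctly.
        name, sep, value = line.rpartition(":")
        if sep and value.strip() == "Fail":
            failed.append(name.strip())
    return failed

SAMPLE = """\
Timestamp counter supported: Yes
Root privilege: enabled
Linux Kernel Version = 5.10.134-16.1.al8.x86_64: OK
Linux perf_event_open syscall available: Fail
Sampling trigger event available: Fail
Intel(c) Last Branch Record support: Not Available
CPU Profiling Environment (process-tree): Fail
CPU Profiling Environment (system-wide): Fail
"""

if __name__ == "__main__":
    for check in failed_checks(SAMPLE):
        print("FAILED:", check)
```

In a real run, the text could come from `subprocess.run(["nsys", "status", "-e"], capture_output=True, text=True).stdout` instead of `SAMPLE`.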
I ran this command on the GPU server:
nsys profile --kill=sigterm --force-overwrite=true -o work/private/test1 --export=sqlite --trace=cuda,cudnn,cublas,osrt,nvtx python -u client.py --xxx xxxx --model_dir xxxxx
2024-07-30 09:32:14 - INFO - models init succeed!
2024-07-30 09:32:30 - INFO - No task, sleeping
2024-07-30 09:32:40 - INFO - No task, sleeping
2024-07-30 09:32:50 - INFO - No task, sleeping
2024-07-30 09:33:00 - INFO - No task, sleeping
^C
Generating '/tmp/nsys-report-0f1f.qdstrm'
Importer error status: An unknown error occurred.
Unable to retrieve the importer version: skipping importation of the QDSTRM file.
Generated:
/workspace/work/private/test1.qdstrm
I would also like to know which file I should import on my local Mac (OS X). Is it the *.qdstrm file? I tried it, but the import does not work:
the import gets stuck here and cannot continue,
and the dialog cannot be closed.
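As far as I understand, the file the GUI normally opens is the `.nsys-rep` report, while the `.qdstrm` file is the intermediate stream that nsys leaves behind when the import step is skipped, as in the log above. A small sketch of that distinction, assuming only those two extensions matter (the helper name `report_kind` is hypothetical):

```python
from pathlib import Path

def report_kind(path: str) -> str:
    """Classify an nsys output file by its extension (assumed convention)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".nsys-rep":
        return "final report"
    if suffix == ".qdstrm":
        return "intermediate stream"
    return "unknown"

if __name__ == "__main__":
    # Path taken from the 'Generated:' line of the log above.
    print(report_kind("/workspace/work/private/test1.qdstrm"))
```

With the run above only `test1.qdstrm` was generated, so there is no final report to open yet.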
I added --privileged=true to docker run, which fixed the failures in nsys status -e:
Timestamp counter supported: Yes
CPU Profiling Environment Check
Root privilege: enabled
Linux Kernel Paranoid Level = 2
Linux Distribution = Ubuntu
Linux Kernel Version = 5.10.134-16.1.al8.x86_64: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
CPU Profiling Environment (process-tree): OK
CPU Profiling Environment (system-wide): OK
See the product documentation at https://docs.nvidia.com/nsight-systems for more information,
including information on how to set the Linux Kernel Paranoid Level.
But I still get the same error.
I also tried nsys launch to monitor my app, and I still get the same error:
window A
nsys start --stop-on-exit=false
nsys launch --trace=cuda,cudnn,cublas,osrt,nvtx python -u client.py --task_host https://dev-aip.textile-story.com/aigc-scheduler-server --model_dir /data/modeldata/aigc-fg-gen/v1.9 --conf default.json
window B
nsys stop
nsys shutdown --kill sigkill
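For reference, the two-window sequence above can be written down as one ordered list of commands. This is only a sketch that builds and prints the command lines (it does not execute nsys); the helper name `session_commands` is hypothetical, and the flags are exactly the ones I used above.

```python
import shlex
from typing import List

TRACE = "cuda,cudnn,cublas,osrt,nvtx"

def session_commands(app_argv: List[str]) -> List[List[str]]:
    """Return the nsys commands of the interactive session, in order."""
    return [
        ["nsys", "start", "--stop-on-exit=false"],           # window A
        ["nsys", "launch", f"--trace={TRACE}", *app_argv],   # window A
        ["nsys", "stop"],                                    # window B
        ["nsys", "shutdown", "--kill", "sigkill"],           # window B
    ]

if __name__ == "__main__":
    for cmd in session_commands(["python", "-u", "client.py"]):
        print(shlex.join(cmd))
```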
I installed it from the "Nsight Systems - Get Started | NVIDIA Developer" page.
/opt/nvidia/nsight-systems/2024.4.1/host-linux-x64# ./QdstrmImporter --version
src/tcmalloc.cc:332] Attempt to free invalid pointer 0x44f778dd500
Aborted (core dumped)
I updated to version 2024.5.1, and QdstrmImporter still aborts:
root@ffb12767d3aa:/workspace/work# /opt/nvidia/nsight-systems-cli/2024.5.1/host-linux-x64/QdstrmImporter
src/tcmalloc.cc:332] Attempt to free invalid pointer 0x383a60dd4d0
Aborted (core dumped)
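To check whether the importer itself crashes independently of any profile, a minimal sketch like the following could run `QdstrmImporter --version` (the same invocation as above) and report how it exited. The helper names are hypothetical, and the path in the comment is the one from my container; on Linux a negative return code from `subprocess` means the process was killed by that signal.

```python
import signal
import subprocess

def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # Negative value: terminated by signal number -returncode (POSIX).
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"

def check_importer(importer_path: str) -> str:
    """Run '<importer> --version' and describe the result."""
    result = subprocess.run(
        [importer_path, "--version"], capture_output=True, text=True
    )
    return describe_exit(result.returncode)

# Example (path from my container, adjust as needed):
# check_importer("/opt/nvidia/nsight-systems-cli/2024.5.1/host-linux-x64/QdstrmImporter")
```

An abort like the one above would be expected to show up as `killed by SIGABRT`.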