DLProf error when running profile

Hello,

I am trying to run DLProf on a deep learning Python script that uses PyTorch. The following errors show up and I can't fix them. I also can't find any references to this issue on any blog.

undefined symbol: __libc_dlclose, version GLIBC_PRIVATE
Floating point exception
[DLProf-17:08:33] System call failed to run: nsys profile -t cuda,nvtx -s none --show-output=true --force-overwrite=true --export=sqlite -o ./nsys_profile python train.py
[DLProf-17:08:33] Exited with code: 34816
[DLProf-17:08:33] Error Occurred:
[DLProf-17:08:33] System call failed

cat /etc/*release

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"

nvidia-smi
Fri Aug 25 13:17:53 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 531.41       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti      On | 00000000:01:00.0  On |                  N/A |
|  0%   54C    P8               18W / 200W|    746MiB /  8192MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        20      G   /Xwayland                                 N/A      |
|    0   N/A  N/A        20      G   /Xwayland                                 N/A      |
|    0   N/A  N/A        23      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+

I think this is caused by an old version of nsys (if you installed DLProf with pip, you probably got the nvidia-nsys-cli package from 2021). The solution is described here (GPU related information missing when using nsys profile).

Basically, install a new version of nsys (CLI only), then make sure the 'nsys' command refers to the new version and not the old one (check with 'which nsys' and 'nsys --version').
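
If it helps, the same check can be run from Python, for example from inside the environment where DLProf is installed, so you see exactly which binary that environment picks up. This is only a rough sketch: the helper name describe_nsys is made up for illustration, and it assumes nothing about nsys beyond the '--version' flag mentioned above.

```python
import shutil
import subprocess


def describe_nsys(command="nsys"):
    """Return (path, version_text) for `command`, or None if it is not on PATH.

    `describe_nsys` is a hypothetical helper, not part of DLProf or Nsight Systems.
    """
    # shutil.which performs the same PATH lookup as `which nsys` in the shell.
    path = shutil.which(command)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return path, f"could not run: {exc}"
    # Report whatever the binary printed, whether on stdout or stderr.
    return path, (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    info = describe_nsys()
    if info is None:
        print("nsys is not on PATH")
    else:
        print("nsys resolves to:", info[0])
        print("reported version:", info[1])
```

If the printed path still points into the pip-installed package directory, the old nsys is the one being found first, and PATH needs to be reordered so the new install comes earlier.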