Error while running a small program in nsight compute

From the Nsight Compute API trace (ID 525, API name vector_add):

Error: ERR_NVGPUCTRPERM - The user does not have permission to access NVIDIA GPU Performance Counters on the target device. For instructions on enabling permissions and to get more information see https://developer.nvidia.com/ERR_NVGPUCTRPERM

I am using a remote Linux desktop that is shared by many users, so I cannot reboot it to get permissions for the GPU counters. Please provide the steps to get permission to profile GPU performance counters without rebooting or causing issues for other users.
Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-73-generic x86_64)
NVIDIA-SMI 530.30.02
Driver Version: 530.30.02
CUDA Version: 12.1

Thanks and regards
Ramya Sri

As detailed on the website you reference, you either have to become root on that machine to profile, or the nvidia kernel modules have to be loaded with the proper flags (which can be specified in a modprobe config file or on the command line when loading them). Loading kernel modules also requires root permissions. It is possible to unload and reload the kernel modules without rebooting, but not while other processes are accessing them. There are no other alternatives; you should reach out to that system’s administrator to facilitate this configuration change.
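For reference, the non-root configuration that the ERR_NVGPUCTRPERM page describes comes down to loading the nvidia module with `NVreg_RestrictProfilingToAdminUsers=0`. A sketch, assuming an Ubuntu-style system (the config file name `nvidia-profiling.conf` is arbitrary):

```shell
# Allow all users to read GPU performance counters (persists across reboots).
echo 'options nvidia NVreg_RestrictProfilingToAdminUsers=0' | \
    sudo tee /etc/modprobe.d/nvidia-profiling.conf
sudo update-initramfs -u   # Ubuntu: propagate the option into the initramfs

# To apply it without a reboot, the modules must be unloaded and reloaded --
# which only succeeds once no process is using them:
sudo modprobe -r nvidia_uvm nvidia_drm nvidia_modeset nvidia
sudo modprobe nvidia
```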

I have sudo privileges, but there are some processes using the NVIDIA devices.

sudo lsof /dev/nvidia*

Output of the above command:

lsof: WARNING: can’t stat() fuse.gvfsd-fuse file system /run/user/1006/gvfs
Output information may be incomplete.
lsof: WARNING: can’t stat() fuse.gvfsd-fuse file system /run/user/1001/gvfs
Output information may be incomplete.
lsof: WARNING: can’t stat() fuse.gvfsd-fuse file system /run/user/1010/gvfs
Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nvidia-pe 2014 root 2u CHR 195,255 0t0 817 /dev/nvidiactl
nvidia-pe 2014 root 3u CHR 195,0 0t0 818 /dev/nvidia0
nvidia-pe 2014 root 5u CHR 195,0 0t0 818 /dev/nvidia0
nvidia-pe 2014 root 6u CHR 195,0 0t0 818 /dev/nvidia0
nvidia-pe 2014 root 7u CHR 195,254 0t0 821 /dev/nvidia-modeset
nvidia-pe 2014 root 8u CHR 195,0 0t0 818 /dev/nvidia0
python3 127794 root mem CHR 195,255 817 /dev/nvidiactl
python3 127794 root mem CHR 195,0 818 /dev/nvidia0
python3 127794 root mem CHR 506,0 819 /dev/nvidia-uvm
python3 127794 root 58u CHR 195,255 0t0 817 /dev/nvidiactl
python3 127794 root 59u CHR 506,0 0t0 819 /dev/nvidia-uvm
python3 127794 root 60u CHR 195,0 0t0 818 /dev/nvidia0
python3 127794 root 61u CHR 195,0 0t0 818 /dev/nvidia0
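For what it's worth, nvidia-persistenced is usually managed by systemd and can be stopped and restarted around a module reload; the python3 process (PID 127794 above) belongs to another workload and would have to finish or be terminated by its owner first. A sketch:

```shell
sudo systemctl stop nvidia-persistenced  # frees /dev/nvidia* handles held by the daemon
sudo lsof /dev/nvidia*                   # re-check: must be empty before unloading modules
# ...unload and reload the nvidia modules here...
sudo systemctl start nvidia-persistenced
```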

If you have sudo privileges, you can either profile directly on the remote system with sudo ncu ... and create a report file to open locally, or launch on the remote system with sudo ncu --mode launch ... and then attach from your local system.
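Both workflows, sketched with the vector_add binary from your trace (the paths and report name are assumptions):

```shell
# Option 1: write a report on the remote machine, copy it to your local
# machine, and open it in the Nsight Compute GUI.
sudo ncu -o vector_add_report ./vector_add   # produces vector_add_report.ncu-rep

# Option 2: start the target under the profiler on the remote machine,
# then attach to it from the GUI on your local machine.
sudo ncu --mode launch ./vector_add
```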


sudo: ncu: command not found.

Previously, I used Nsight Compute 2023.11 on my local system and connected remotely. It is not working with the command.
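A common cause of "sudo: ncu: command not found" is that sudo resets PATH, so the CUDA bin directory is not searched. A sketch of two workarounds, assuming the default CUDA 12.1 install location:

```shell
# Invoke ncu by its full path (install location is an assumption)...
sudo /usr/local/cuda-12.1/bin/ncu --version
# ...or keep the calling user's PATH for the sudo invocation.
sudo env "PATH=$PATH" ncu --version
```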

Thank you. It is working with sudo ncu…

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.