CUPTI has insufficient privileges

When I updated my system from CUDA 10.1/cuDNN 7.3.1 to CUDA 11.0/cuDNN 8.0.1, CUPTI started reporting a problem.

I followed NVIDIA's CUPTI guideline to add CAP_SYS_ADMIN. For instance, I ran my test application with the following command.

$ python alexnet_classifying.py --cap-add=CAP_SYS_ADMIN

My application prints the AlexNet model and runs its iterations correctly. However, it still emits the following error messages.

CUPTI_ERROR_INSUFFICIENT_PRIVILEGES

CUPTI_ERROR_INVALID_PARAMETER

CUPTI_ERROR_NOT_INITIALIZED

I tried the following methods but could not get rid of the error messages.

1. Add the following lines to ~/.bashrc

LD_LIBRARY_PATH=:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64

LD_INCLUDE_PATH=:/usr/local/cuda/include:/usr/local/cuda/extras/CUPTI/include

2. Add the parameter.

I want to add the parameter

NVreg_RestrictProfilingToAdminUsers=0

to the designated file:

/etc/modprobe.d/nvidia-kernel-common.conf

However, there is no nvidia-kernel-common.conf file in the modprobe.d directory.
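Before editing any modprobe.d file, it can help to see what the loaded driver currently reports. The sketch below assumes the NVIDIA kernel module exposes its runtime parameters in /proc/driver/nvidia/params and that the relevant key is RmProfilingAdminOnly; both names are assumptions here and may differ between driver versions.

```python
from pathlib import Path

# Assumed location of the NVIDIA driver's runtime parameters.
PARAMS_FILE = Path("/proc/driver/nvidia/params")


def profiling_admin_only(params_text):
    """Return True/False for RmProfilingAdminOnly, or None if the key is absent."""
    for line in params_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "RmProfilingAdminOnly":
            return value.strip() == "1"
    return None


if __name__ == "__main__":
    if PARAMS_FILE.exists():
        state = profiling_admin_only(PARAMS_FILE.read_text())
        print("profiling restricted to admin users:", state)
    else:
        print(PARAMS_FILE, "not found; is the NVIDIA kernel module loaded?")
```

If this prints True, profiling is still restricted to admin users and the module option has not taken effect.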

How can I get rid of the CUPTI error messages?

I would appreciate an early answer.

Hi,

Do you have root access to the system? As an easy solution, you can launch the application with root privileges.

As for NVreg_RestrictProfilingToAdminUsers, you can create a .conf file under /etc/modprobe.d. Detailed steps are listed in a forum comment:

Thanks. I tried the following two options, but the error messages persist when I run the script on my local Linux machine.

Option One.

1. Create profile.conf

2. Add the parameter

Add
NVreg_RestrictProfilingToAdminUsers=0
to
/etc/modprobe.d/profile.conf

3. Update initramfs

$ sudo update-initramfs -u

libkmod: ERROR …/libkmod/libkmod-config.c:656 kmod_config_parse: /etc/modprobe.d/profile.conf line 1: ignoring bad line starting with ‘NVreg_RestrictProfilingToAdminUsers=0’

4. Export LD_LIBRARY_PATH

Since there is no lib64 under the CUPTI directory, the following measure has no effect.

Add
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.0/extras/CUPTI/lib64
to
~/.bashrc

5. Reboot

$ sudo reboot

Error messages:

2020-07-29 23:37:48.267567: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1408] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-07-29 23:37:48.269002: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1447] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
2020-07-29 23:37:48.269106: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1430] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI_ERROR_INVALID_PARAMETER
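The libkmod error in step 3 points at the line format rather than at the parameter itself: a modprobe.d line is expected to begin with a command keyword (for module parameters, `options <module> ...`). The following is a minimal sketch of a pre-check, not a replacement for the real parser; the keyword set is taken from the modprobe.d(5) man page as I recall it, and the function name is hypothetical.

```python
# Command keywords that may start a modprobe.d line, per modprobe.d(5).
KNOWN_COMMANDS = {"alias", "blacklist", "install", "options", "remove", "softdep"}


def bad_lines(conf_text):
    """Return (line_number, text) for lines that do not start with a known command."""
    bad = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comments are fine.
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.split()[0] not in KNOWN_COMMANDS:
            bad.append((number, stripped))
    return bad
```

Run against the one-line profile.conf from step 2, this flags line 1, which matches the "ignoring bad line" message from update-initramfs.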

Option Two

If I run the script as root via sudo, I get the following error.

$ sudo -i

root@mike:~# cd /home/mike/Documents/Alexnet_Callback
root@mike:/home/mike/Documents/Alexnet_Callback# python alexnet_classify.py
Traceback (most recent call last):
  File "alexnet_classify.py", line 86, in <module>
    import numpy as np
ImportError: No module named numpy
root@mike:/home/mike/Documents/Alexnet_Callback#

I checked that numpy is installed correctly, but the error above still occurs.

Please help find a solution.
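One possible explanation for the ImportError under root, which the thread does not confirm, is that the root shell resolves `python` to a different interpreter than the user account does. The unquoted form "No module named numpy" is how Python 2 words this error, so root may be getting a system Python 2. A small sketch to compare the two environments; `module_report` is a hypothetical helper name:

```python
import importlib.util
import sys


def module_report(name):
    """Describe which interpreter is running and where it finds a module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        # None means this interpreter cannot import the module.
        "location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    print(module_report("numpy"))
```

Running it once as the normal user and once in the root shell (with the interpreter that works for the user, for example python3) shows whether the two sessions use the same interpreter and the same numpy installation.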

I found a way to solve the libkmod: ERROR… Please have a look at the following steps.

1. Export the environment variable

1). Check the paths of both CUPTI and Debugger

Because I learned that CUPTI has a debugger mode, I checked both the CUPTI and Debugger directories. The CUPTI directory contains docs and samples, whereas the Debugger directory contains include and lib64.

/usr/local/cuda-11.0/extras/CUPTI/docs
/usr/local/cuda-11.0/extras/CUPTI/samples

/usr/local/cuda-11.0/extras/Debugger/include
/usr/local/cuda-11.0/extras/Debugger/lib64

2). Open bashrc

$ sudo gedit ~/.bashrc

3). Export the environment variables

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.0/extras/Debugger/lib64
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.0/extras/Debugger/include
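Since the listing above shows no lib64 under extras/CUPTI, it may be more reliable to search for the CUPTI shared library than to guess which directory belongs in LD_LIBRARY_PATH. This sketch assumes the library file name starts with libcupti.so and uses the install prefix from this thread; `find_cupti_libraries` is a hypothetical helper.

```python
from pathlib import Path


def find_cupti_libraries(cuda_root):
    """Return every libcupti shared object found below cuda_root."""
    root = Path(cuda_root)
    if not root.is_dir():
        return []
    return sorted(str(path) for path in root.rglob("libcupti.so*"))


if __name__ == "__main__":
    # /usr/local/cuda-11.0 is the install prefix used in this thread.
    for lib in find_cupti_libraries("/usr/local/cuda-11.0"):
        print(lib)
```

The parent directory of whatever this prints is the candidate to append to LD_LIBRARY_PATH.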

2. Create profile.conf with the parameter and update initramfs

1). Create profile.conf in the modprobe.d directory

Use a text editor to create profile.conf and save it under /etc/modprobe.d

2). Open the profile.conf

$ sudo gedit /etc/modprobe.d/profile.conf

3). Add the parameter

options nvidia "NVreg_RestrictProfilingToAdminUsers=0"

3). Close profile.conf

4). Update initramfs

$ sudo update-initramfs -u

update-initramfs: Generating /boot/initrd.img-5.4.0-42-generic
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8125a-3.fw for module r8169

5). Check the missing firmware

The following link lists the available firmware files.

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/rtl_nic/

6). Download the missing firmware

cd /lib/firmware/rtl_nic/
sudo wget https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/plain/rtl_nic/rtl8125a-3.fw

7). Update initramfs again

$ sudo update-initramfs -u

update-initramfs: Generating /boot/initrd.img-5.4.0-42-generic
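A note on the firmware download in step 6: on a cgit site such as git.kernel.org, a URL containing /tree/ serves an HTML view of the file, while /plain/ serves the raw bytes, so it can be worth confirming that the saved .fw file is not a web page. A minimal sketch of such a check; the helper name and the 512-byte window are arbitrary choices for illustration:

```python
def looks_like_html(data):
    """Heuristically detect an HTML page saved in place of a binary file."""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


if __name__ == "__main__":
    path = "/lib/firmware/rtl_nic/rtl8125a-3.fw"  # file from step 6
    with open(path, "rb") as handle:
        print("HTML instead of firmware:", looks_like_html(handle.read(512)))
```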

It is a little complex, but it effectively solves the problem.

Cheers!