pgprof - Unable to locate cuda driver library

Hi,

I’m trying to run a simple program from the OpenACC tutorials under pgprof to analyze where its compute time goes.

However, I get the error “Unable to locate cuda driver library - GPU profiling skipped”. I double-checked the directory that is being used and it looks fine: “/opt/pgi/linux86-64/2017/cuda/9.0/bin/”.

How can I solve this? Does this mean CUDA is badly installed?

[edit] If I go to my CUDA installation directory “/opt/pgi/linux86-64/2017/cuda/9.0/bin” and run ./nvcc -V, I get this:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

So it seems to be installed correctly.

Hi Caap,

You need to install the CUDA device driver in order to run code on your device. “nvcc” is the CUDA compiler driver, not to be confused with the device driver.

You can download the device driver at: http://www.nvidia.com/Download/index.aspx
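A quick way to tell whether the device driver is present is to look for its library, libcuda.so, which ships with the NVIDIA driver rather than with the CUDA toolkit. The following is only a minimal sketch: the function names has_libcuda and driver_status are made up for illustration, and it assumes a Linux system where the driver registers libcuda.so in the dynamic linker cache.

```shell
#!/bin/sh
# Hypothetical helper: succeed if libcuda.so is listed in the linker cache.
has_libcuda() {
    # ldconfig often lives in /sbin or /usr/sbin, which may not be on a
    # normal user's PATH, so extend PATH inside a subshell only.
    ( PATH="$PATH:/sbin:/usr/sbin"; ldconfig -p ) 2>/dev/null | grep -q 'libcuda\.so'
}

# Hypothetical helper: print a one-line summary of the check.
driver_status() {
    if has_libcuda; then
        echo "libcuda found"
    else
        echo "libcuda not found"
    fi
}

driver_status
```

If this prints "libcuda not found" while nvcc works, the toolkit is installed but the device driver most likely is not.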

-Mat

Hey Mat,

I’ve already downloaded it. What’s the command to install the driver? (Sorry for the basic questions, but I don’t have much experience with Linux.)

You should be able to just run it. Though you’ll need to be root.

Here’s the driver installation guide:
https://us.download.nvidia.com/XFree86/Linux-x86_64/390.42/README/index.html
https://us.download.nvidia.com/XFree86/Linux-x86_64/390.42/README/installdriver.html

From the install guide:

Starting the Installer
After you have downloaded the file NVIDIA-Linux-x86_64-390.42.run, change to the directory containing the downloaded file, and as the root user run the executable:

cd yourdirectory

sh NVIDIA-Linux-x86_64-390.42.run

The .run file is a self-extracting archive. When executed, it extracts the contents of the archive and runs the contained nvidia-installer utility, which provides an interactive interface to walk you through the installation.

Hey Mat,

Reinstalling CUDA and the drivers did the trick. I can now run on my GPU, and pgaccelinfo is working.

However, pgprof still doesn’t work. Now it gives me this error:

unified memory profiling failed.

Any idea what this might be?

[EDIT]

I saw that running the profiler with root privileges solves the problem, but when I run “sudo pgprof” I get “pgprof command not found”.

Hi Caap,

There was a problem with Unified Memory profiling in CUDA 8.0 on some devices, but this should have only given you a warning, not an error. You can try using CUDA 9.0 or 9.1 instead (i.e. use -ta=tesla:cuda9.0, -ta=tesla:cuda9.1, -Mcuda=cuda9.0, or -Mcuda=cuda9.1).

However, since the problem goes away when running as root, it could be a permission problem with the CUDA driver. If so, then I’m not sure how to fix this.

Alternatively, you can try disabling unified memory profiling via the option:

pgprof --unified-memory-profiling=off a.out



Caap wrote: I saw that running the profiler with root privileges solves the problem, but when I run “sudo pgprof” I get “pgprof command not found”.

I’m assuming that when you ran as root, you had set PATH to include the PGI installation. With “sudo”, the default root environment is used, so unless root’s shell configuration file adds the PGI path, it won’t be able to find pgprof.
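One way around this, sketched here under the assumption that pgprof is already on your own user’s PATH, is to resolve the command before handing it to sudo. The helper name resolve_tool is hypothetical, and the sudo lines are left as comments so that nothing runs as root until you choose to.

```shell
#!/bin/sh
# Hypothetical helper: print how the current shell resolves a command name
# (an absolute path for an executable found on PATH); fail if it is unknown.
resolve_tool() {
    command -v "$1"
}

# Usage (not executed here): run pgprof as root via its resolved path.
#   sudo "$(resolve_tool pgprof)" ./a.out
#
# Alternative: pass your current PATH through to this one command.
#   sudo env "PATH=$PATH" pgprof ./a.out
```

Either form avoids editing root’s shell configuration. Typing the full path to pgprof by hand has the same effect.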

-Mat

I solved this by always running with root permissions and giving the full path:

sudo /opt/pgi/linux86-64/17.10/bin/pgprof