run on K40

chriaa.intissar · May 15, 2018, 1:21pm

We have an HPE server equipped with two NVIDIA K40 accelerators, used for parallelization with OpenACC and Fortran.
Compilation succeeds: pgfortran -Minfo=accel -ta=nvidia vecadd.f90 -o test.out
But at execution I get this error message:
./test.out
Current file: /home/admin/intissar/vecadd.f90
function: main
line: 28
Current region was compiled for:
NVIDIA Tesla GPU sm30 sm35 sm30 sm35 sm50
Available accelerators:
device[1]: Native X86
The accelerator does not match the profile for which this program was compiled

Can you help me?

generix · May 15, 2018, 1:48pm

What’s the output of deviceQuery from the cuda demos (/opt/cuda/extras/demo_suite/deviceQuery or where you installed it)?
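Alongside deviceQuery, it can help to confirm that the driver itself sees the two K40 boards, since the runtime message lists only `Native X86` as an available accelerator. The sketch below is one way to run those checks. It assumes `lspci`, `nvidia-smi` (shipped with the NVIDIA driver) and `pgaccelinfo` (shipped with the PGI compilers) may or may not be on the `PATH`; `check_gpu_visibility` is a hypothetical helper name.

```shell
# Hypothetical helper: run each GPU-related check that is available on this machine.
check_gpu_visibility() {
    # 1. Is the hardware visible on the PCI bus?
    if command -v lspci >/dev/null 2>&1; then
        lspci | grep -i nvidia || echo "lspci: no NVIDIA device listed"
    else
        echo "lspci: not found"
    fi

    # 2. Is the NVIDIA kernel module loaded? (This file exists only when it is.)
    if [ -r /proc/driver/nvidia/version ]; then
        cat /proc/driver/nvidia/version
    else
        echo "nvidia kernel module: not loaded"
    fi

    # 3. Does the driver enumerate the GPUs?
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L || echo "nvidia-smi: could not talk to the driver"
    else
        echo "nvidia-smi: not found"
    fi

    # 4. What does the PGI runtime itself detect?
    if command -v pgaccelinfo >/dev/null 2>&1; then
        pgaccelinfo || echo "pgaccelinfo: exited with an error"
    else
        echo "pgaccelinfo: not found"
    fi
}

check_gpu_visibility
```

If step 1 lists the boards but steps 2 and 3 fail, the problem is likely on the driver side and not in the compiled program.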

chriaa.intissar · May 15, 2018, 1:53pm

I can’t find CUDA.

generix · May 15, 2018, 2:02pm

It should be installed; otherwise I wouldn’t know how you succeeded in compiling your program.
Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post will reveal a paperclip icon.

chriaa.intissar · May 15, 2018, 2:15pm

Sorry,
How can I attach the report?

generix · May 15, 2018, 2:41pm

Hovering the mouse over an existing post will reveal a paperclip icon.

chriaa.intissar · May 15, 2018, 2:46pm

I just attached the report
nvidia-bug-report.log.gz (96.7 KB)

generix · May 15, 2018, 3:13pm

You are running a 384.111 nvidia driver on a RHEL 7.5 system. So the maximum cuda version should be 9.0.
If you installed it via rpm, it should reside in /usr/local, there should be a directory cuda-9.0, please check.
In directory cuda-9.0, there’s a directory samples/1_Utilities which should contain deviceQuery. If so, please run that and post the output.
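The reasoning above (a 384.x driver caps the usable toolkit at CUDA 9.0) can be sketched as a small lookup. The minimum driver versions in the table are assumed from NVIDIA's CUDA release notes for Linux x86_64 and should be verified there; `version_ge` and `max_cuda_for_driver` are hypothetical helper names.

```shell
# Hypothetical helper: succeed if version $1 >= version $2 (compares major.minor only).
version_ge() {
    a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
    b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
    if [ "$a_major" -ne "$b_major" ]; then
        [ "$a_major" -gt "$b_major" ]
    else
        [ "$a_minor" -ge "$b_minor" ]
    fi
}

# Hypothetical helper: print the newest CUDA release whose minimum driver is satisfied.
# Table entries are "CUDA version:minimum Linux driver" (assumed values, verify
# against the release notes of the toolkit you plan to install).
max_cuda_for_driver() {
    driver="$1"
    best="none"
    for pair in "8.0:375.26" "9.0:384.81" "9.1:390.46" "9.2:396.26"; do
        cuda="${pair%%:*}"
        min="${pair##*:}"
        if version_ge "$driver" "$min"; then
            best="$cuda"
        fi
    done
    echo "$best"
}

max_cuda_for_driver 384.111   # prints 9.0
```

On a live system the driver version could be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to the function.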

chriaa.intissar · May 16, 2018, 9:19am

Thanks for your help
There are many directories that contain deviceQuery. Which one should I run?

generix · May 16, 2018, 12:01pm

Use that from the CUDA-Fortran/SDK/deviceQuery directory. You’ll have to run ‘make’ there first to compile the binary.

chriaa.intissar · May 17, 2018, 9:35am

[root@srv-app ~]# cd /opt/pgi/linux86-64/2017/examples/CUDA-Fortran/SDK/deviceQuery/
[root@srv-app deviceQuery]# make
pgfortran -fast -o deviceQuery.out deviceQuery.cuf
make: pgfortran: command not found
make: *** [build] Error 127

generix · May 17, 2018, 9:37am

Run it as a normal user, not root.
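Exit status 127 from `make` means the shell could not find `pgfortran`, which suggests root's `PATH` does not include the PGI compilers while the regular account's does. A minimal sketch for checking this from any account, where `have_tool` is a hypothetical helper name:

```shell
# Hypothetical helper: report whether a tool is reachable through the current PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not on PATH for user $(id -un)"
        return 1
    fi
}

have_tool pgfortran || echo "log in as the account that has the PGI environment set up"
```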

chriaa.intissar · May 17, 2018, 9:41am

[admin@srv-app deviceQuery]$ make
pgfortran -fast -o deviceQuery.out deviceQuery.cuf
/usr/bin/ld: cannot open output file deviceQuery.out: Permission denied
pgacclnk: child process exit status 1: /usr/bin/ld
make: *** [build] Error 2

generix · May 17, 2018, 9:50am

Adjust the directory permissions and delete deviceQuery.out, which might have been generated previously when you ran as root.

chriaa.intissar · May 17, 2018, 10:00am

The .out file does not exist
[admin@srv-app deviceQuery]$ ls -a -l
total 16
drwxrwxr-x.  2 root root   43 Oct 30  2017 .
drwxrwxr-x. 11 root root 4096 Oct 30  2017 ..
-r--r--r--.  1 root root 5056 Oct 30  2017 deviceQuery.cuf
-r--r--r--.  1 root root  974 Oct 30  2017 Makefile

generix · May 17, 2018, 10:05am

Copy both files to a directory where your user has write permission, then run make again.
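The copy step can be sketched as follows. `copy_sample` is a hypothetical helper name, and the destination directory in the usage comment is an assumption; the source path is the one used earlier in this thread.

```shell
# Hypothetical helper: copy the sample into a directory the current user owns.
# $1 = read-only sample directory, $2 = writable destination directory.
copy_sample() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    cp "$src/deviceQuery.cuf" "$src/Makefile" "$dest/" || return 1
    # The originals are read-only, so make the copies writable for their owner.
    chmod u+w "$dest/deviceQuery.cuf" "$dest/Makefile"
}

# Usage (destination name is an assumption):
# copy_sample /opt/pgi/linux86-64/2017/examples/CUDA-Fortran/SDK/deviceQuery "$HOME/deviceQuery"
# cd "$HOME/deviceQuery" && make
```

Building in the copy leaves the PGI installation tree untouched, so its permissions do not need to be changed.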

chriaa.intissar · May 17, 2018, 10:14am

[admin@srv-app ~]$ make
pgfortran -fast -o deviceQuery.out deviceQuery.cuf
./deviceQuery.out
cudaGetDeviceCount failed -- CUDA driver and runtime may be mismatched
FORTRAN STOP

generix · May 17, 2018, 10:38am

Did you install CUDA at all? If not, please do so.
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=RHEL&target_version=7&target_type=rpmlocal
Version 9.2 requires using a newer driver.

generix · May 17, 2018, 10:40am

CUDA 9.0, which matches your 384 driver, can be downloaded here:
https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=RHEL&target_version=7&target_type=rpmlocal

chriaa.intissar · May 17, 2018, 1:07pm

Both are installed
[admin@srv-app ~]$ sudo rpm -i cuda-repo-rhel7-9-2-local-9.2.88-1.x86_64.rpm
warning: cuda-repo-rhel7-9-2-local-9.2.88-1.x86_64.rpm: Header V3 RSA/SHA512 Signature, key ID 7fa2af80: NOKEY
package cuda-repo-rhel7-9-2-local-9.2.88-1.x86_64 is already installed

[admin@srv-app ~]$ sudo rpm -i cuda-repo-rhel7-9-0-local-9.0.176-1.x86_64.rpm
[sudo] password for admin:
warning: cuda-repo-rhel7-9-0-local-9.0.176-1.x86_64.rpm: Header V3 RSA/SHA512 Signature, key ID 7fa2af80: NOKEY
package cuda-repo-rhel7-9-0-local-9.0.176-1.x86_64 is already installed
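
One thing worth checking at this point: the `cuda-repo-rhel7-*-local` RPMs only register a local package repository, and the toolkit itself is installed in a second step (for the rpm-local installer, NVIDIA's instructions follow the `rpm -i` with `sudo yum clean all` and `sudo yum install cuda`). As noted earlier in the thread, an installed toolkit should show up as a `cuda-9.0` style directory under `/usr/local`. A minimal sketch of that check, where `list_cuda_toolkits` is a hypothetical helper name:

```shell
# Hypothetical helper: list cuda-* toolkit directories under a prefix (default /usr/local).
list_cuda_toolkits() {
    prefix="${1:-/usr/local}"
    found=0
    for dir in "$prefix"/cuda-*; do
        # When the glob matches nothing it stays literal, so test for a real directory.
        if [ -d "$dir" ]; then
            echo "$dir"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no cuda-* directory under $prefix"
        return 1
    fi
}

list_cuda_toolkits || echo "toolkit packages do not appear to be installed yet"
```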
