Problems installing Quantum ESPRESSO with GPU acceleration

I am trying to install the latest version of Quantum ESPRESSO (6.8) with GPU
support on Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-135-generic x86_64).

System Configuration:
Processor: Intel Xeon Gold 5120 CPU @ 2.20 GHz (2 Processor)
RAM: 96 GB
Graphics Card: NVIDIA Quadro P5000 (16 GB)

Following the steps given in the q-e-gpu wiki (Home · Wiki · QEF - Quantum Espresso Foundation / q-e-gpu · GitLab), I
installed all the required packages (CUDA Toolkit v8+, PGI Compilers v17.10+,
OpenMP package v3+) and tried configuring with:

./configure --with-cuda="/opt/nvidia/hpc_sdk/Linux_x86_64/21.7/cuda/11.4/" --with-cuda-runtime=11.4 --with-cuda-cc=6.1 --enable-openmp --with-scalapack=no

It configures successfully, but make all fails.

The build works fine until it reaches pw.x. This is the error I am getting:

make[1]: Entering directory '/home/anson/qe/qe-6.8/PW'
( cd src ; make all || exit 1 )
make[2]: Entering directory '/home/anson/qe/qe-6.8/PW/src'
if test -n "" ; then \
( cd ../.. ; make  || exit 1 ) ; fi
mpif90 -mp -cuda -gpu=cc6.1,cuda11.4 -o pw.x \
   pwscf.o  libpw.a ../../Modules/libqemod.a ../../KS_Solvers/libks_solvers.a ../../upflib/libupf.a ../../XClib/xc_lib.a ../../FFTXlib/libqefft.a ../../LAXlib/libqela.a ../../UtilXlib/libutil.a ../../dft-d3/libdftd3qe.a /home/anson/qe/qe-6.8//clib/clib.a /home/anson/qe/qe-6.8//MBD/libmbd.a  -cudalib=cufft,cublas,cusolver /home/anson/qe/qe-6.8//external/devxlib/src/libdevXlib.a /home/anson/qe/qe-6.8//EIGENSOLVER_GPU/lib_eigsolve/lib_eigsolve.a -L/home/anson/qe/qe-6.8//external/devxlib/src -ldevXlib  -L/usr/local/lib -llapack  -lblas  -L/home/anson/qe/qe-6.8//FoX/lib  -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys   -lblas
../../Modules/libqemod.a(random_numbers_gpu.o): In function `random_numbers_gpum_randy_vect_gpu_':
/home/anson/qe/qe-6.8/Modules/random_numbers_gpu.f90:67: undefined reference to `curandDestroyGenerator'
/home/anson/qe/qe-6.8/Modules/random_numbers_gpu.f90:68: undefined reference to `curandCreateGenerator'
/home/anson/qe/qe-6.8/Modules/random_numbers_gpu.f90:69: undefined reference to `curandSetPseudoRandomGeneratorSeed'
/home/anson/qe/qe-6.8/Modules/random_numbers_gpu.f90:73: undefined reference to `curandGenerateUniformDouble'
pgacclnk: child process exit status 1: /usr/bin/ld
Makefile:315: recipe for target 'pw.x' failed
make[2]: *** [pw.x] Error 2
make[2]: Leaving directory '/home/anson/qe/qe-6.8/PW/src'
Makefile:9: recipe for target 'pw' failed
make[1]: *** [pw] Error 1
make[1]: Leaving directory '/home/anson/qe/qe-6.8/PW'
Makefile:70: recipe for target 'pw' failed
make: *** [pw] Error 1
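
Reading the log above: all four undefined symbols start with curand, while the link line passes only -cudalib=cufft,cublas,cusolver. The sketch below checks a flag string for a given library and builds a candidate replacement. It is a guess at the cause, not a fix confirmed in this thread: the helper names has_cudalib and add_cudalib are made up for illustration, and where the QE build actually sets this flag is not shown in the log.

```shell
#!/bin/sh
# Library flags copied from the failing link line in the log.
link_flags='-cudalib=cufft,cublas,cusolver'

# Print "yes" if the -cudalib list in $1 names library $2, else "no".
has_cudalib() {
    case ",${1#-cudalib=}," in
        *",$2,"*) echo yes ;;
        *)        echo no ;;
    esac
}

# Append library $2 to the -cudalib list in $1 unless it is already there.
add_cudalib() {
    if [ "$(has_cudalib "$1" "$2")" = yes ]; then
        echo "$1"
    else
        echo "$1,$2"
    fi
}

echo "curand requested: $(has_cudalib "$link_flags" curand)"
echo "candidate flags:  $(add_cudalib "$link_flags" curand)"
```

If curand is indeed missing, the candidate flag string would have to replace the original one wherever the build defines it before re-running make pw.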

Any suggestions to solve this would be greatly appreciated.

Thank you.

My guess is that you haven’t properly followed the post-install steps that the HPC SDK requires. Specifically, after the install you are prompted to set up your environment with something like this (example for 21.9, not 21.7):

MANPATH=$MANPATH:/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/man; export MANPATH

PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/bin:$PATH; export PATH

export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/comm_libs/mpi/bin:$PATH

export MANPATH=$MANPATH:/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/comm_libs/mpi/man

export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/comm_libs/openmpi4/openmpi-4.0.5/lib/:/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/compilers/lib/

It’s evident that configure depends on that PATH, so it is important. In addition, when I tried this, I had trouble during configure using the CUDA install from the 21.9 HPC SDK. I suspect that the QE configure process is not really up-to-date with the latest HPC SDK method of installing the (former PGI) compilers.
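
Since configure depends on PATH, it can help to confirm which tools the shell actually resolves before running it. A minimal sketch follows, with assumptions: check_tool is a hypothetical helper name, mpif90 and nvcc are the tools named in this thread, and nvfortran is included on the assumption that it is the Fortran compiler the HPC SDK puts on PATH.

```shell
#!/bin/sh
# Report where a tool resolves on PATH; return 1 if it is not found.
check_tool() {
    if path=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $path"
    else
        echo "$1 -> NOT FOUND on PATH"
        return 1
    fi
}

# mpif90 appears in the failing link line; nvcc is the CUDA compiler.
# nvfortran is an assumption about what the HPC SDK provides.
missing=0
for tool in nvfortran mpif90 nvcc; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "missing tools: $missing"
```

Each resolved path should point into the HPC SDK or CUDA install you intend to use; a path from some other install suggests the exports above did not take effect in the current shell.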

Here is the recipe I used:

  1. Install the CUDA 11.4 toolkit in the usual location (/usr/local/cuda-11.4/ with symlink). This also provides the GPU driver install.

  2. Install the 21.9 HPC SDK that bundles CUDA 11.4 only. I used the tarfile/install method. Note the path setup above.

  3. Adjust your path to point to the nvcc compiler here:

    export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/21.9/cuda/11.4/bin:$PATH

  4. Run configure with this CUDA install:

    ./configure --with-cuda="/usr/local/cuda/" --with-cuda-runtime=11.4 --with-cuda-cc=6.1 --enable-openmp --with-scalapack=no

  5. Run make pw.

There are probably variations on the above that will also work. This is one example that worked for me.
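
The recipe passes /usr/local/cuda/ to configure, which only works if that path exists, normally as a symlink to the versioned directory. A small sketch for checking this is below; describe_cuda_path is a hypothetical helper, and the commented-out ln command is an assumption about how a missing link could be created by hand, not a step taken from this thread.

```shell
#!/bin/sh
# Describe a CUDA install path: a symlink (with its target), a directory, or missing.
describe_cuda_path() {
    if [ -L "$1" ]; then
        echo "$1 is a symlink to $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "$1 is a regular directory"
    else
        echo "$1 does not exist"
    fi
}

describe_cuda_path /usr/local/cuda
describe_cuda_path /usr/local/cuda-11.4

# If the versioned directory exists but the unversioned path does not,
# a link could be created manually (needs root, so it is left commented out):
# sudo ln -s /usr/local/cuda-11.4 /usr/local/cuda
```

If /usr/local/cuda is missing, pointing --with-cuda at the versioned directory instead is another option.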

BTW I used the gpu-develop branch.

Thank you so much Robert for your suggestion.

Unfortunately, make pw still returns the same error. I tried different versions of QE (the latest 6.8, 6.7, and gpu-develop), but all of them give the same error.

I’m afraid I couldn’t install the CUDA toolkit with the symlink properly (from step 1; I installed the toolkit without a symlink). Could you please explain how to install HPC_SDK with a symlink?

Thanks in advance

You can’t/don’t install the HPC_SDK with symlink. I won’t be able to help unless you give more description for this:

Thank you for your response. I am a total newbie to this field, so please bear with me.

I meant to say that I installed HPC_SDK according to the steps mentioned at NVIDIA HPC SDK 21.9 Downloads | NVIDIA Developer and as you suggested in points 1, 2 and 3, and added all the paths to my .bashrc file.

I simply ran nvhpc_2021_219_Linux_x86_64_cuda_11.4/install to install it. How can I install it with symlink?

To install the CUDA toolkit, see here. Read and follow carefully the instructions in the linux install guide.

The GPU driver installation is one part of the CUDA install that can be troublesome. If you already have a GPU driver installed, it’s probably best to use that, if it is of a proper version, rather than trying to install a new one. One easy way to do this is to use the runfile CUDA toolkit installation method and deselect the GPU driver install when the runfile installer gives you the menu choice. To find out if you have a GPU driver installed (and its version), you can run nvidia-smi.
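
As a sketch of that driver check: the wrapper below runs nvidia-smi with its query options and reports cleanly when the tool is absent. report_driver and its optional argument are hypothetical names used only for this example.

```shell
#!/bin/sh
# Print the driver version and GPU name, or report that the tool is missing.
# The optional argument (hypothetical, for this sketch) names the tool to run.
report_driver() {
    smi=${1:-nvidia-smi}
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "$smi not found: no NVIDIA driver utilities on PATH"
        return 1
    fi
    "$smi" --query-gpu=driver_version,name --format=csv,noheader
}

report_driver || echo "no usable driver detected"
```

The reported version can then be compared against the minimum driver listed for your CUDA toolkit release before deciding whether to deselect the driver in the runfile installer.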

Also note the following at the QE FAQ:

the users’ mailing list is the typical place where to ask questions about Quantum ESPRESSO.

I mention this for at least 2 reasons. 1. I am not a QE expert. 2. There is almost certainly more than one way to get things going here. Talking about it with experienced users may be more productive.


Thank you for your kind suggestions. I will take a look at the suggested resources.

Here is a solution for this problem.


Yes, this solved the problem.
Thank you so much
