CUDA Fortran and CUDA C (with nvcc) error

Hello, I am trying to run CUDA C CUFFT wrappers (“fortran_cufft1d.cu”), compiled with nvcc, in a main CUDA Fortran program (“fortran-test.f90”) compiled with pgfortran. The program runs and prints out results, but shows the error

free(): invalid pointer: 0x0000000000fbc510 ***
======= Backtrace: =========

If I compile the main program as f90 (i.e., without “-Mcuda” flag) with pgfortran, it links the CUDA C object without problems. The error shows up only when the main program is compiled as CUDA Fortran. As I plan to use pinned memory in the main program to call the CUDA C wrappers, I will need a CUDA Fortran main. Any ideas? And if this is not possible, what alternatives could be used?

If I compile the main program as f90 (i.e., without “-Mcuda” flag) with pgfortran, it links the CUDA C object without problems.

Does this run successfully as well? My answer below is assuming that the code can be successfully run.

Just to clarify, you're not actually using any CUDA Fortran features right now but see this runtime error when adding the “-Mcuda” flag? In this case, are the nvcc version and the CUDA version pgfortran is using the same? Which nvcc version are you using? Which PGI version?

PGI 19.4 will default to using CUDA 10.0, so you may need to use “-Mcuda=cuda10.1” or “-Mcuda=cuda9.2”, or set the “CUDA_HOME” environment variable to the same CUDA SDK install directory as your nvcc, to get the CUDA versions to match.
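
As a rough sketch of the version check described above, the release number can be pulled out of the “nvcc --version” output and turned into the matching “-Mcuda=cudaX.Y” option. The helper names nvcc_release and mcuda_flag are made up for illustration, and the script does not check whether the installed PGI release actually ships that CUDA version.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` output on stdin and
# print only the "release X.Y" number (e.g. 9.2).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: turn a release number into the pgfortran option,
# e.g. 9.2 -> -Mcuda=cuda9.2
mcuda_flag() {
    printf '%s%s\n' '-Mcuda=cuda' "$1"
}

# Only query nvcc if it is actually on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    rel=$(nvcc --version | nvcc_release)
    echo "nvcc reports CUDA $rel; matching option: $(mcuda_flag "$rel")"
fi
```

Setting CUDA_HOME to the nvcc install directory is the alternative mentioned above; this sketch only covers the “-Mcuda” route.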

-Mat

Hello Mat, thanks for the reply.

Does this run successfully as well? My answer below is assuming that the code can be successfully run.

Yes

Just to clarify, you're not actually using any CUDA Fortran features right now but see this runtime error when adding the “-Mcuda” flag?

Yes, exactly

In this case, is the nvcc version and the CUDA version pgfortran is using the same? Which nvcc version are you using? Which PGI version?

I am using PGI via module load. After “module load pgi/19.4”, the command “pgfortran --version” gives
pgfortran 19.4-0 LLVM 64-bit target on x86-64 Linux -tp skylake
PGI Compilers and Tools
Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.

The command “nvcc --version” gives
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148

So I’d assume PGI 19.4 uses CUDA 10.0 by default (as you mentioned) while nvcc is stuck with version 9.2. I tried running everything with “-Mcuda=9.2” but got the same errors.

I also tried running with “-Mcuda=cc70” and set nvcc arch flags to sm_70, code70, but got the same error as well.

About the CUDA_HOME variable, I believe it is not set (“echo $CUDA_HOME” prints nothing).

Ok, so it doesn’t sound like a version mismatch issue.

Another possibility is that CUDA Fortran uses RDC (relocatable device code) by default, while with CUDA C, RDC is disabled by default.

Try using “-Mcuda=cuda9.2,nordc” on the pgfortran side, or add “-rdc=true” to your nvcc options.
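
To make the two options concrete, here is a dry-run sketch that only prints the compile commands for each consistent pairing (RDC on for both compilers, or off for both). The function name rdc_commands is hypothetical, and the source file names are taken from the original post; adjust them to your build.

```shell
#!/bin/sh
# Dry run: prints the compile commands for one RDC setting; nothing is compiled.
# "on"  = relocatable device code enabled in both nvcc and pgfortran
# "off" = relocatable device code disabled in both
rdc_commands() {
    case "$1" in
        on)
            # nvcc needs -rdc=true; pgfortran -Mcuda already uses RDC by default
            echo "nvcc -rdc=true -c fortran_cufft1d.cu"
            echo "pgfortran -Mcuda=cuda9.2 -c fortran-test.f90"
            ;;
        off)
            # nvcc has RDC off by default; pgfortran needs the nordc sub-option
            echo "nvcc -c fortran_cufft1d.cu"
            echo "pgfortran -Mcuda=cuda9.2,nordc -c fortran-test.f90"
            ;;
        *)
            echo "usage: rdc_commands on|off" >&2
            return 1
            ;;
    esac
}

rdc_commands on
rdc_commands off
```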

If that doesn’t work, I’m not sure then. Can you send a reproducing example to PGI Customer Service (trs@pgroup.com) and ask Alex to forward it to me so I can take a look?

-Mat

I tried “-Mcuda=cuda9.2,nordc” and got the same error. I also tried everything with “-Mcuda=9.2” and “nvcc … -rdc=true”, and still got the same error.

I just sent an email with a small example (a CUFFT of an array of 5 samples), along with the requested instructions. Thanks again.

Hi Victor,

I was able to reproduce the error, but only with CUDA 9.2. Compiling with CUDA 10.0 or 10.1 worked fine.

I also found that, with CUDA 9.2, I could work around the error by not including “-lcudart” on the link line. pgfortran already adds -lcudart itself, and it seems that including it twice may be to blame, though it is unclear exactly why.

Also, I’d recommend that you use the PGI flag “-Mcudalib=cufft” instead of adding -lcufft directly. This is a convenience flag where we’ll put the version of CUFFT that matches the CUDA version on the link so you don’t need to add a “-L” path.

% pgfortran -o fortran-test.exe  fortran-test.o fortran_cufft1d_interface.o fortran_cufft1d.o -Mcuda=cuda9.2 -V19.4 -Mcudalib=cufft
% ./fortran-test.exe
 Hello world fortran!
 Original array
 array(            1 )=    1.000000
 array(            2 )=    2.000000
 array(            3 )=    3.000000
 array(            4 )=    4.000000
 array(            5 )=    5.000000
GPU memory usage: used = 376.94MB, free = 15753.56MB, total = 16130.50MB

 FFT fwd
 arrayF(            1 )=    15.00000      -2.0861626E-07
 arrayF(            2 )=   -2.500000        3.440955
 arrayF(            3 )=   -2.500000       0.8122992

 FFT back
 array(            1 )=    1.000000
 array(            2 )=    2.000000
 array(            3 )=    3.000000
 array(            4 )=    4.000000
 array(            5 )=    5.000000

Hope this helps,
Mat