Hi all 🙂
Sometimes I check whether new versions of the HPC-SDK are available and try to install the latest one. Trying to move from version 24.5, I installed 24.11 (unfortunately, version 25.x no longer supports K80 GPUs 😞), but now I’m experiencing a runtime issue that I didn’t have before.
Here is the smallest example that reproduces the problem, which occurs with every version of nvfortran from 24.7 (inclusive) to 24.11:
MODULE PINNEDTYPE
  Implicit None
  Type bond
    Real(8), Pinned :: lx,ly,lz
  End Type
  Type(bond), Allocatable, Pinned, Target :: array_of_bonds(:)
  !$OMP THREADPRIVATE(array_of_bonds)
END MODULE PINNEDTYPE

PROGRAM MAIN
  USE CUDAFOR
  USE PINNEDTYPE, Only: array_of_bonds
  Implicit None
  Integer, Parameter :: nthreads=2
  Call omp_set_num_threads(nthreads)
  !$OMP PARALLEL DEFAULT(PRIVATE)
  Allocate(array_of_bonds(2))
  !$OMP END PARALLEL
END PROGRAM MAIN
When I compile with
nvfortran -cuda -mp -o fail_pinned_alloc.x fail_pinned_alloc.f90
and run the executable, the following error is issued:
"__pinned_alloc04: cuMemHostAlloc returns error code 201"
Unfortunately, I cannot try with the 25.x compilers.
Any help is welcome 🙂
Thank you! 😊
Hi a.rpallo,
Thanks for the report and the nice reproducer. The “good” news is that I can reproduce the issue with our current compiler, so I created a problem report, TPR #37517, and sent it to engineering for investigation.
The bad news is that any fix would only arrive in a later release, so you won’t be able to use it.
The workaround would be to move the declaration of the array, and its use in “threadprivate”, from the module to the program body.
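For reference, here is a minimal sketch of how that workaround might look when applied to the reproducer. It is untested and rests on a few assumptions: the derived type stays in the module, only the array and its THREADPRIVATE directive move into the main program, and an explicit Save attribute is added (not part of the original code) because OpenMP expects threadprivate variables declared outside a module to have the SAVE attribute.

```fortran
! Sketch of the suggested workaround (untested assumption): the derived
! type stays in the module, while the pinned array and its THREADPRIVATE
! directive move into the main program.
MODULE PINNEDTYPE
  Implicit None
  Type bond
    Real(8), Pinned :: lx,ly,lz
  End Type
END MODULE PINNEDTYPE

PROGRAM MAIN
  USE CUDAFOR
  USE PINNEDTYPE, Only: bond
  Implicit None
  Integer, Parameter :: nthreads=2
  ! Explicit Save is an assumption added for this sketch: OpenMP requires
  ! THREADPRIVATE variables outside a module to have the SAVE attribute.
  Type(bond), Allocatable, Pinned, Target, Save :: array_of_bonds(:)
  !$OMP THREADPRIVATE(array_of_bonds)
  Call omp_set_num_threads(nthreads)
  !$OMP PARALLEL DEFAULT(PRIVATE)
  Allocate(array_of_bonds(2))
  !$OMP END PARALLEL
END PROGRAM MAIN
```

It should build with the same command line as the original reproducer (nvfortran -cuda -mp). Whether it avoids the cuMemHostAlloc error on a given release would still need to be checked.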
That said, I’m not sure 24.11 will work on a K80. It’s my understanding that CUDA 11.0 was the last CUDA version to support K80s. In 24.11 we stopped shipping 11.0 with the compilers, moving to 11.8 as the oldest version of CUDA included. In other words, you may need to stick with NVHPC 24.5 anyway unless you point 24.11 to a CUDA 11.0 install.
-Mat
Thanks, Mat, for the quick reply :)
Regarding version 24.11, aside from the issue I mentioned, it seems to work with the K80s. It is version 25.x that does not run at all.
Will the fix you are working on be only for version 25.x, or also for 24.11?
Thanks again,
best regards :)
Arnaldo
We don’t go back and patch older releases, so any fix would only be in a future release.
FYI, TPR #37517 has been fixed in our 25.7 release.
-Mat
Ok, thank you for the information :)