Do CUDA Fortran atomic functions support 64-bit arguments?

Hi,
The manual says that all atomic functions require arguments of type integer(kind=4). Does that mean 32-bit only?
In CUDA C, atomic functions accept 64-bit arguments from compute capability (CC) 1.2 onward. Do the CUDA Fortran atomic functions support double-precision data?

Tuan.

Hi Tuan,

The CUDA C atomic functions are callable from CUDA Fortran, so if an operation is supported in CUDA C, you can call that function. However, we don’t have true CUDA Fortran support for atomics yet. It is something we are working on, but there is no ETA.

  • Mat

Do you mean that all the atomic functions mentioned in Sect. 3.6.5 of the PGI CUDA Fortran manual are in fact references to CUDA C functions?
If so, I guess the manual is not correct, since it states that “both arguments must be of type integer(kind=4)”. I take that to mean the arguments must be 4 bytes in Fortran, yet CUDA C is said to support 8-byte arguments (since CC 1.3).

It would help me a lot if you could explain this.

Thanks,
Tuan

I am getting an error when using atomic functions.

[hatazaki@hpc19 ~/gomi]$ cat foo.cuf
module gpu
integer,device::atom
contains
attributes(global) subroutine test()
integer::i,j
i=blockdim%x*(blockidx%x-1)+threadidx%x
if (i==1) then
atom=0
j=atomicadd(atom,1)
endif
end subroutine test
end module gpu
program test
use gpu
call test<<<1,1>>>()
end program test
[hatazaki@hpc19 ~/gomi]$ pgf90 foo.cuf
NOTE: your trial license will expire in 7 days, 9.32 hours.
NOTE: your trial license will expire in 7 days, 9.32 hours.
/tmp/pgcudaforsyT2npDZtn1.gpu(21): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (signed char *, signed char *)

1 error detected in the compilation of "/tmp/pgnvd2zTI6gYXK9Y.nv0".
PGF90-F-0000-Internal compiler error. pgnvd job exited with nonzero status code       0 (foo.cuf: 10)
PGF90/x86-64 Linux 10.4-0: compilation aborted

Do I need to submit a bug report?

Takao

Hi Takao,

“Do I need to submit a bug report?”

No. The implementation of CUDA Fortran is a bit behind the spec, and atomics are not yet fully supported. Up until this week, we’d had little interest in atomics (only one request in four months), but this week we’ve gotten about five. Because of the new interest, I’ve asked our engineers to raise the priority of atomics.

  • Mat

Mat,

Understood. I would like to request that CUDA Fortran support the __threadfence() intrinsic as well.

Regards,

Takao

Hi Takao,

I added a feature request (TPR#16892) for __threadfence and sent it to engineering.

  • Mat

Hi tthh,

FYI, support for “__threadfence()” will be available in the 10.6 compiler.

Usage: “call threadfence()”

  • Mat
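
For readers wondering how the “call threadfence()” usage fits together with the atomics discussed earlier in this thread, here is a minimal sketch of the common "last block finishes the job" pattern. It is only an illustration under stated assumptions: it assumes a compiler release in which threadfence() is available (10.6 or later, per the post above) and in which atomicadd works for integer(kind=4) arguments as the manual describes, which was not yet the case in 10.4. The names fence_sketch, block_sums, done_count, partial and total are hypothetical and do not come from the posts above.

```fortran
module fence_sketch
  use cudafor
  implicit none
  ! Hypothetical counter of blocks that have published their result.
  integer, device :: done_count
contains
  attributes(global) subroutine block_sums(partial, total, n)
    integer, value :: n          ! number of blocks in the grid
    integer :: partial(n)        ! one slot per block (device memory)
    integer :: total(1)          ! final result (device memory)
    integer :: old, k, s

    if (threadidx%x == 1) then
      ! Stand-in for a real per-block result.
      partial(blockidx%x) = blockidx%x
      ! Order the write above before the counter update below.
      call threadfence()
      ! Assumes atomicadd accepts integer(kind=4) arguments.
      old = atomicadd(done_count, 1)
      if (old == n - 1) then
        ! Every other block has already incremented the counter.
        call threadfence()
        s = 0
        do k = 1, n
          s = s + partial(k)
        end do
        total(1) = s
      end if
    end if
  end subroutine block_sums
end module fence_sketch

program fence_demo
  use cudafor
  use fence_sketch
  implicit none
  integer, parameter :: nblocks = 8
  integer, device :: partial_d(nblocks), total_d(1)
  integer :: total_h(1)

  done_count = 0                 ! host-to-device scalar assignment
  call block_sums<<<nblocks, 1>>>(partial_d, total_d, nblocks)
  total_h = total_d              ! device-to-host copy; waits for the kernel
  print *, 'sum of per-block values:', total_h(1)
end program fence_demo
```

The fence before the atomicadd is the important part: without it, another block could see the counter incremented before the write to partial becomes visible. The CUDA C version of this pattern also declares the shared array volatile, so treat the reads in the last block as something to verify on your own compiler and hardware.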