Volatile in Kernels

Denis_Altmann · July 15, 2011, 8:01am

I tried to write a device subroutine that synchronizes multiple CUDA blocks. For this to work, the counter variable has to be declared volatile:

attributes(device) subroutine syncblocks(syncval)

  implicit none

  integer, volatile :: syncval
  integer :: dummy

  call threadfence()
  call syncthreads()
  if (threadIdx%x == 1) then
    dummy = atomicAdd(syncval, 1)
    do
      if (syncval == gridDim%x) then
        exit
      end if
    end do
  end if
  call syncthreads()

end subroutine syncblocks

If I write the equivalent in CUDA C, it works fine:

__device__ void syncblocks(int* syncval) {
    __threadfence();
    __syncthreads();
    if (threadIdx.x == 0) {
        atomicAdd(syncval, 1);
        while (*(volatile int*)syncval < gridDim.x) {
        }
    }
    __syncthreads();
}

Unfortunately, the Fortran version does not work, because the volatile attribute is silently dropped when the compiler translates the subroutine to CUDA C. The Fortran subroutine above is translated to:

extern "C" __device__ void syncblocks(signed char* _psyncval)
{
    int dummy;
    int xthreadidx_x;
    int xgriddim_x;
    xthreadidx_x = (int)(threadIdx.x+1);
    xgriddim_x = (int)gridDim.x;
    __threadfence();
    __syncthreads();
    if( ((xthreadidx_x)!=(1))) goto _BB_6;
    dummy = __pgi_atomicAddi((signed char*)(_psyncval), 1);
_BB_5: ;
    if( ((*((int*)(_psyncval)))!=(xgriddim_x))) goto _BB_5;
_BB_6: ;
    __syncthreads();
}

If volatile is not supported for device code in CUDA Fortran, I think the compiler should give an error or at least a warning.
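
For anyone who wants to reproduce the CUDA C behaviour, here is a minimal sketch of the barrier inside a small test harness. The kernel `demo_kernel` and the host helper `run_syncblocks_demo` are hypothetical names made up for this illustration, not part of any library. The sketch assumes that the counter is zeroed before the launch and that all blocks of the grid are resident on the GPU at the same time; if they are not, the spin loop can deadlock.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Inter-block barrier as in the CUDA C version of the post.
// Assumption: every block of the grid is resident at the same time.
__device__ void syncblocks(int *syncval)
{
    __threadfence();   // make this thread's earlier global writes visible
    __syncthreads();   // wait for all threads of this block
    if (threadIdx.x == 0) {
        atomicAdd(syncval, 1);   // announce that this block has arrived
        // The volatile access forces a fresh load on every iteration.
        while (*(volatile int *)syncval < (int)gridDim.x) { }
    }
    __syncthreads();   // release the remaining threads of the block
}

// Hypothetical test kernel: each block publishes a value, waits at the
// barrier, then reads the value published by the next block.
__global__ void demo_kernel(int *syncval, int *stage, int *result)
{
    if (threadIdx.x == 0)
        stage[blockIdx.x] = (int)blockIdx.x + 1;
    syncblocks(syncval);
    if (threadIdx.x == 0) {
        int neighbour = (blockIdx.x + 1) % gridDim.x;
        result[blockIdx.x] = *((volatile int *)(stage + neighbour));
    }
}

// Hypothetical host helper: returns the number of wrong results,
// or -1 if a CUDA call failed.
int run_syncblocks_demo(int blocks, int threads)
{
    int *syncval = nullptr, *stage = nullptr, *result = nullptr;
    std::vector<int> host(blocks, 0);

    bool ok = cudaMalloc(&syncval, sizeof(int)) == cudaSuccess
           && cudaMalloc(&stage, blocks * sizeof(int)) == cudaSuccess
           && cudaMalloc(&result, blocks * sizeof(int)) == cudaSuccess;
    if (ok) {
        cudaMemset(syncval, 0, sizeof(int));          // barrier counter starts at 0
        cudaMemset(stage, 0, blocks * sizeof(int));
        cudaMemset(result, 0, blocks * sizeof(int));
        demo_kernel<<<blocks, threads>>>(syncval, stage, result);
        // The blocking copy also reports errors from the kernel launch.
        ok = cudaMemcpy(host.data(), result, blocks * sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(syncval);
    cudaFree(stage);
    cudaFree(result);
    if (!ok) return -1;

    int mismatches = 0;
    for (int b = 0; b < blocks; ++b) {
        int expected = (b + 1) % blocks + 1;   // value written by the next block
        if (host[b] != expected) ++mismatches;
    }
    return mismatches;
}
```

One way to try it is to call run_syncblocks_demo(4, 64) from a main function and check that it returns 0. If the volatile cast in the spin loop is removed, the compiler is allowed to keep the loaded value in a register, and the loop may then never terminate, which matches what the generated code above does.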

MatColgrove · July 18, 2011, 8:36pm

Hi Denis,

I talked with our engineers and they will get this fixed in August’s 11.8 release. Volatile should get passed to the generated CUDA C code.

Best Regards,
Mat
