Parallel Reduction on Kepler


I found a post by Justin Luitjens and Mark Harris about parallel reduction using CUDA on Kepler, and I am converting the code from C++ to Fortran. To finish, I need to know how to convert the following lines:

int2 val = reinterpret_cast<int2*>(in)[i];
sum += val.x + val.y;


I am not a heavy-duty Fortran user, so take this as a general pointer only. The traditional way to reinterpret data at the bit level is the EQUIVALENCE statement. The modern way, available since Fortran 90, is the TRANSFER intrinsic.
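
Before picking a Fortran construct, it may help to see what the two lines compute on the host side. The sketch below is plain C++, not CUDA, and rests on some assumptions: `in` points to ordinary `int` data, the element count is even, and `Int2`, `sum_pairs` and `sum_indexed` are hypothetical names made up for illustration (`Int2` stands in for CUDA's built-in `int2`). It suggests that the cast only loads two adjacent integers at once, so the arithmetic result matches plain indexing with `in[2*i]` and `in[2*i+1]`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for CUDA's built-in int2 vector type.
struct Int2 {
    int x;
    int y;
};

static_assert(sizeof(Int2) == 2 * sizeof(int), "Int2 must pack two ints");

// Sums n ints (n assumed even) by loading them two at a time,
// mirroring: int2 val = reinterpret_cast<int2*>(in)[i]; sum += val.x + val.y;
// memcpy is used here instead of a pointer cast to stay within standard C++.
int sum_pairs(const int* in, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n / 2; ++i) {
        Int2 val;
        std::memcpy(&val, in + 2 * i, sizeof val);
        sum += val.x + val.y;
    }
    return sum;
}

// Same arithmetic with plain indexing and no reinterpretation.
int sum_indexed(const int* in, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n / 2; ++i) {
        sum += in[2 * i] + in[2 * i + 1];
    }
    return sum;
}
```

If that reading is right, a Fortran port with a 1-based integer array could add elements `2*i-1` and `2*i` for `i = 1 .. n/2` and get the same sum without any bit-level reinterpretation. Whether the compiler then issues a single wide load, which is the point of the `int2` trick on the GPU, is a separate question that this sketch does not address.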