Using 1-byte, 2-byte, and 4-byte integers in CUDA Fortran

I have an array of data whose values are fairly small. I’m thinking of using 1-byte or 2-byte integers to save memory, but I’m not sure whether CUDA Fortran currently supports 1-byte and 2-byte data. If it does, does it help much with performance? I’m afraid that using 1-byte data may take more machine instructions than using 4-byte integers.

Another question: I have an array of pairs (a,b) whose values each fit in 4 bits. I’m thinking of using an array of 8-bit values, i.e. storing ‘a’ in the upper 4 bits and ‘b’ in the lower 4 bits of each 8-bit value. During kernel execution I need to extract ‘a’ and ‘b’ as two separate variables. Does CUDA Fortran have a function to extract the upper 4 bits and the lower 4 bits?

Can anyone comment?


Hi Tuan,

INTEGER kinds 1, 2, and 4 are all supported in CUDA Fortran, as are the Fortran bit-wise intrinsics (iand, ior, ieor, ibits, ishft). So you should be able to do everything you’re wanting to do.
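
As a rough illustration of the packing scheme from the question, here is a minimal host-side sketch in standard Fortran that uses only the intrinsics listed above. The program name and variable names are made up for the example, and it assumes the usual two's-complement bit layout for integer(kind=1), so a value of 8 or more in the upper nibble makes the packed byte read as negative when printed directly; ibits still recovers the original values.

```fortran
program pack_nibbles
  implicit none
  integer(kind=1) :: packed
  integer(kind=4) :: a, b, a_out, b_out

  ! Illustrative values; each must fit in 4 bits (0..15).
  a = 11
  b = 6

  ! Pack: 'a' goes into the upper 4 bits, 'b' into the lower 4 bits.
  packed = ior(ishft(int(a, kind=1), 4), int(b, kind=1))

  ! Unpack: ibits(i, pos, len) returns len bits starting at bit pos,
  ! right-justified, with all other bits set to zero.
  a_out = int(ibits(packed, 4, 4), kind=4)
  b_out = int(ibits(packed, 0, 4), kind=4)

  print *, a_out, b_out
end program pack_nibbles
```

With the values above this should print 11 and 6. The same ibits calls are what you would use inside a kernel to split each byte.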

Performance-wise, what you’re saving is the amount of data copied. In your kernels, you’ll want to convert the data to the native word size (32 bits on a GPU) and do your integer arithmetic in that type. Instruction throughput for ints is the same as for shorts and chars on any modern architecture.
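
To make the "store narrow, compute wide" idea concrete, here is a small CUDA Fortran sketch. It is only an illustration under assumptions: the module name unpack_mod, the kernel name sum_pairs, and the array sizes are hypothetical, and it needs a CUDA Fortran compiler (for example nvfortran with a .cuf file). Each thread loads one packed byte, widens it to a 32-bit integer, and does all further work in that type.

```fortran
module unpack_mod
  use cudafor
  implicit none
contains
  ! Hypothetical kernel: one thread per packed byte.
  attributes(global) subroutine sum_pairs(packed, total, n)
    integer, value :: n
    integer(kind=1), intent(in)  :: packed(n)
    integer(kind=4), intent(out) :: total(n)
    integer(kind=4) :: i, word, a, b

    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) then
      ! Widen once to 32 bits. Sign extension is harmless here because
      ! ibits only reads bits 0..7 below.
      word = int(packed(i), kind=4)
      a = ibits(word, 4, 4)   ! upper 4 bits
      b = ibits(word, 0, 4)   ! lower 4 bits
      total(i) = a + b        ! arithmetic done in 32-bit integers
    end if
  end subroutine sum_pairs
end module unpack_mod

program main
  use cudafor
  use unpack_mod
  implicit none
  integer, parameter :: n = 4
  integer(kind=1) :: h_packed(n)
  integer(kind=4) :: h_total(n)
  integer(kind=1), device :: d_packed(n)
  integer(kind=4), device :: d_total(n)
  integer :: i

  ! Pack the pair (i, i+1) into each byte on the host.
  do i = 1, n
    h_packed(i) = ior(ishft(int(i, kind=1), 4), int(i + 1, kind=1))
  end do

  d_packed = h_packed                        ! 1 byte per pair is copied
  call sum_pairs<<<1, 32>>>(d_packed, d_total, n)
  h_total = d_total
  print *, h_total
end program main
```

Only the host-to-device copy sees the 1-byte type; the kernel's arithmetic runs on ordinary 32-bit integers.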

Hope this helps,

Thanks a lot, Mat.