32/64 bit question

TrailingStop · February 15, 2024, 5:57am

Hi,

I understand that all GPUs are 32 bit as of today and that there seems to be no need to switch to 64 bit in the near future. However, some PTX instructions also support 64-bit operands. For example, the simple bitwise logical operations AND/OR/XOR are available for 32-bit data as well as for 64-bit data.

Is it safe to assume that the 64-bit version is never slower than two calls to the 32-bit version?

Thanks.

striker159 · February 15, 2024, 7:53am

The throughput of bitwise logical operations on 64-bit integers is not listed in the official table (CUDA C++ Programming Guide).
This means that the following statement from the guide should apply:

"Other instructions and functions are implemented on top of the native instructions. The implementation may be different for devices of different compute capabilities, and the number of native instructions after compilation may fluctuate with every compiler version."

According to the SASS shown on Compiler Explorer, nvcc 12.3.1 emits two 32-bit logical operations for one 64-bit logical operation.

Other 64-bit operations can be significantly slower than their 32-bit versions, most notably 64-bit floating-point arithmetic on consumer GPUs.

TrailingStop · February 15, 2024, 8:04am

Hi,

Thanks for the information. I was looking for a general rule of thumb for which one to use when you have the choice, like:

xor.b64 res_ui64, x_ui64, y_ui64

vs

xor.b32 res_ui32_0, x_ui32_0, y_ui32_0
xor.b32 res_ui32_1, x_ui32_1, y_ui32_1

with ui64 → ( ui32_0 << 32 ) | ui32_1

Thanks.

striker159 · February 15, 2024, 8:27am

That is PTX code, which will be further compiled to optimized SASS code.
I would simply use the PTX instruction intended for the respective data type, so xor.b64 for 64-bit values.
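
For reference, the two forms compared above compute the same result. Here is a minimal host-side C++ sketch of that identity, not device code and not a throughput measurement. The helper names (high_word, low_word, join_words, xor64, xor64_via_halves) are made up for illustration, and the word order assumes the convention from the question, ui64 = ( ui32_0 << 32 ) | ui32_1, so ui32_0 is the high word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, named for this sketch only.
// Convention from the thread: ui64 = (ui32_0 << 32) | ui32_1,
// i.e. ui32_0 is the high word and ui32_1 is the low word.
constexpr std::uint32_t high_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v >> 32);
}

constexpr std::uint32_t low_word(std::uint64_t v) {
    return static_cast<std::uint32_t>(v);  // truncation keeps the low 32 bits
}

constexpr std::uint64_t join_words(std::uint32_t hi, std::uint32_t lo) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Source-level counterpart of a single xor.b64.
constexpr std::uint64_t xor64(std::uint64_t x, std::uint64_t y) {
    return x ^ y;
}

// Source-level counterpart of two xor.b32 on the separate halves.
constexpr std::uint64_t xor64_via_halves(std::uint64_t x, std::uint64_t y) {
    return join_words(high_word(x) ^ high_word(y),
                      low_word(x) ^ low_word(y));
}

// Compile-time spot check that both forms agree for one input pair.
static_assert(xor64(0x0123456789ABCDEFULL, 0xFFFFFFFF00000000ULL) ==
              xor64_via_halves(0x0123456789ABCDEFULL, 0xFFFFFFFF00000000ULL),
              "64-bit XOR must equal two 32-bit XORs on the halves");

// Run-time spot check over a few more values.
inline void check_xor_equivalence() {
    const std::uint64_t samples[] = {0ULL, 1ULL, 0xFFFFFFFFULL,
                                     0x100000000ULL, 0xDEADBEEFCAFEF00DULL};
    for (std::uint64_t x : samples) {
        for (std::uint64_t y : samples) {
            assert(xor64(x, y) == xor64_via_halves(x, y));
        }
    }
}
```

Since both forms give the same value, writing the operation on the 64-bit type keeps the source simpler and leaves the split into 32-bit instructions to the compiler.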
