Do bool/char types imply inherent type conversion?

Hello,

I suppose bool and char instructions/arithmetic are handled by the SM's SPs (CUDA cores).

Would this then imply inherent type conversions to 32-bit, and would it thus be better to avoid bool/char and rather work with unsigned int/int as much as possible?

The Programming Guide's "Throughput of Native Arithmetic Instructions" table has an entry for type conversions from 8-bit and 16-bit integer to 32-bit types, which had me wondering…

C and C++ specify that when evaluating an expression, any operand having a type narrower than ‘int’ is converted to ‘int’ first. If the hardware provides instructions for operations on 8-bit and / or 16-bit data, compilers can take advantage of them as long as the resulting code behaves exactly as if it is following the semantics specified by the language standard.
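
To make that concrete, here is a minimal host-side sketch (the variable names are just for illustration): adding two ‘unsigned char’ operands already yields an ‘int’.

```
#include <type_traits>

int main()
{
    unsigned char a = 200, b = 100;
    // Both operands are promoted to 'int' before the addition, so the
    // result has type 'int' and the value 300 (no 8-bit wrap-around).
    static_assert(std::is_same<decltype(a + b), int>::value, "promoted to int");
    int c = a + b;   // c == 300
    return (c == 300) ? 0 : 1;
}
```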

Current GPUs have extremely limited support for operations on 8-bit and 16-bit data, so in most cases the compiler has to convert to a 32-bit type when evaluating an expression, either because there is no direct hardware support for a particular operation or to satisfy language semantics. It follows that in general programmers should prefer use of ‘int’, unless there is a good reason to use ‘unsigned int’ or a narrower integer data type (signed or unsigned). I am reasonably sure this is spelled out in the documentation, most likely the Best Practices Guide.
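
As a rough sketch of what that looks like in practice (a hypothetical kernel, not taken from any particular code base): even when the stored data is ‘unsigned char’, the indexing and the intermediate arithmetic can simply be done with ‘int’, and the narrowing happens only on the store.

```
// Hypothetical example: 8-bit data, but all indexing and intermediate
// arithmetic is done in 'int'; narrowing happens only when storing.
__global__ void brighten(const unsigned char *in, unsigned char *out,
                         int n, int offset)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // 'int' index
    if (i < n) {
        int v = in[i] + offset;                      // evaluated as 'int'
        out[i] = (unsigned char)max(0, min(v, 255)); // clamp, then narrow
    }
}
```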

“C and C++ specify that when evaluating an expression, any operand having a type narrower than ‘int’ is converted to ‘int’ first”

…missed this

" I am reasonably sure this is spelled out in the documentation, most likely the Best Practices Guide"

…missed this

all noted; thanks

And bool is very convenient; char is just bool on steroids.

Is there even still “a good reason” to use such types? I am not sure whether the memory storage space argument still holds.

Some of the built-in variables, like the x/y/z components of threadIdx, are of type unsigned int; but I suppose this is a different story altogether.

Narrow integer types occur in various image and signal processing use cases. In the context of CUDA it may be a good idea to handle these in terms of 32-bit aggregate types like ‘uchar4’, ‘ushort2’, etc. In fact CUDA offers a bunch of intrinsic device functions for the processing of such packed data. These enjoy hardware support on Kepler-class GPUs, but are mostly emulated on Maxwell-class GPUs, and completely emulated on Fermi-class GPUs. The emulation is highly optimized, however, and due to the advantages of fetching data from memory in 32-bit chunks they should be given serious consideration even on non-Kepler platforms.
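
A rough sketch of the packed approach (the kernel itself is hypothetical; ‘__vaddus4’ is one of the SIMD-in-a-word intrinsics mentioned above and performs a per-byte unsigned saturating add):

```
// Each thread processes four 8-bit pixels packed into one 32-bit word.
__global__ void brighten4(const unsigned int *in, unsigned int *out,
                          int n, unsigned char delta)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        unsigned int d = delta * 0x01010101u;  // replicate delta into all four bytes
        out[i] = __vaddus4(in[i], d);          // per-byte add with unsigned saturation
    }
}
```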

As for ‘int’ vs ‘unsigned int’, since the latter is subject to modulo-2^n wrap-around requirements according to the C/C++ standards, the compiler can often optimize code based on ‘int’ computations more aggressively, for example when dealing with address arithmetic. However, in some use cases, such as multi-precision integer arithmetic, one obviously wants and needs ‘unsigned int’ arithmetic. I do not have an overview of where CUDA APIs use unsigned integer types (other than ‘size_t’, which is used for memory sizing as in other standard C functions); I would assume there are good reasons to use unsigned integers where they occur.
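
To illustrate the multi-precision point with a small sketch (the helper is hypothetical; real code would more likely use PTX’s carry-propagating add, which is not shown here): the carry out of the low limb is recovered from the well-defined modulo-2^32 wrap-around of ‘unsigned int’, something signed arithmetic could not do portably.

```
// 64-bit addition built from two 32-bit limbs. The carry out of the low
// limb is detected via the defined wrap-around of unsigned arithmetic.
__host__ __device__ void add64(unsigned int a_lo, unsigned int a_hi,
                               unsigned int b_lo, unsigned int b_hi,
                               unsigned int *r_lo, unsigned int *r_hi)
{
    unsigned int lo = a_lo + b_lo;              // may wrap around (defined behavior)
    unsigned int carry = (lo < a_lo) ? 1u : 0u; // wrapped iff result < an operand
    *r_lo = lo;
    *r_hi = a_hi + b_hi + carry;
}
```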