signed vs unsigned int for indexes and sizes

I think I watched a video by one of the NVIDIA guys that mentioned that signed ints can be faster (I forget whether this referred to the GPU, the CPU, or both). I believe the rationale is that with unsigned types the compiler must guarantee wrap-around semantics, which prevents some optimizations.

I experimented with using signed and unsigned ints inside the kernels, and noticed no difference for types of the same size.

Are there reasons to prefer signed or unsigned ints for indexes and sizes (other than unsigned ones being able to represent values twice as large, while being dangerously close to wrapping around zero)?

Which do you use and why?

An experienced, wise software engineer once gave me this advice: All integer data in your C/C++ program wants to be ‘int’, until there is a really good reason for it not to be ‘int’. Many years on, I would support that statement based on my own experience. Be especially aware of mixing ‘int’ and ‘unsigned int’; that can lead to nasty bugs due to unexpected conversions from ‘int’ to ‘unsigned int’.

‘int’ is preferred for indexing arrays because it allows various compiler optimizations: signed integer overflow is undefined, while overflow for ‘unsigned int’ must obey wrap-around semantics. This applies equally to device and host code. It doesn’t mean that this preference will result in a speedup in any particular use case, as code could be limited by any number of factors other than instruction throughput. With the maximum memory size currently supported by GPUs (12 GB), an ‘int’ variable will usually be sufficient to address any array that occurs in real-life applications. For example, an array of ‘float’ could be just under 8 GB in size and still be fully indexable by an ‘int’.

You would want to be more careful with sizes, which are usually expressed in units of bytes. You may need a ‘size_t’, which is the usual way in C/C++ to store size-in-bytes data (see all the standard library functions that take such arguments). At the same time, you would want to avoid extensive computations with ‘size_t’, as this could be slow, especially when ‘size_t’ is a 64-bit type.

In C++, a fairly typical line (for me, anyway) is something like

for(std::size_t i = 0; i < v.size(); ++i)

If you use ‘int’, you’ll get nagged by the compiler about comparing signed and unsigned types (as you probably should be), so I’ve actually stuck with ‘size_t’ (for CPU code) for many years. Not sure if I should reconsider.

As I said, ‘size_t’ is appropriate in some contexts, particularly general purpose library routines that must be able to deal with any size object. For all other instances of ‘size_t’ consider that all 64-bit integer arithmetic is emulated on the GPU (which is fundamentally a 32-bit processor), and thus its use may detract from performance.

Compiler warnings about comparisons between signed and unsigned integer types are a good thing: such comparisons can be the source of nasty bugs, a subset of the dangers of mixing signed and unsigned types that I alluded to above.

one example of a possible pitfall:

njuffa, I understand what you are saying. I also agree that one should try not to mix types. I am not sure I agree that the default go-to type should be “int”. By the default type I mean: if you are defining a matrix class, what type do “size”, “num_cols”, “operator()”, etc. use?

IIRC, Fortran didn’t even have unsigned types; that’s why BLAS and LAPACK use “int”. If you are using the STL (or presumably Thrust?) with CUDA, they effectively encourage you to use “size_t” unless you have reasons not to.

I haven’t decided yet, but what I dislike about “int” are these subtle limits like the “8GB” one you mentioned. Using ints makes it much more likely that your program will work flawlessly on all your tests, and then break catastrophically and unexpectedly on a bigger problem.

The bugs that “size_t” might cause (wrapping around 0) seem to be of the kind that would be detected early on.

I’m raising this thread from the grave due to this relevant and fresh blog post:

How undefined signed overflow enables optimizations in GCC

It’s probably safe to assume that Krister’s examples are applicable to any optimizing C/C++ compiler, including NVCC.

Chandler Carruth also talked about it in his CppCon 2016 talk:

I really don’t know what to think of it


I just static_cast<int>(v.size()) in the cases where I’m using a loop or something like that. C++ isn’t a weakly typed language, and that’s fine by me.