Conversion from size_t to float always results in zero?

Hi,

I tracked down some strange behaviour of my CUDA kernel when I tried to assign a variable of type size_t to a variable of type float:

size_t a = 5;
float b, c;

b = a; // b is 0.0
// but:
c = (int)a; // c is 5.0

Is this a bug or a feature?

Regards,
enuhtac

I think this falls into the undefined behaviour category.

In this case size_t is probably a 64 bit unsigned integer, so you are effectively demoting and implicitly casting a 64 bit integer to a float. It might be interesting to see what PTX the compiler generated, but my guess is that 32 bits were thrown away before the cast. That is probably where the value went. Without any hints about what to do, I doubt it is reasonable to expect the compiler to do much else.

Regardless of whether size_t maps to unsigned int or unsigned long long int, the conversion to float is well-defined, although a warning about “loss of information” would be appropriate. If your observation applies to the CUDA 3.2 toolchain, I would encourage you to file a bug, attaching a self-contained repro case. Please also record your platform. Thanks.
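For reference, a self-contained repro case for this kind of report might look roughly like the sketch below. It is only an illustration built around the snippet from the question: the names convert_kernel and run_conversion are made up here, and the value is passed as a kernel argument so the compiler cannot fold the conversion at compile time. If the conversion behaves as defined, both printed values should be 5.0.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Converts a size_t to float in two ways, mirroring the snippet in the question.
__global__ void convert_kernel(size_t a, float *out)
{
    out[0] = a;        // implicit size_t -> float conversion (the "b = a" case)
    out[1] = (int)a;   // conversion via int (the "c = (int)a" case)
}

// Hypothetical host helper: runs the kernel once and copies both results back.
// Returns false if any CUDA call reports an error.
bool run_conversion(size_t a, float result[2])
{
    float *d_out = NULL;
    if (cudaMalloc((void **)&d_out, 2 * sizeof(float)) != cudaSuccess)
        return false;

    convert_kernel<<<1, 1>>>(a, d_out);

    // Check the launch first, then copy back (cudaMemcpy waits for the kernel).
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(result, d_out, 2 * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    return err == cudaSuccess;
}

int main()
{
    float r[2] = {-1.0f, -1.0f};   // sentinel values, overwritten on success
    if (!run_conversion(5, r)) {
        printf("CUDA error\n");
        return 1;
    }
    // sizeof(size_t) is part of the platform information worth reporting.
    printf("sizeof(size_t) = %u\n", (unsigned)sizeof(size_t));
    printf("b = %f, c = %f\n", r[0], r[1]);
    return 0;
}
```

Attaching the nvcc version and the exact compile command alongside such a file makes the report easier to reproduce.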

Hm…

I tried to write a simple test kernel with just one size_t to float assignment, and it gives the correct result. But in my more complex kernel I definitely had problems with this. Maybe this is connected to my other posting ‘dealing with 3d grids for cfd simulations’, where I describe that under certain circumstances the body of a nested loop is not executed (although it should be). As I did my size_t experiments inside this loop, maybe using size_t instead of int influences whether the loop body is executed at all. In that case my size_t to float assignment could simply have been skipped. But this is just speculation…

enuhtac

OK, just about anything I do inside this nested loop can suddenly result in zero (or prevent the loop from being executed, I’m not sure). E.g. if I try to assign a product of two non-zero values the result is zero, but assigning each factor on its own gives the correct result…
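One way to tell a skipped loop body apart from a genuinely computed zero is to count iterations and start the result at a sentinel value. The sketch below is only an assumption about what such a check could look like: the kernel nested_loop_kernel, the bounds nx and ny, and the loop body are all hypothetical stand-ins, not the actual CFD kernel. If the loop body never runs, the result stays at the sentinel and the counter stays at 0.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

#define SENTINEL (-1.0f)   // value the loop body can never produce here

// Hypothetical stand-in for the nested loop with size_t loop variables.
__global__ void nested_loop_kernel(size_t nx, size_t ny,
                                   float *value, unsigned int *iterations)
{
    unsigned int count = 0;
    float last = SENTINEL;
    for (size_t j = 0; j < ny; ++j) {
        for (size_t i = 0; i < nx; ++i) {
            // Product of two non-zero values, as in the failing case.
            last = (float)(i + 1) * (float)(j + 1);
            ++count;
        }
    }
    *value = last;         // still SENTINEL if the body was skipped
    *iterations = count;   // 0 if the body was skipped
}

// Hypothetical host helper; returns false if any CUDA call reports an error.
bool run_nested_loop(size_t nx, size_t ny, float *value, unsigned int *iterations)
{
    float *d_value = NULL;
    unsigned int *d_iter = NULL;
    if (cudaMalloc((void **)&d_value, sizeof(float)) != cudaSuccess)
        return false;
    if (cudaMalloc((void **)&d_iter, sizeof(unsigned int)) != cudaSuccess) {
        cudaFree(d_value);
        return false;
    }

    nested_loop_kernel<<<1, 1>>>(nx, ny, d_value, d_iter);

    // A failed launch would also leave the outputs untouched, so check it.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(value, d_value, sizeof(float), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(iterations, d_iter, sizeof(unsigned int), cudaMemcpyDeviceToHost);

    cudaFree(d_value);
    cudaFree(d_iter);
    return err == cudaSuccess;
}

int main()
{
    float value = 0.0f;
    unsigned int iterations = 0;
    if (!run_nested_loop(3, 2, &value, &iterations)) {
        printf("CUDA error\n");
        return 1;
    }
    printf("iterations = %u, last value = %f\n", iterations, value);
    return 0;
}
```

With nx = 3 and ny = 2 the body should run 6 times and the last product should be 6.0. An iteration count of 0 would point at the loop being skipped, while a CUDA error would point at the kernel not running at all.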