I’m by no means a guru on the C language/runtime specification, but shouldn’t the following code produce a signed integer (when setting signed_value), rather than an unsigned one?
[codebox]unsigned int unsigned_value = 123; // Set unsigned value
int signed_value = -unsigned_value; // Set signed value from unary operator on unsigned value… expected signed int, got unsigned int?[/codebox]
If I’m not mistaken, signed_value should end up as -123, but the current implementation of CUDA (not sure about the 2.1 beta - I’m unable to test that for various reasons) seems to set it to (2^32)-123 - in any case, it doesn’t come out as -123.
I’m assuming the unary minus (-) operator applied to an unsigned integer returns an unsigned integer, when it should return a signed one? In any case, I’d expect a signed/unsigned mismatch warning of some sort, which nvcc doesn’t emit either. (This issue has caught me out numerous times.)
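For reference, this is the sort of host-side comparison I have in mind, along with the explicit cast I’ve been using as a workaround (variable names here are just for illustration):
[codebox]#include <stdio.h>

int main(void)
{
    unsigned int unsigned_value = 123;

    /* Unary minus on an unsigned operand stays unsigned (it wraps mod 2^32),
       so the initializer below is 4294967173 before conversion to int. */
    int negated    = -unsigned_value;

    /* Casting to int first keeps the negation signed. */
    int cast_first = -(int)unsigned_value;

    printf("%d %d\n", negated, cast_first);  /* gcc/icc print "-123 -123" for me */
    return 0;
}[/codebox]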
I also expect this ‘might’ be intended (but undocumented?) behaviour, given the potential for lost data (unsigned values above 2^31-1 can’t be represented as a signed int) and the lack of runtime error checking on CUDA devices to catch it.
Like I said, it was an assumption (I didn’t bother checking whether -123 and (2^32)-123 have the same binary representation) - the fact of the matter is that signed_value != -123.
Edit: I’ll verify this at home before I make any more accusations - I’ve been getting ‘odd’ behaviour from my work machine for a while now, ranging from incorrect device parameters (e.g. multiprocessor count) to for loops in one kernel being 10x slower than an identical for loop in another kernel.
AFAIK it’s not cast automatically; the unsigned int is treated as int, which completely screws up any calculations that mix int and unsigned int.
I had the same problem with an unsigned int i as the counter for my for loop, which I used in calculations (with signed ints) inside the loop. Mixing the two just didn’t work; manual casts helped.
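Something along these lines is what I mean - just a sketch, the kernel itself is made up:
[codebox]// Illustrative only: an unsigned loop counter mixed into signed arithmetic.
__global__ void offsets(float *out, int offset, unsigned int n)
{
    for (unsigned int i = 0; i < n; ++i)
    {
        // Without the cast, 'offset - i' is evaluated as unsigned, so once
        // i > offset the intermediate is a huge positive value and the float
        // comes out wrong. Casting the counter keeps the subtraction signed.
        out[i] = (float)(offset - (int)i);
    }
}[/codebox]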
Indeed, I came to the same conclusion (although for things like int2 vs uint2, it ended up being simpler to just use int2 for ‘everything’).
It’s just a bit strange, because taking code that compiles and runs fine under gcc or icc and throwing it at nvcc doesn’t produce the same results (as you said, no automatic casting in these cases).
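For example, the ‘int2 for everything’ approach I mentioned ends up looking roughly like this (made-up kernel, just to show the idea):
[codebox]// Rough sketch: cast the (unsigned) thread coordinates into an int2 once,
// then keep all further index arithmetic signed.
__global__ void kernel(int *out, int width)
{
    int2 idx = make_int2((int)(blockIdx.x * blockDim.x + threadIdx.x),
                         (int)(blockIdx.y * blockDim.y + threadIdx.y));

    // Assumes the grid exactly covers a width-by-height output array.
    out[idx.y * width + idx.x] = idx.x - idx.y;
}[/codebox]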