I’m by no means a guru on the C language/runtime specification, but shouldn’t the following code produce a signed integer (when setting signed_value), rather than an unsigned one?
[codebox]unsigned int unsigned_value = 123; // Set unsigned value
int signed_value = -unsigned_value; // Set signed value from unary operator on unsigned value… expected signed int, got unsigned int?[/codebox]
If I’m not mistaken, signed_value should end up as -123, but the current implementation of CUDA (not sure about the 2.1 beta - I’m unable to test that for various reasons) seems to set it to (2^32)-123 - in any case, it doesn’t come out as -123.
I’m assuming the unary minus (-) operator applied to an unsigned integer returns an unsigned integer, when it should return a signed one? In any case, I’d expect a signed/unsigned mismatch warning of some sort, which nvcc doesn’t emit either. (This issue has caught me out numerous times.)
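For reference, this is the sort of host-side comparison I have in mind, along with the explicit cast I’ve been using as a workaround (variable names here are just for illustration):
[codebox]#include <stdio.h>

int main(void)
{
    unsigned int unsigned_value = 123;

    /* Unary minus on an unsigned operand stays unsigned (it wraps mod 2^32),
       so the initializer below is 4294967173 before conversion to int. */
    int negated    = -unsigned_value;

    /* Casting to int first keeps the negation signed. */
    int cast_first = -(int)unsigned_value;

    printf("%d %d\n", negated, cast_first);  /* gcc/icc print "-123 -123" for me */
    return 0;
}[/codebox]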
I also expect this ‘might’ be intended (but undocumented?) behaviour, given the potential for lost data (unsigned values above 2^31-1 can’t be represented as a signed int) and the lack of runtime error checking on CUDA devices to catch it.
Like I said, it was an assumption (I didn’t bother checking whether -123 and (2^32)-123 have the same binary representation) - the fact of the matter is that signed_value != -123.
Edit: I’ll verify this at home before I make any more accusations - I’ve been getting ‘odd’ behaviour from my work machine for a while now, ranging from incorrect device parameters (e.g. multiprocessor count) to for loops in one kernel being 10x slower than an identical for loop in another kernel.
AFAIK it’s not cast automatically; the unsigned int is treated as int, which completely screws up any calculations that mix int and unsigned int.
I had the same problem with an unsigned int i as the counter for my for loop, which I used in calculations (with signed ints) inside the loop. Mixing the two just didn’t work; manual casts helped.
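Something along these lines is what I mean - just a sketch, the kernel itself is made up:
[codebox]// Illustrative only: an unsigned loop counter mixed into signed arithmetic.
__global__ void offsets(float *out, int offset, unsigned int n)
{
    for (unsigned int i = 0; i < n; ++i)
    {
        // Without the cast, 'offset - i' is evaluated as unsigned, so once
        // i > offset the intermediate is a huge positive value and the float
        // comes out wrong. Casting the counter keeps the subtraction signed.
        out[i] = (float)(offset - (int)i);
    }
}[/codebox]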
Indeed, I came to the same conclusion (although for things like int2 vs uint2, it ended up being simpler to just use int2 for ‘everything’).
It’s just a bit strange, because taking code that compiles and runs fine under gcc or icc and throwing it at nvcc doesn’t produce the same results (as you said, no automatic casting in these cases).
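For example, the ‘int2 for everything’ approach I mentioned ends up looking roughly like this (made-up kernel, just to show the idea):
[codebox]// Rough sketch: cast the (unsigned) thread coordinates into an int2 once,
// then keep all further index arithmetic signed.
__global__ void kernel(int *out, int width)
{
    int2 idx = make_int2((int)(blockIdx.x * blockDim.x + threadIdx.x),
                         (int)(blockIdx.y * blockDim.y + threadIdx.y));

    // Assumes the grid exactly covers a width-by-height output array.
    out[idx.y * width + idx.x] = idx.x - idx.y;
}[/codebox]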