Hello,

I’m currently working on a small, simple CUDA interval library (for a bachelor thesis) and need some help with the rounding modes.

The Programming Guide states:

and later on in the appendix:

My question is, how can I “statically” set the rounding mode to “round-towards-zero”?

I know there are C intrinsics for all four rounding modes. However, since they are intrinsics/function calls, they are very slow.
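
To illustrate the kind of code I mean, here is a minimal sketch of an interval addition built on those intrinsics. The Interval struct and the interval_add name are made up for this post and are not taken from any library; only __fadd_rd and __fadd_ru are actual CUDA intrinsics.

```cuda
// Hypothetical interval type, for illustration only.
struct Interval {
    float lo;  // lower bound
    float hi;  // upper bound
};

// Interval addition with directed rounding:
// the lower bound is rounded towards -infinity,
// the upper bound is rounded towards +infinity.
__device__ Interval interval_add(Interval a, Interval b)
{
    Interval r;
    r.lo = __fadd_rd(a.lo, b.lo);
    r.hi = __fadd_ru(a.hi, b.hi);
    return r;
}
```

Every operation picks its rounding mode through the intrinsic, which is what I would like to avoid by setting one mode “statically”.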

And a little bit off-topic:

I tried to multiply large float numbers with __fmul_rz,

e.g. __fmul_rz(2x10^32, 2x10^32)

I expected the result to be +infinity, but I get the number “below infinity”,

i.e. the largest finite float: 3.4028235x10^38

It seems as if __fmul_rz “rounds down” infinity. I’m not sure whether this is my fault, because I am using the Debug Emulator from the 3.0 SDK (which is deprecated).
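
In case it helps, this is a minimal sketch for reproducing the multiplication. The kernel name mul_rz_kernel and the host code are only for illustration (not my actual library code), and it assumes a standard nvcc build.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Multiply two floats with round-towards-zero and store the result.
__global__ void mul_rz_kernel(float a, float b, float *out)
{
    *out = __fmul_rz(a, b);
}

int main()
{
    float *d_out = NULL;
    float h_out = 0.0f;

    if (cudaMalloc((void **)&d_out, sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // 2x10^32 * 2x10^32 = 4x10^64, far beyond the float range.
    mul_rz_kernel<<<1, 1>>>(2e32f, 2e32f, d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    printf("__fmul_rz(2e32f, 2e32f) = %.9g\n", h_out);

    cudaFree(d_out);
    return 0;
}
```

Swapping __fmul_rz for __fmul_rn in the kernel gives a direct comparison with round-to-nearest on the same inputs.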

Thanks for any help!